Sample records for multiple source genes

  1. Ensemble positive unlabeled learning for disease gene identification.

    PubMed

    Yang, Peng; Li, Xiaoli; Chua, Hon-Nian; Kwoh, Chee-Keong; Ng, See-Kiong

    2014-01-01

    An increasing number of genes have been experimentally confirmed in recent years as causative genes to various human diseases. The newly available knowledge can be exploited by machine learning methods to discover additional unknown genes that are likely to be associated with diseases. In particular, positive unlabeled learning (PU learning) methods, which require only a positive training set P (confirmed disease genes) and an unlabeled set U (the unknown candidate genes) instead of a negative training set N, have been shown to be effective in uncovering new disease genes in the current scenario. Using only a single source of data for prediction can be susceptible to bias due to incompleteness and noise in the genomic data, and a single machine learning predictor is prone to bias caused by the inherent limitations of individual methods. In this paper, we propose an effective PU learning framework that integrates multiple biological data sources and an ensemble of powerful machine learning classifiers for disease gene identification. Our proposed method integrates data from multiple biological sources for training PU learning classifiers. A novel ensemble-based PU learning method EPU is then used to integrate multiple PU learning classifiers to achieve accurate and robust disease gene predictions. Our evaluation experiments across six disease groups showed that EPU achieved significantly better results compared with various state-of-the-art prediction methods as well as ensemble learning classifiers. Through integrating multiple biological data sources for training and the outputs of an ensemble of PU learning classifiers for prediction, we are able to minimize the potential bias and errors in individual data sources and machine learning algorithms to achieve more accurate and robust disease gene predictions. In the future, our EPU method provides an effective framework for integrating additional biological and computational resources for better disease gene predictions.
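
    The abstract above does not spell out how EPU builds or combines its classifiers, so the following is only a generic illustration of the positive-unlabeled idea, not the authors' method. It is a minimal sketch, assuming a simple "PU bagging" scheme: random subsets of U are treated as provisional negatives, a nearest-centroid rule is fitted in each round, and the out-of-bag scores are averaged. The function names and the toy feature vectors are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, rounds=50, seed=0):
    """Average out-of-bag scores for unlabeled items (hypothetical helper).

    Each round draws a random subset of the unlabeled set as provisional
    negatives and scores the remaining unlabeled items by how much closer
    they lie to the positive centroid than to the provisional-negative one.
    A higher score means "looks more like the confirmed positives".
    """
    rng = random.Random(seed)
    pos_c = centroid(positives)
    k = min(len(positives), len(unlabeled) - 1)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(rounds):
        sampled = set(rng.sample(range(len(unlabeled)), k))
        neg_c = centroid([unlabeled[i] for i in sampled])
        for i, x in enumerate(unlabeled):
            if i in sampled:
                continue  # only score items left out of this round
            totals[i] += sq_dist(x, neg_c) - sq_dist(x, pos_c)
            counts[i] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Toy data (assumed): the first unlabeled item resembles the positives.
positives = [(1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
unlabeled = [(1.0, 0.9), (-1.0, -1.0), (-0.9, -1.1), (-1.1, -0.9), (-1.0, -0.8)]
scores = pu_bagging_scores(positives, unlabeled)
ranked = sorted(range(len(unlabeled)), key=scores.__getitem__, reverse=True)
print(ranked)
```

    Averaging over many random draws is one way to reduce the dependence on any single choice of provisional negatives, which is the kind of bias the abstract attributes to single predictors.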

  2. Origins of extrinsic variability in eukaryotic gene expression

    NASA Astrophysics Data System (ADS)

    Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff

    2006-02-01

    Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes simultaneously, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modelling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous lower limit for expression variability. A second source, which is modelled as originating from a common upstream transcription factor, exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.

  3. Origins of extrinsic variability in eukaryotic gene expression

    NASA Astrophysics Data System (ADS)

    Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff

    2006-03-01

    Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes in concert, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modeling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous noise floor in expression variability. A second source which is modeled as originating from a common upstream transcription factor exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.

  4. In Silico Gene Prioritization by Integrating Multiple Data Sources

    PubMed Central

    Zhou, Yingyao; Shields, Robert; Chanda, Sumit K.; Elston, Robert C.; Li, Jing

    2011-01-01

    Identifying disease genes is crucial to the understanding of disease pathogenesis, and to the improvement of disease diagnosis and treatment. In recent years, many researchers have proposed approaches to prioritize candidate genes by considering the relationship of candidate genes and existing known disease genes, reflected in other data sources. In this paper, we propose an expandable framework for gene prioritization that can integrate multiple heterogeneous data sources by taking advantage of a unified graphic representation. Gene-gene relationships and gene-disease relationships are then defined based on the overall topology of each network using a diffusion kernel measure. These relationship measures are in turn normalized to derive an overall measure across all networks, which is utilized to rank all candidate genes. Based on the informativeness of available data sources with respect to each specific disease, we also propose an adaptive threshold score to select a small subset of candidate genes for further validation studies. We performed large-scale cross-validation analysis on 110 disease families using three data sources. Results have shown that our approach consistently outperforms two other state-of-the-art programs. A case study using Parkinson disease (PD) has identified four candidate genes (UBB, SEPT5, GPR37 and TH) that ranked higher than our adaptive threshold, all of which are involved in the PD pathway. In particular, a very recent study has observed a deletion of TH in a patient with PD, which supports the importance of the TH gene in PD pathogenesis. A web tool has been implemented to assist scientists in their genetic studies. PMID:21731658
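
    The diffusion kernel measure mentioned above can be illustrated on a toy network. This is a minimal sketch, assuming the standard graph diffusion kernel K = exp(-beta * L) with L = D - A, approximated here by a truncated Taylor series; the abstract does not give the authors' exact normalization or parameters, so beta, the four-node path graph, and the function names are all assumptions.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def diffusion_kernel(adj, beta=0.5, terms=30):
    """Approximate exp(-beta * L) for the graph Laplacian L = D - A."""
    n = len(adj)
    lap = [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
           for i in range(n)]
    step = [[-beta * v for v in row] for row in lap]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms + 1):
        # term now holds (-beta * L)^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, step)]
        kernel = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(kernel, term)]
    return kernel


def rank_candidates(kernel, seeds, candidates):
    """Order candidates by summed kernel similarity to known disease genes."""
    scores = {c: sum(kernel[c][s] for s in seeds) for c in candidates}
    return sorted(scores, key=scores.get, reverse=True)


# Toy network (assumed): a path 0 - 1 - 2 - 3, with node 0 a known disease gene.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
K = diffusion_kernel(adjacency)
print(rank_candidates(K, seeds=[0], candidates=[1, 2, 3]))  # [1, 2, 3]
```

    Unlike a direct-neighbor count, the kernel assigns a nonzero similarity to every pair of connected nodes, which is why it reflects the overall topology of the network rather than only adjacent edges.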

  5. EnRICH: Extraction and Ranking using Integration and Criteria Heuristics.

    PubMed

    Zhang, Xia; Greenlee, M Heather West; Serb, Jeanne M

    2013-01-15

    High throughput screening technologies enable biologists to generate candidate genes at a rate that, due to time and cost constraints, experimental approaches in the laboratory cannot keep up with. Thus, it has become increasingly important to prioritize candidate genes for experiments. To accomplish this, researchers need to apply selection requirements based on their knowledge, which necessitates qualitative integration of heterogeneous data sources and filtration using multiple criteria. A similar approach can also be applied to putative candidate gene relationships. While automation can assist in this routine and imperative procedure, flexibility of data sources and criteria must not be sacrificed. A tool that can optimize the trade-off between automation and flexibility to simultaneously filter and qualitatively integrate data is needed to prioritize candidate genes and generate composite networks from heterogeneous data sources. We developed the Java application EnRICH (Extraction and Ranking using Integration and Criteria Heuristics) to address this need. Here we present a case study in which we used EnRICH to integrate and filter multiple candidate gene lists in order to identify potential retinal disease genes. As a result of this procedure, a candidate pool of several hundred genes was narrowed down to five candidate genes, of which four are confirmed retinal disease genes and one is associated with a retinal disease state. We developed a platform-independent tool that is able to qualitatively integrate multiple heterogeneous datasets and use different selection criteria to filter each of them, provided the datasets are tables that have distinct identifiers (required) and attributes (optional). With the flexibility to specify data sources and filtering criteria, EnRICH automatically prioritizes candidate genes or gene relationships for biologists based on their specific requirements. Here, we also demonstrate that this tool can be effectively and easily used to apply highly specific user-defined criteria and can efficiently identify high quality candidate genes from relatively sparse datasets.

  6. Effect of Aggregation Operators on Network-Based Disease Gene Prioritization: A Case Study on Blood Disorders.

    PubMed

    Grewal, Nivit; Singh, Shailendra; Chand, Trilok

    2017-01-01

    Owing to the innate noise in the biological data sources, a single source or a single measure does not suffice for effective disease gene prioritization. The integration of multiple data sources, or the aggregation of multiple measures, is therefore needed. Aggregation operators combine multiple related data values into a single value such that the combined value reflects all the individual values. In this paper, an attempt has been made to apply fuzzy aggregation to network-based disease gene prioritization and to investigate its effect under noisy conditions. This study has been conducted for a set of 15 blood disorders by fusing four different network measures, computed from the protein interaction network, using a selected set of aggregation operators and ranking the genes on the basis of the aggregated value. The aggregation operator-based rankings have been compared with the "Random walk with restart" gene prioritization method. The impact of noise has also been investigated by adding varying proportions of noise to the seed set. The results reveal that for all the selected blood disorders, the Mean of Maximal operator outperformed the other aggregation operators for noisy as well as non-noisy data.
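
    The fuse-then-rank step described above can be sketched as follows. This is a minimal sketch, assuming min-max scaling of each measure and generic operators (arithmetic mean, minimum, maximum) as stand-ins; the abstract does not define the operators it evaluated, including the Mean of Maximal operator, so none of them is reproduced here. The measure names and values are invented for illustration.

```python
from statistics import mean


def min_max(scores):
    """Rescale one measure's gene scores to the interval [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {g: 0.0 if span == 0 else (v - lo) / span for g, v in scores.items()}


def aggregate_and_rank(measures, operator=mean):
    """Fuse several per-gene measures with `operator` and rank best-first.

    `measures` maps a measure name to a {gene: value} dict. Only genes
    present in every measure are ranked.
    """
    normalized = [min_max(scores) for scores in measures.values()]
    genes = set.intersection(*(set(n) for n in normalized))
    combined = {g: operator([n[g] for n in normalized]) for g in genes}
    # Sort by descending aggregated value, then by name for stable ties.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical network measures for three genes.
measures = {
    "degree": {"A": 10, "B": 2, "C": 6},
    "betweenness": {"A": 8, "B": 0, "C": 4},
}
for op in (mean, min, max):
    print(op.__name__, aggregate_and_rank(measures, operator=op))
```

    Passing the operator as a parameter keeps the ranking code unchanged while different aggregation rules are compared, which mirrors the comparison the study describes.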

  7. RANGER-DTL 2.0: Rigorous Reconstruction of Gene-Family Evolution by Duplication, Transfer, and Loss.

    PubMed

    Bansal, Mukul S; Kellis, Manolis; Kordi, Misagh; Kundu, Soumya

    2018-04-24

    RANGER-DTL 2.0 is a software program for inferring gene family evolution using Duplication-Transfer-Loss reconciliation. This new software is highly scalable and easy to use, and offers many new features not currently available in any other reconciliation program. RANGER-DTL 2.0 has a particular focus on reconciliation accuracy and can account for many sources of reconciliation uncertainty including uncertain gene tree rooting, gene tree topological uncertainty, multiple optimal reconciliations, and alternative event cost assignments. RANGER-DTL 2.0 is open-source and written in C++ and Python. Pre-compiled executables, source code (open-source under GNU GPL), and a detailed manual are freely available from http://compbio.engr.uconn.edu/software/RANGER-DTL/. Contact: mukul.bansal@uconn.edu.

  8. shinyGISPA: A web application for characterizing phenotype by gene sets using multiple omics data combinations.

    PubMed

    Dwivedi, Bhakti; Kowalski, Jeanne

    2018-01-01

    While many methods exist for integrating multi-omics data or defining gene sets, there is no one single tool that defines gene sets based on merging of multiple omics data sets. We present shinyGISPA, an open-source application with a user-friendly web-based interface to define genes according to their similarity in several molecular changes that are driving a disease phenotype. This tool was developed to help facilitate the usability of a previously published method, Gene Integrated Set Profile Analysis (GISPA), among researchers with limited computer-programming skills. The GISPA method allows the identification of multiple gene sets that may play a role in the characterization, clinical application, or functional relevance of a disease phenotype. The tool provides an automated workflow that is highly scalable and adaptable to applications that go beyond genomic data merging analysis. It is available at http://shinygispa.winship.emory.edu/shinyGISPA/.

  9. shinyGISPA: A web application for characterizing phenotype by gene sets using multiple omics data combinations

    PubMed Central

    Dwivedi, Bhakti

    2018-01-01

    While many methods exist for integrating multi-omics data or defining gene sets, there is no one single tool that defines gene sets based on merging of multiple omics data sets. We present shinyGISPA, an open-source application with a user-friendly web-based interface to define genes according to their similarity in several molecular changes that are driving a disease phenotype. This tool was developed to help facilitate the usability of a previously published method, Gene Integrated Set Profile Analysis (GISPA), among researchers with limited computer-programming skills. The GISPA method allows the identification of multiple gene sets that may play a role in the characterization, clinical application, or functional relevance of a disease phenotype. The tool provides an automated workflow that is highly scalable and adaptable to applications that go beyond genomic data merging analysis. It is available at http://shinygispa.winship.emory.edu/shinyGISPA/. PMID:29415010

  10. ISOLATED FROM CLINICAL AND ENVIRONMENTAL SOURCES IN NORTHEAST THAILAND.

    PubMed

    Mala, Wanida; Kaewkes, Wanlop; Tattawasart, Unchalee; Wongwajana, Suwin; Faksri, Kiatichai; Chomvarin, Chariya

    2016-09-01

    Emergence of multiple drug resistance in Vibrio cholerae has been increasing around the world, including Northeast Thailand. In this study, 92 isolates of V. cholerae (50 O1 and 42 non-O1/non-O139 isolates) from clinical and environmental sources in Northeast Thailand were randomly selected and investigated for the presence of SXT element, class 1 integron and antimicrobial resistance genes. Genotypic-phenotypic concordance of antimicrobial resistance was also determined. Using PCR-based assays, 79% of V. cholerae isolates were positive for SXT element, whereas only 1% was positive for class 1 integron. SXT element harbored antimicrobial resistance genes, dfrA1 or dfr18, floR, strB, sul2, and tetA. Overall phenotypic-genotypic concordance of antimicrobial resistance was 78%, with the highest and lowest values being for trimethoprim (83%) and chloramphenicol (70%), respectively. Ninety-two percent of V. cholerae O1 strains isolated from clinical sources harbored both dfrA1 (O1-specific trimethoprim resistance gene) and dfr18 (non-O1-specific trimethoprim resistance gene), whereas only 5% of V. cholerae non-O1/non-O139 strains harbored both genes. All V. cholerae O1 isolated from environmental sources harbored dfr18 but 48% of V. cholerae non-O1/non-O139 harbored dfrA1. This study indicates that SXT element was the main contributor to the circulation of multiple-drug resistance determinants in V. cholerae strains in Northeast Thailand and that genetic exchange of SXT element can occur in both V. cholerae O1 and non-O1/non-O139 strains from clinical and environmental sources.

  11. iGC-an integrated analysis package of gene expression and copy number alteration.

    PubMed

    Lai, Yi-Pin; Wang, Liang-Bo; Wang, Wei-An; Lai, Liang-Chuan; Tsai, Mong-Hsun; Lu, Tzu-Pin; Chuang, Eric Y

    2017-01-14

    With the advancement in high-throughput technologies, researchers can simultaneously investigate gene expression and copy number alteration (CNA) data from individual patients at a lower cost. Traditional analysis methods analyze each type of data individually and integrate their results using Venn diagrams. Challenges arise, however, when the results are irreproducible and inconsistent across multiple platforms. To address these issues, one possible approach is to concurrently analyze both gene expression profiling and CNAs in the same individual. We have developed an open-source R/Bioconductor package (iGC). Multiple input formats are supported and users can define their own criteria for identifying differentially expressed genes driven by CNAs. The analysis of two real microarray datasets demonstrated that the CNA-driven genes identified by the iGC package showed significantly higher Pearson correlation coefficients with their gene expression levels and copy numbers than those genes located in a genomic region with CNA. Compared with the Venn diagram approach, the iGC package showed better performance. The iGC package is effective and useful for identifying CNA-driven genes. By simultaneously considering both comparative genomic and transcriptomic data, it can provide better understanding of biological and medical questions. The iGC package's source code and manual are freely available at https://www.bioconductor.org/packages/release/bioc/html/iGC.html .

  12. Endeavour update: a web resource for gene prioritization in multiple species

    PubMed Central

    Tranchevent, Léon-Charles; Barriot, Roland; Yu, Shi; Van Vooren, Steven; Van Loo, Peter; Coessens, Bert; De Moor, Bart; Aerts, Stein; Moreau, Yves

    2008-01-01

    Endeavour (http://www.esat.kuleuven.be/endeavourweb; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes. Using a training set of genes known to be involved in a biological process of interest, our approach consists of (i) inferring several models (based on various genomic data sources), (ii) applying each model to the candidate genes to rank those candidates against the profile of the known genes and (iii) merging the several rankings into a global ranking of the candidate genes. In the present article, we describe the latest developments of Endeavour. First, we provide a web-based user interface, besides our Java client, to make Endeavour more universally accessible. Second, we support multiple species: in addition to Homo sapiens, we now provide gene prioritization for three major model organisms: Mus musculus, Rattus norvegicus and Caenorhabditis elegans. Third, Endeavour makes use of additional data sources and now includes numerous databases: ontologies and annotations, protein–protein interactions, cis-regulatory information, gene expression data sets, sequence information and text-mining data. We tested the novel version of Endeavour on 32 recent disease gene associations from the literature. Additionally, we describe a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis. PMID:18508807
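
    Step (iii) above, merging several per-source rankings into one global ranking, can be illustrated with a deliberately simple rule. This is a minimal sketch, assuming a mean rank ratio (position divided by list length, averaged over the sources that rank the gene); the abstract does not state the fusion formula Endeavour uses, so this is a stand-in and not the tool's algorithm. The gene labels are hypothetical.

```python
def merge_rankings(rankings):
    """Fuse best-first gene rankings into one global best-first ranking.

    Each ranking is a list of gene names ordered from most to least
    promising. A gene may be absent from some rankings (for example when a
    data source has no record for it); its score is then averaged over the
    rankings in which it does appear.
    """
    ratios = {}
    for ranking in rankings:
        n = len(ranking)
        for position, gene in enumerate(ranking, start=1):
            ratios.setdefault(gene, []).append(position / n)
    # Lower mean rank ratio is better; the gene name breaks exact ties.
    return sorted(ratios, key=lambda g: (sum(ratios[g]) / len(ratios[g]), g))


# Three hypothetical per-source rankings of the same candidates.
per_source = [
    ["G1", "G2", "G3"],   # e.g. from an expression-based model
    ["G1", "G3", "G2"],   # e.g. from an interaction-based model
    ["G2", "G1"],         # a source with no data for G3
]
print(merge_rankings(per_source))  # ['G1', 'G2', 'G3']
```

    Working with rank ratios rather than raw scores lets sources with different score scales and different coverage contribute on comparable terms.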

  13. SOURCES OF VARIATION IN BASELINE GENE EXPRESSION LEVELS FROM TOXICOGENOMIC STUDY CONTROL ANIMALS ACROSS MULTIPLE LABORATORIES

    EPA Science Inventory

    Variations in study design are typical for toxicogenomic studies, but their impact on gene expression in control animals has not been well characterized. A dataset of control animal microarray expression data was assembled by a working group of the Health and Environmental Scienc...

  14. Multiple Antibiotic Resistance Gene Transfer from Animal to Human Enterococci in the Digestive Tract of Gnotobiotic Mice

    PubMed Central

    Moubareck, C.; Bourgeois, N.; Courvalin, P.; Doucet-Populaire, F.

    2003-01-01

    It has been proposed that food animals represent the source of glycopeptide resistance genes present in enterococci from humans. We demonstrated the transfer of vanA and of other resistance genes from porcine to human Enterococcus faecium at high frequency in the digestive tract of gnotobiotic mice. Tylosin in the drinking water favored colonization by transconjugants. PMID:12937011

  15. The Natural History of Class I Primate Alcohol Dehydrogenases Includes Gene Duplication, Gene Loss, and Gene Conversion

    PubMed Central

    Carrigan, Matthew A.; Uryasev, Oleg; Davis, Ross P.; Zhai, LanMin; Hurley, Thomas D.; Benner, Steven A.

    2012-01-01

    Background Gene duplication is a source of molecular innovation throughout evolution. However, even with massive amounts of genome sequence data, correlating gene duplication with speciation and other events in natural history can be difficult. This is especially true in its most interesting cases, where rapid and multiple duplications are likely to reflect adaptation to rapidly changing environments and life styles. This may be so for Class I of alcohol dehydrogenases (ADH1s), where multiple duplications occurred in primate lineages in Old and New World monkeys (OWMs and NWMs) and hominoids. Methodology/Principal Findings To build a preferred model for the natural history of ADH1s, we determined the sequences of nine new ADH1 genes, finding for the first time multiple paralogs in various prosimians (lemurs, strepsirhines). Database mining then identified novel ADH1 paralogs in both macaque (an OWM) and marmoset (a NWM). These were used with the previously identified human paralogs to resolve controversies relating to dates of duplication and gene conversion in the ADH1 family. Central to these controversies are differences in the topologies of trees generated from exonic (coding) sequences and intronic sequences. Conclusions/Significance We provide evidence that gene conversions are the primary source of difference, using molecular clock dating of duplications and analyses of microinsertions and deletions (micro-indels). The tree topology inferred from intron sequences appears to more correctly represent the natural history of ADH1s, with the ADH1 paralogs in platyrrhines (NWMs) and catarrhines (OWMs and hominoids) having arisen by duplications shortly predating the divergence of OWMs and NWMs. We also conclude that paralogs in lemurs arose independently. Finally, we identify errors in database interpretation as the source of controversies concerning gene conversion. These analyses provide a model for the natural history of ADH1s that posits four ADH1 paralogs in the ancestor of Catarrhine and Platyrrhine primates, followed by the loss of an ADH1 paralog in the human lineage. PMID:22859968

  16. EVIDENCE FOR LANDSCAPE LEVEL, POLLEN-MEDIATED GENE FLOW FROM CREEPING BENTGRASS WITH CP4 EPSPS AS A MARKER

    EPA Science Inventory

    In a landscape level study, gene flow via pollen was tracked from multiple source fields of genetically modified (GM) herbicide resistant creeping bentgrass (Agrostis stolonifera L.) to 75 of 138 sentinel plants of A. stolonifera and to 29 of 69 resident populations of Agrostis s...

  17. Female pseudohermaphroditism with multiple caudal anomalies: Absence of Y-specific DNA sequences as pathogenetic factors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seaver, L.H.; Grimes, J.; Erickson, R.P.

    1994-05-15

    46,XX female pseudohermaphrodites have been previously described with nearly complete masculinization of the external genitalia and no apparent source of testosterone. Multiple malformations of internal genital, urinary, and gastrointestinal tracts are associated. We have evaluated four such infants with female pseudohermaphroditism and multiple caudal anomalies. Three cases had apparently normal chromosomes (46,XX); one had a 46,XX,del(10)(q25.3→qter) chromosome constitution. The chromosome breakpoint is in the region of PAX2, a developmentally important paired box gene which is expressed in urogenital tissue. Using the polymerase chain reaction, we screened for the presence of multiple Y-specific sequences, including SRY (sex determining region, Y chromosome), that could explain masculinization of the external genitalia. All were negative for Y centromeric sequences, ZFY (Zinc finger Y), and SRY. Furthermore, there was no evidence for adrenal or other sources of testosterone. We suggest that the masculinization in these cases is the result of abnormal expression of genes which would normally be regulated by testosterone. 32 refs., 1 fig., 2 tabs.

  18. A Prototype System for Retrieval of Gene Functional Information

    PubMed Central

    Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.

    2003-01-01

    Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346

  19. Combining Evidence of Preferential Gene-Tissue Relationships from Multiple Sources

    PubMed Central

    Guo, Jing; Hammar, Mårten; Öberg, Lisa; Padmanabhuni, Shanmukha S.; Bjäreland, Marcus; Dalevi, Daniel

    2013-01-01

    An important challenge in drug discovery and disease prognosis is to predict genes that are preferentially expressed in one or a few tissues, i.e. genes showing considerably higher expression in one or a few tissues than in the others. Although several data sources and methods have been published explicitly for this purpose, they often disagree and it is not evident how to retrieve these genes and how to distinguish true biological findings from those that are due to choice-of-method and/or experimental settings. In this work we have developed a computational approach that combines results from multiple methods and datasets with the aim of eliminating method/study-specific biases and improving the predictability of preferentially expressed human genes. A rule-based score is used to merge and assign support to the results. Five sets of genes with known tissue specificity were used for parameter pruning and cross-validation. In total we identify 3434 tissue-specific genes. We compare the highest-scoring genes with the public databases PaGenBase (microarray), TiGER (EST) and HPA (protein expression data). The results show 85% overlap with PaGenBase, 71% with TiGER and only 28% with HPA. 99% of our predictions have support from at least one of these databases. Our approach also performs better than any of the databases on identifying drug targets and biomarkers with known tissue-specificity. PMID:23950964

  20. Linking genes to diseases with a SNPedia-Gene Wiki mashup

    PubMed Central

    2012-01-01

    Background A variety of topic-focused wikis are used in the biomedical sciences to enable the mass-collaborative synthesis and distribution of diverse bodies of knowledge. To address complex problems such as defining the relationships between genes and disease, it is important to bring the knowledge from many different domains together. Here we show how advances in wiki technology and natural language processing can be used to automatically assemble ‘meta-wikis’ that present integrated views over the data collaboratively created in multiple source wikis. Results We produced a semantic meta-wiki called the Gene Wiki+ that automatically mirrors and integrates data from the Gene Wiki and SNPedia. The Gene Wiki+, available at (http://genewikiplus.org/), captures 8,047 distinct gene-disease relationships. SNPedia accounts for 4,149 of the gene-disease pairs, the Gene Wiki provides 4,377 and only 479 appear independently in both sources. All of this content is available to query and browse and is provided as linked open data. Conclusions Wikis contain increasing amounts of diverse, biological information useful for elucidating the connections between genes and disease. The Gene Wiki+ shows how wiki technology can be used in concert with natural language processing to provide integrated views over diverse underlying data sources. PMID:22541597
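
    The counts reported above are mutually consistent under inclusion-exclusion: the pairs contributed by each source, minus the pairs found in both, give the number of distinct pairs. The short check below uses only the figures quoted in the abstract; the variable names are illustrative.

```python
# Figures quoted in the abstract above.
snpedia_pairs = 4149      # gene-disease pairs found in SNPedia
gene_wiki_pairs = 4377    # gene-disease pairs found in the Gene Wiki
shared_pairs = 479        # pairs that appear in both sources

# |A union B| = |A| + |B| - |A intersect B|
total_distinct = snpedia_pairs + gene_wiki_pairs - shared_pairs
shared_fraction = shared_pairs / total_distinct

print(total_distinct)            # 8047
print(f"{shared_fraction:.1%}")  # share of distinct pairs present in both
```

    Only about 6% of the distinct pairs appear in both wikis, which is the quantitative basis for the abstract's point that the two sources are largely complementary.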

  1. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    PubMed Central

    Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J

    2008-01-01

    Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. 
    A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client side software setup or installation required. Source code is freely available to researchers interested in setting up a local version of PSAT for analysis of genomes not available through the public server. Access to the public web server and instructions for obtaining source code can be found at . PMID:18366802

  2. TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources

    PubMed Central

    2011-01-01

    Background Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies if segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells. 
Biologically relevant chromosomal segments and gene clusters with differential expression during the differentiation toward megakaryocyte were identified. Conclusions TRAM is designed to create, and statistically analyze, quantitative transcriptome maps, based on gene expression data from multiple sources. The release includes FileMaker Pro database management runtime application and it is freely available at http://apollo11.isto.unibo.it/software/, along with preconfigured implementations for mapping of human, mouse and zebrafish transcriptomes. PMID:21333005
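
    The inter-sample normalization step mentioned above can be illustrated with a minimal sketch of standard quantile normalization. The abstract does not describe how TRAM's "scaled quantile" variant works, so this example shows only the ordinary method, and the function name `quantile_normalize` is hypothetical. It assumes a complete genes-by-samples matrix with no missing values.

```python
import numpy as np

def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Illustrative only: this is plain quantile normalization, not TRAM's
    'scaled quantile' variant, whose details the abstract does not give.
    """
    matrix = np.asarray(matrix, dtype=float)
    # Rank of each value within its own column (ties broken by row order).
    order = np.argsort(matrix, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: mean of the sorted columns at each rank.
    reference = np.sort(matrix, axis=0).mean(axis=1)
    # Replace every value by the reference value at its rank.
    return reference[ranks]

if __name__ == "__main__":
    expression = [[5, 4, 3],
                  [2, 1, 4],
                  [3, 4, 6],
                  [4, 2, 8]]
    print(quantile_normalize(expression))
```

    After normalization every sample has the same set of values, so differences between samples reflect only the rank order of the genes.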

  3. Antimicrobial resistance of Escherichia coli isolates from broiler chickens and humans

    PubMed Central

    Miles, Tricia D; McLaughlin, Wayne; Brown, Paul D

    2006-01-01

    Background Antimicrobial usage is considered the most important factor promoting the emergence, selection and dissemination of antimicrobial-resistant microorganisms in both veterinary and human medicine. The aim of this study was to investigate the prevalence and genetic basis of tetracycline resistance in faecal Escherichia coli isolates from healthy broiler chickens and compare these data with isolates obtained from hospitalized patients in Jamaica. Results Eighty-two E. coli strains isolated from faecal samples of broiler chickens and urine and wound specimens of hospitalized patients were analyzed by agar disc diffusion to determine their susceptibility patterns to 11 antimicrobial agents. Tetracycline resistance determinants were investigated by plasmid profiling, transformations, and amplification of plasmid-borne resistance genes. Tetracycline resistance occurred at a frequency of 82.4% in avian isolates compared to 43.8% in human isolates. In addition, among avian isolates there was a trend towards higher resistance frequencies to kanamycin and nalidixic acid (p < 0.05), while a greater percentage of human isolates were resistant to chloramphenicol and gentamicin (p < 0.05). Multiple drug resistance was found in isolates from both sources and was usually associated with tetracycline resistance. Tetracycline-resistant isolates from both avian and human sources contained one or several plasmids, which were transmissible by transformation of chemically-competent E. coli. Tetracycline resistance was mediated by efflux genes tetB and/or tetD. Conclusion The present study highlights the prevalence of multiple drug resistant E. coli among healthy broiler chickens in Jamaica, possibly associated with expression of tetracycline resistance. While there did not appear to be a common source for multiple drug resistance in the strains from avian or human origin, the genes encoding resistance are similar. 
These results suggest that genes are disseminated in the environment and warrant further investigation of the possibility for avian sources acting as reservoirs for tetracycline resistance. PMID:16460561

  4. Discovering perturbation of modular structure in HIV progression by integrating multiple data sources through non-negative matrix factorization.

    PubMed

    Ray, Sumanta; Maulik, Ujjwal

    2016-12-20

    Detecting perturbation in modular structure during HIV-1 disease progression is an important step toward understanding the stage-specific infection pattern of HIV-1 in human cells. In this article, we propose a novel methodology that integrates multiple types of biological information to identify such disruption in human gene modules during different stages of HIV-1 infection. We integrate three types of biological information (gene expression, protein-protein interaction (PPI) and gene ontology (GO)) into single gene meta-modules through non-negative matrix factorization (NMF). Because the identified meta-modules inherit this information, detecting their perturbation reflects changes in expression pattern, in PPI structure and in functional similarity of genes as the infection progresses. NMF-based clustering is used to integrate modules from the different data sources into strong meta-modules. Perturbation in meta-modular structure is identified by investigating topological and intramodular properties and ranking the meta-modules with a rank aggregation algorithm. We have also analyzed the preservation structure of significant GO terms in which the human proteins of the meta-modules participate. Moreover, we have performed an analysis to show the change in the coregulation pattern of identified transcription factors (TFs) over the HIV progression stages.
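
    As a rough illustration of NMF-based clustering in general, the sketch below factorizes a non-negative genes-by-features matrix with the standard multiplicative updates and assigns each gene to the component with its largest loading. How the authors build their input matrix from the three data sources is not stated in the abstract, so the toy matrix and the names `nmf` and `assign_clusters` are assumptions made for this example.

```python
import numpy as np

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize non-negative V (n x m) as W (n x rank) @ H (rank x m).

    Uses the standard multiplicative updates for the Frobenius-norm
    objective; a generic sketch, not the authors' pipeline.
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iterations):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def assign_clusters(W):
    """Assign each row (gene) to the component with the largest loading."""
    return W.argmax(axis=1)

if __name__ == "__main__":
    # Toy matrix: genes 0-2 share features 0-1, genes 3-5 share features 2-3.
    V = np.array([[1, 1, 0, 0]] * 3 + [[0, 0, 1, 1]] * 3, dtype=float)
    W, H = nmf(V, rank=2)
    print(assign_clusters(W))
```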

  5. Cytokinins and Expression of SWEET, SUT, CWINV and AAP Genes Increase as Pea Seeds Germinate

    PubMed Central

    Jameson, Paula E.; Dhandapani, Pragatheswari; Novak, Ondrej; Song, Jiancheng

    2016-01-01

    Transporter genes and cytokinins are key targets for crop improvement. These genes are active during the development of the seed and its establishment as a strong sink. However, during germination, the seed transitions to being a source for the developing root and shoot. To determine if the sucrose transporter (SUT), amino acid permease (AAP), Sugar Will Eventually be Exported Transporter (SWEET), cell wall invertase (CWINV), cytokinin biosynthesis (IPT), activation (LOG) and degradation (CKX) gene family members are involved in both the sink and source activities of seeds, we used RT-qPCR to determine the expression of multiple gene family members, and LC-MS/MS to ascertain endogenous cytokinin levels in germinating Pisum sativum L. We show that genes that are actively expressed when the seed is a strong sink during its development, are also expressed when the seed is in the reverse role of being an active source during germination and early seedling growth. Cytokinins were detected in the imbibing seeds and were actively biosynthesised during germination. We conclude that, when the above gene family members are targeted for seed yield improvement, a downstream effect on subsequent seed germination or seedling vigour must be taken into consideration. PMID:27916945

  6. DLRS: gene tree evolution in light of a species tree.

    PubMed

    Sjöstrand, Joel; Sennblad, Bengt; Arvestad, Lars; Lagergren, Jens

    2012-11-15

    PrIME-DLRS (or colloquially: 'Delirious') is a phylogenetic software tool to simultaneously infer and reconcile a gene tree given a species tree. It accounts for duplication and loss events, a relaxed molecular clock and is intended for the study of homologous gene families, for example in a comparative genomics setting involving multiple species. PrIME-DLRS uses a Bayesian MCMC framework, where the input is a known species tree with divergence times and a multiple sequence alignment, and the output is a posterior distribution over gene trees and model parameters. PrIME-DLRS is available for Java SE 6+ under the New BSD License, and JAR files and source code can be downloaded from http://code.google.com/p/jprime/. There is also a slightly older C++ version available as a binary package for Ubuntu, with download instructions at http://prime.sbc.su.se. The C++ source code is available upon request. Contact: joel.sjostrand@scilifelab.se or jens.lagergren@scilifelab.se. PrIME-DLRS is based on a sound probabilistic model (Åkerborg et al., 2009) and has been thoroughly validated on synthetic and biological datasets (Supplementary Material online).

  7. GENIE: a software package for gene-gene interaction analysis in genetic association studies using multiple GPU or CPU cores.

    PubMed

    Chikkagoudar, Satish; Wang, Kai; Li, Mingyao

    2011-05-26

    Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/.
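
    The partitioning scheme described above can be sketched as follows. This is not GENIE's code: the function names and fragment size are hypothetical, and the sketch only enumerates the SNP pairs that each independent task would analyze. It pairs a fragment only with later fragments, an assumption made here so that every pair is generated exactly once.

```python
from itertools import combinations, product

def partition_snps(snp_ids, fragment_size):
    """Split SNP identifiers into consecutive, non-overlapping fragments."""
    return [snp_ids[i:i + fragment_size]
            for i in range(0, len(snp_ids), fragment_size)]

def interaction_tasks(fragments):
    """Yield (label, pairs) tasks that together cover each SNP pair once."""
    for i, fragment in enumerate(fragments):
        # 1) pairs of SNPs inside the current fragment
        yield ("within", i, i), list(combinations(fragment, 2))
        # 2) pairs between the current fragment and each later fragment
        for j in range(i + 1, len(fragments)):
            yield ("between", i, j), list(product(fragment, fragments[j]))

if __name__ == "__main__":
    snps = [f"rs{k}" for k in range(1, 8)]  # made-up identifiers
    for label, pairs in interaction_tasks(partition_snps(snps, 3)):
        print(label, len(pairs))
```

    Because the tasks share no state, each one could in principle be handed to a separate CPU or GPU core, which is the property the abstract relies on for parallelization.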

  8. GENIE: a software package for gene-gene interaction analysis in genetic association studies using multiple GPU or CPU cores

    PubMed Central

    2011-01-01

    Background Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Findings Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. Conclusions GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/. PMID:21615923

  9. Comparing wastewater chemicals, indicator bacteria concentrations, and bacterial pathogen genes as fecal pollution indicators

    USGS Publications Warehouse

    Haack, S.K.; Duris, J.W.; Fogarty, L.R.; Kolpin, D.W.; Focazio, M.J.; Furlong, E.T.; Meyer, M.T.

    2009-01-01

    The objective of this study was to compare fecal indicator bacteria (FIB) (fecal coliforms, Escherichia coli [EC], and enterococci [ENT]) concentrations with a wide array of typical organic wastewater chemicals and selected bacterial genes as indicators of fecal pollution in water samples collected at or near 18 surface water drinking water intakes. Genes tested included esp (indicating human-pathogenic ENT) and nine genes associated with various animal sources of shiga-toxin-producing EC (STEC). Fecal pollution was indicated by genes and/or chemicals for 14 of the 18 tested samples, with little relation to FIB standards. Of 13 samples with <50 EC 100 mL⁻¹, human pharmaceuticals or chemical indicators of wastewater treatment plant effluent occurred in six, veterinary antibiotics were detected in three, and stx1 or stx2 genes (indicating varying animal sources of STEC) were detected in eight. Only the EC eaeA gene was positively correlated with FIB concentrations. Human-source fecal pollution was indicated by the esp gene and the human pharmaceutical carbamazepine in one of the nine samples that met all FIB recreational water quality standards. Escherichia coli rfbO157 and stx2c genes, which are typically associated with cattle sources and are of potential human health significance, were detected in one sample in the absence of tested chemicals. Chemical and gene-based indicators of fecal contamination may be present even when FIB standards are met, and some may, unlike FIB, indicate potential sources. Application of multiple water quality indicators with variable environmental persistence and fate may yield greater confidence in fecal pollution assessment and may inform remediation decisions. Copyright © 2009 by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America. All rights reserved.

  10. Chum salmon egg extracts induce upregulation of collagen type I and exert antioxidative effects on human dermal fibroblast cultures.

    PubMed

    Yoshino, Atsushi; Polouliakh, Natalia; Meguro, Akira; Takeuchi, Masaki; Kawagoe, Tatsukata; Mizuki, Nobuhisa

    2016-01-01

    Components of fish roe possess antioxidant and antiaging activities, making them potentially very beneficial natural resources. Here, we investigated chum salmon eggs (CSEs) as a source of active ingredients, including vitamins, unsaturated fatty acids, and proteins. We incubated human dermal fibroblast cultures for 48 hours with high and low concentrations of CSE extracts and analyzed changes in gene expression. Cells treated with CSE extract showed concentration-dependent upregulation of collagen type I genes and of multiple antioxidative genes, including OXR1, TXNRD1, and PRDX family genes. We further conducted in silico phylogenetic footprinting analysis of promoter regions. These results suggested that transcription factors such as acute myeloid leukemia-1a and cyclic adenosine monophosphate response element-binding protein may be involved in the observed upregulation of antioxidative genes. Our results support the idea that CSEs are strong candidate sources of antioxidant materials and cosmeceutically effective ingredients.

  11. Ultrafiltration and Microarray for Detection of Microbial Source Tracking Marker and Pathogen Genes in Riverine and Marine Systems

    PubMed Central

    Li, Xiang; Harwood, Valerie J.; Nayak, Bina

    2016-01-01

    Pathogen identification and microbial source tracking (MST) to identify sources of fecal pollution improve evaluation of water quality. They contribute to improved assessment of human health risks and remediation of pollution sources. An MST microarray was used to simultaneously detect genes for multiple pathogens and indicators of fecal pollution in freshwater, marine water, sewage-contaminated freshwater and marine water, and treated wastewater. Dead-end ultrafiltration (DEUF) was used to concentrate organisms from water samples, yielding a recovery efficiency of >95% for Escherichia coli and human polyomavirus. Whole-genome amplification (WGA) increased gene copies from ultrafiltered samples and increased the sensitivity of the microarray. Viruses (adenovirus, bocavirus, hepatitis A virus, and human polyomaviruses) were detected in sewage-contaminated samples. Pathogens such as Legionella pneumophila, Shigella flexneri, and Campylobacter fetus were detected along with genes conferring resistance to aminoglycosides, beta-lactams, and tetracycline. Nonmetric dimensional analysis of MST marker genes grouped sewage-spiked freshwater and marine samples with sewage and apart from other fecal sources. The sensitivity (percent true positives) of the microarray probes for gene targets anticipated in sewage was 51 to 57% and was lower than the specificity (percent true negatives; 79 to 81%). A linear relationship between gene copies determined by quantitative PCR and microarray fluorescence was found, indicating the semiquantitative nature of the MST microarray. These results indicate that ultrafiltration coupled with WGA provides sufficient nucleic acids for detection of viruses, bacteria, protozoa, and antibiotic resistance genes by the microarray in applications ranging from beach monitoring to risk assessment. PMID:26729716
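
    The sensitivity and specificity figures quoted above follow the usual definitions (percent true positives and percent true negatives). A minimal sketch of the arithmetic is given below; the counts in the usage example are invented for illustration and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = true positives / all expected positives
    specificity = true negatives / all expected negatives
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

if __name__ == "__main__":
    # Hypothetical probe counts, chosen only to demonstrate the calculation.
    sens, spec = sensitivity_specificity(tp=54, fn=46, tn=80, fp=20)
    print(f"sensitivity={sens:.0%} specificity={spec:.0%}")
```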

  12. Prokaryotic Gene Clusters: A Rich Toolbox for Synthetic Biology

    PubMed Central

    Fischbach, Michael; Voigt, Christopher A.

    2014-01-01

    Bacteria construct elaborate nanostructures, obtain nutrients and energy from diverse sources, synthesize complex molecules, and implement signal processing to react to their environment. These complex phenotypes require the coordinated action of multiple genes, which are often encoded in a contiguous region of the genome, referred to as a gene cluster. Gene clusters sometimes contain all of the genes necessary and sufficient for a particular function. As an evolutionary mechanism, gene clusters facilitate the horizontal transfer of the complete function between species. Here, we review recent work on a number of clusters whose functions are relevant to biotechnology. Engineering these clusters has been hindered by their regulatory complexity, the need to balance the expression of many genes, and a lack of tools to design and manipulate DNA at this scale. Advances in synthetic biology will enable the large-scale bottom-up engineering of the clusters to optimize their functions, wake up cryptic clusters, or to transfer them between organisms. Understanding and manipulating gene clusters will move towards an era of genome engineering, where multiple functions can be “mixed-and-matched” to create a designer organism. PMID:21154668

  13. PyPanda: a Python package for gene regulatory network reconstruction

    PubMed Central

    van IJzendoorn, David G.P.; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L.

    2016-01-01

    Summary: PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of ‘omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. Availability and implementation: The open source PyPanda Python package is freely available at http://github.com/davidvi/pypanda. Contact: mkuijjer@jimmy.harvard.edu or d.g.p.van_ijzendoorn@lumc.nl PMID:27402905

  14. PyPanda: a Python package for gene regulatory network reconstruction.

    PubMed

    van IJzendoorn, David G P; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L

    2016-11-01

    PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of 'omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. The open source PyPanda Python package is freely available at http://github.com/davidvi/pypanda. Contact: mkuijjer@jimmy.harvard.edu or d.g.p.van_ijzendoorn@lumc.nl. © The Author 2016. Published by Oxford University Press.

  15. A physical mechanism of cancer heterogeneity

    NASA Astrophysics Data System (ADS)

    Chen, Cong; Wang, Jin

    2016-02-01

    We studied a core cancer gene regulatory network motif to uncover possible sources of cancer heterogeneity arising from epigenetics. When the time scale of the protein regulation to the gene is faster than that of protein synthesis and degradation (adiabatic regime), a normal state, a cancer state and an intermediate premalignant state emerge. Due to epigenetic effects such as DNA methylation and histone remodification, the time scale of the protein regulation to the gene can be slower than or comparable to that of protein synthesis and degradation (non-adiabatic regime). In this case, many more states emerge as possible phenotype alternations. This gives the origin of the heterogeneity. The cancer heterogeneity is reflected in the emergence of more phenotypic states, larger protein concentration fluctuations, wider kinetic distributions and multiplicity of kinetic paths from the normal to the cancer state, higher energy cost per gene switching, and weaker stability.

  16. Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of "towards analysis" in chronic lymphocytic leukaemia.

    PubMed

    Kavakiotis, Ioannis; Xochelli, Aliki; Agathangelidis, Andreas; Tsoumakas, Grigorios; Maglaveras, Nicos; Stamatopoulos, Kostas; Hadzidimitriou, Anastasia; Vlahavas, Ioannis; Chouvarda, Ioanna

    2016-06-06

    Somatic Hypermutation (SHM) refers to the introduction of mutations within rearranged V(D)J genes, a process that increases the diversity of Immunoglobulins (IGs). The analysis of SHM has offered critical insight into the physiology and pathology of B cells, leading to strong prognostication markers for clinical outcome in chronic lymphocytic leukaemia (CLL), the most frequent adult B-cell malignancy. In this paper we present a methodology for integrating multiple immunogenetic and clinicobiological data sources in order to extract features and create high quality datasets for SHM analysis in IG receptors of CLL patients. This dataset is used as the basis for a higher level integration procedure, inspired by social choice theory. This is applied in the Towards Analysis, our attempt to investigate the potential ontogenetic transformation of genes belonging to specific stereotyped CLL subsets towards other genes or gene families, through SHM. The data integration process, followed by feature extraction, resulted in the generation of a dataset containing information about mutations occurring through SHM. The Towards Analysis, performed on the integrated dataset by applying voting techniques, revealed the distinct behaviour of subset #201 compared to other subsets, as regards SHM related movements among gene clans, both in allele-conserved and non-conserved gene areas. With respect to movement between genes, a high percentage of movement towards pseudo genes was found in all CLL subsets. This data integration and feature extraction process can set the basis for exploratory analysis or a fully automated computational data mining approach on many as yet unanswered, clinically relevant biological questions.

  17. Gene finding in metatranscriptomic sequences.

    PubMed

    Ismail, Wazim Mohammed; Ye, Yuzhen; Tang, Haixu

    2014-01-01

    Metatranscriptomic sequencing is a highly sensitive bioassay of functional activity in a microbial community, providing complementary information to the metagenomic sequencing of the community. The acquisition of the metatranscriptomic sequences will enable us to refine the annotations of the metagenomes, and to study the gene activities and their regulation in complex microbial communities and their dynamics. In this paper, we present TransGeneScan, a software tool for finding genes in assembled transcripts from metatranscriptomic sequences. By incorporating several features of metatranscriptomic sequencing, including strand-specificity, short intergenic regions, and putative antisense transcripts into a Hidden Markov Model, TransGeneScan can predict a sense transcript containing one or multiple genes (in an operon) or an antisense transcript. We tested TransGeneScan on a mock metatranscriptomic data set containing three known bacterial genomes. The results showed that TransGeneScan performs better than metagenomic gene finders (MetaGeneMark and FragGeneScan) on predicting protein coding genes in assembled transcripts, and achieves comparable or even higher accuracy than gene finders for microbial genomes (Glimmer and GeneMark). These results imply that, with the assistance of metatranscriptomic sequencing, we can obtain a broad and precise picture of the genes (and their functions) in a microbial community. TransGeneScan is available as open-source software on SourceForge at https://sourceforge.net/projects/transgenescan/.

  18. Integrating Multiple Data Sources for Combinatorial Marker Discovery: A Study in Tumorigenesis.

    PubMed

    Bandyopadhyay, Sanghamitra; Mallik, Saurav

    2018-01-01

    Identification of combinatorial markers from multiple data sources is a challenging task in bioinformatics. Here, we propose a novel computational framework for identifying significant combinatorial markers ( s) using both gene expression and methylation data. The gene expression and methylation data are integrated into a single continuous data as well as a (post-discretized) boolean data based on their intrinsic (i.e., inverse) relationship. A novel combined score of methylation and expression data (viz., ) is introduced which is computed on the integrated continuous data for identifying initial non-redundant set of genes. Thereafter, (maximal) frequent closed homogeneous genesets are identified using a well-known biclustering algorithm applied on the integrated boolean data of the determined non-redundant set of genes. A novel sample-based weighted support ( ) is then proposed that is consecutively calculated on the integrated boolean data of the determined non-redundant set of genes in order to identify the non-redundant significant genesets. The top few resulting genesets are identified as potential s. Since our proposed method generates a smaller number of significant non-redundant genesets than those by other popular methods, the method is much faster than the others. Application of the proposed technique on an expression and a methylation data for Uterine tumor or Prostate Carcinoma produces a set of significant combination of markers. We expect that such a combination of markers will produce lower false positives than individual markers.

  19. Predicting Gene Structures from Multiple RT-PCR Tests

    NASA Astrophysics Data System (ADS)

    Kováč, Jakub; Vinař, Tomáš; Brejová, Broňa

    It has been demonstrated that the use of additional information such as ESTs and protein homology can significantly improve accuracy of gene prediction. However, many sources of external information are still being omitted from consideration. Here, we investigate the use of product lengths from RT-PCR experiments in gene finding. We present hardness results and practical algorithms for several variants of the problem and apply our methods to a real RT-PCR data set in the Drosophila genome. We conclude that the use of RT-PCR data can improve the sensitivity of gene prediction and locate novel splicing variants.

  20. SNPGenie: estimating evolutionary parameters to detect natural selection using pooled next-generation sequencing data.

    PubMed

    Nelson, Chase W; Moncla, Louise H; Hughes, Austin L

    2015-11-15

    New applications of next-generation sequencing technologies use pools of DNA from multiple individuals to estimate population genetic parameters. However, no publicly available tools exist to analyse single-nucleotide polymorphism (SNP) calling results directly for evolutionary parameters important in detecting natural selection, including nucleotide diversity and gene diversity. We have developed SNPGenie to fill this gap. The user submits a FASTA reference sequence(s), a Gene Transfer Format (.GTF) file with CDS information and a SNP report(s) in an increasing selection of formats. The program estimates nucleotide diversity, distance from the reference and gene diversity. Sites are flagged for multiple overlapping reading frames, and are categorized by polymorphism type: nonsynonymous, synonymous, or ambiguous. The results allow single nucleotide, single codon, sliding window, whole gene and whole genome/population analyses that aid in the detection of positive and purifying natural selection in the source population. SNPGenie version 1.2 is a Perl program with no additional dependencies. It is free, open-source, and available for download at https://github.com/hugheslab/snpgenie. Contact: nelsoncw@email.sc.edu or austin@biol.sc.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
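
    To make the notion of nucleotide diversity concrete, the sketch below computes the mean number of pairwise differences at a single site from pooled read counts, then averages over sites. It is written in Python for illustration and is not SNPGenie's Perl implementation; the function names and input layout are assumptions, and the sketch does not separate nonsynonymous from synonymous differences as the program does.

```python
from itertools import combinations

def site_nucleotide_diversity(allele_counts):
    """Fraction of read pairs that differ at one site.

    allele_counts maps each nucleotide to its read count, e.g. {"A": 3, "G": 1}.
    """
    counts = [c for c in allele_counts.values() if c > 0]
    n = sum(counts)
    if n < 2:
        return 0.0  # no pairs to compare
    differing_pairs = sum(a * b for a, b in combinations(counts, 2))
    total_pairs = n * (n - 1) / 2
    return differing_pairs / total_pairs

def mean_nucleotide_diversity(sites):
    """Average per-site diversity over a non-empty list of count dicts."""
    return sum(site_nucleotide_diversity(s) for s in sites) / len(sites)

if __name__ == "__main__":
    sites = [{"A": 3, "G": 1}, {"A": 10}]  # made-up read counts
    print(mean_nucleotide_diversity(sites))
```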

  1. Loci under selection during multiple range expansions of an invasive plant are mostly population specific, but patterns are associated with climate.

    PubMed

    Zenni, Rafael D; Hoban, Sean M

    2015-07-01

    Identifying the genes underlying rapid evolutionary changes, describing their function and ascertaining the environmental pressures that determine fitness are the central elements needed for understanding of evolutionary processes and phenotypic changes that improve the fitness of populations. It has been hypothesized that rapid adaptive changes in new environments may contribute to the rapid spread and success of invasive plants and animals. As yet, studies of adaptation during invasion are scarce, as is knowledge of the genes underlying adaptation, especially in multiple replicated invasions. Here, we quantified how genotype frequencies change during invasions, resulting in rapid evolution of naturalized populations. We used six fully replicated common garden experiments in Brazil where Pinus taeda (loblolly pine) was introduced at the same time, in the same numbers, from the same seed sources, and has formed naturalized populations expanding outward from the plantations. We used a combination of nonparametric, population genetics and multivariate statistics to detect changes in genotype frequencies along each of the six naturalization gradients and their association with climate as well as shifts in allele frequencies compared to the source populations. Results show 25 genes with significant shifts in genotype frequencies. Six genes had shifts in more than one population. Climate explained 25% of the variation in the groups of genes under selection across all locations, but specific genes under strong selection during invasions did not show climate-related convergence. In conclusion, we detected rapid evolutionary changes during invasive range expansions, but the particular gene-level patterns of evolution may be population specific. © 2015 John Wiley & Sons Ltd.

  2. Analysis and functional annotation of expressed sequence tags (ESTs) from multiple tissues of oil palm (Elaeis guineensis Jacq.)

    PubMed Central

    Ho, Chai-Ling; Kwan, Yen-Yen; Choi, Mei-Chooi; Tee, Sue-Sean; Ng, Wai-Har; Lim, Kok-Ang; Lee, Yang-Ping; Ooi, Siew-Eng; Lee, Weng-Wah; Tee, Jin-Ming; Tan, Siang-Hee; Kulaveerasingam, Harikrishna; Alwee, Sharifah Shahrul Rabiah Syed; Abdullah, Meilina Ong

    2007-01-01

    Background: Oil palm is the second largest source of edible oil, contributing approximately 20% of the world's production of oils and fats. In order to understand the molecular biology involved in in vitro propagation, flowering, efficient utilization of nitrogen sources and root diseases, we have initiated an expressed sequence tag (EST) analysis on oil palm. Results: In this study, six cDNA libraries from oil palm zygotic embryos, suspension cells, shoot apical meristems, young flowers, mature flowers and roots were constructed. We have generated a total of 14537 expressed sequence tags (ESTs) from these libraries, from which 6464 tentative unique contigs (TUCs) and 2129 singletons were obtained. Approximately 6008 of these tentative unique genes (TUGs) have significant matches to the non-redundant protein database, from which 2361 were assigned to one or more Gene Ontology categories. Predominant transcripts and differentially expressed genes were identified in multiple oil palm tissues. Homologues of genes involved in many aspects of flower development were also identified among the EST collection, such as CONSTANS-like, AGAMOUS-like (AGL)2, AGL20, LFY-like, SQUAMOSA, SQUAMOSA binding protein (SBP) etc. The majority of them are the first representatives in oil palm, providing opportunities to explore the cause of epigenetic homeotic flowering abnormality in oil palm, given the importance of flowering in fruit production. The transcript levels of two flowering-related genes, EgSBP and EgSEP, were analysed in the flower tissues of various developmental stages. Gene homologues for enzymes involved in oil biosynthesis, utilization of nitrogen sources, and scavenging of oxygen radicals were also uncovered among the oil palm ESTs. Conclusion: The EST sequences generated will allow comparative genomic studies between oil palm and other monocotyledonous and dicotyledonous plants, development of gene-targeted markers for the reference genetic map, and design and fabrication of DNA arrays for future studies of oil palm. The outcomes of such studies will contribute to oil palm improvement through the establishment of breeding programs using marker-assisted selection, development of diagnostic assays using gene-targeted markers, and discovery of candidate genes related to important agronomic traits of oil palm. PMID:17953740

  3. Preponderance of toxigenic Escherichia coli in stool pathogens correlates with toxin detection in accessible drinking-water sources.

    PubMed

    Igbokwe, H; Bhattacharyya, S; Gradus, S; Khubbar, M; Griswold, D; Navidad, J; Igwilo, C; Masson-Meyers, D; Azenabor, A A

    2015-02-01

    Since early detection of pathogens and their virulence factors contributes to intervention and control strategies, we assessed the enteropathogens in diarrhoeal disease, investigated the link between toxigenic strains of Escherichia coli from stool and drinking-water sources, and determined the expression of toxin genes by antibiotic-resistant E. coli in Lagos, Nigeria. This was compared with isolates from diarrhoeal stool and water from Wisconsin, USA. The new Luminex xTAG GPP (Gastroplex) technique and conventional real-time PCR were used to profile enteric pathogens and E. coli toxin gene isolates, respectively. Results showed the pathogen profile of stool and indicated a relationship between E. coli toxin genes in water and stool from Lagos which was absent in Wisconsin isolates. The Gastroplex technique was efficient for multiple enteric pathogens and toxin gene detection. The co-existence of antibiotic resistance with enteroinvasive E. coli toxin genes suggests an additional prognostic burden on patients.

  4. Chum salmon egg extracts induce upregulation of collagen type I and exert antioxidative effects on human dermal fibroblast cultures

    PubMed Central

    Yoshino, Atsushi; Polouliakh, Natalia; Meguro, Akira; Takeuchi, Masaki; Kawagoe, Tatsukata; Mizuki, Nobuhisa

    2016-01-01

    Components of fish roe possess antioxidant and antiaging activities, making them potentially very beneficial natural resources. Here, we investigated chum salmon eggs (CSEs) as a source of active ingredients, including vitamins, unsaturated fatty acids, and proteins. We incubated human dermal fibroblast cultures for 48 hours with high and low concentrations of CSE extracts and analyzed changes in gene expression. Cells treated with CSE extract showed concentration-dependent upregulation of collagen type I genes and of multiple antioxidative genes, including OXR1, TXNRD1, and PRDX family genes. We further conducted in silico phylogenetic footprinting analysis of promoter regions. These results suggested that transcription factors such as acute myeloid leukemia-1a and cyclic adenosine monophosphate response element-binding protein may be involved in the observed upregulation of antioxidative genes. Our results support the idea that CSEs are strong candidate sources of antioxidant materials and cosmeceutically effective ingredients. PMID:27621603

  5. Seeking unique and common biological themes in multiple gene lists or datasets: pathway pattern extraction pipeline for pathway-level comparative analysis.

    PubMed

    Yi, Ming; Mudunuri, Uma; Che, Anney; Stephens, Robert M

    2009-06-29

    One of the challenges in the analysis of microarray data is to integrate and compare the selected (e.g., differential) gene lists from multiple experiments for common or unique underlying biological themes. A common way to approach this problem is to extract common genes from these gene lists and then subject these genes to enrichment analysis to reveal the underlying biology. However, the capacity of this approach is largely restricted by the limited number of common genes shared by datasets from multiple experiments, which could be caused by the complexity of the biological system itself. We now introduce a new Pathway Pattern Extraction Pipeline (PPEP), which extends the existing WPS application by providing a new pathway-level comparative analysis scheme. To facilitate comparing and correlating results from different studies and sources, PPEP contains new interfaces that allow evaluation of the pathway-level enrichment patterns across multiple gene lists. As an exploratory tool, this analysis pipeline may help reveal the underlying biological themes at both the pathway and gene levels. The analysis scheme provided by PPEP begins with multiple gene lists, which may be derived from different studies in terms of the biological contexts, applied technologies, or methodologies. These lists are then subjected to pathway-level comparative analysis for extraction of pathway-level patterns. This analysis pipeline helps to explore the commonality or uniqueness of these lists at the level of pathways or biological processes from different but relevant biological systems using a combination of statistical enrichment measurements, pathway-level pattern extraction, and graphical display of the relationships of genes and their associated pathways as Gene-Term Association Networks (GTANs) within the WPS platform. As a proof of concept, we have used the new method to analyze many datasets from our collaborators as well as some public microarray datasets. 
This tool provides a new pathway-level analysis scheme for integrative and comparative analysis of data derived from different but relevant systems. The tool is freely available as a Pathway Pattern Extraction Pipeline implemented in our existing software package WPS, which can be obtained at http://www.abcc.ncifcrf.gov/wps/wps_index.php.
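
The pathway-level comparison described in this abstract can be illustrated with a minimal sketch. It is not the PPEP or WPS implementation: the hypergeometric test, the function names, the toy gene identifiers and the background size of 100 genes are all assumptions made for illustration. The sketch shows how two gene lists that share only one gene can still point to the same enriched pathway.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n are in the pathway."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)

def enrichment_matrix(gene_lists, pathways, background_size):
    """Return {pathway: {list_name: p-value}} for every pathway/list pair."""
    matrix = {}
    for pw_name, pw_genes in pathways.items():
        row = {}
        for list_name, genes in gene_lists.items():
            overlap = len(pw_genes & genes)
            row[list_name] = hypergeom_sf(
                overlap, background_size, len(pw_genes), len(genes)
            )
        matrix[pw_name] = row
    return matrix

# Hypothetical toy data: the two lists share only gene g1.
pathways = {
    "apoptosis": {"g1", "g2", "g3", "g4", "g5"},
    "glycolysis": {"g6", "g7", "g8", "g9", "g10"},
}
gene_lists = {
    "study_A": {"g1", "g2", "g3", "g20"},
    "study_B": {"g1", "g4", "g5", "g30"},
}
matrix = enrichment_matrix(gene_lists, pathways, background_size=100)
for pw_name, row in matrix.items():
    print(pw_name, {name: round(p, 6) for name, p in row.items()})
```

Comparing the rows of the resulting matrix, rather than intersecting the gene lists directly, is the pathway-level view the abstract argues for.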

  6. Geographic setting influences Great Lakes beach microbiological water quality

    USGS Publications Warehouse

    Haack, Sheridan K.; Fogarty, Lisa R.; Stelzer, Erin A.; Fuller, Lori M.; Brennan, Angela K.; Isaacs, Natasha M.; Johnson, Heather E.

    2013-01-01

    Understanding of factors that influence Escherichia coli (EC) and enterococci (ENT) concentrations, pathogen occurrence, and microbial sources at Great Lakes beaches comes largely from individual beach studies. Using 12 representative beaches, we tested enrichment cultures from 273 beach water and 22 tributary samples for EC, ENT, and genes indicating the bacterial pathogens Shiga-toxin producing E. coli (STEC), Shigella spp., Salmonella spp, Campylobacter jejuni/coli, and methicillin-resistant Staphylococcus aureus, and 108–145 samples for Bacteroides human, ruminant, and gull source-marker genes. EC/ENT temporal patterns, general Bacteroides concentration, and pathogen types and occurrence were regionally consistent (up to 40 km), but beach catchment variables (drains/creeks, impervious surface, urban land cover) influenced exceedances of EC/ENT standards and detections of Salmonella and STEC. Pathogen detections were more numerous when the EC/ENT Beach Action Value (but not when the Geometric Mean and Statistical Threshold Value) was exceeded. EC, ENT, and pathogens were not necessarily influenced by the same variables. Multiple Bacteroides sources, varying by date, occurred at every beach. Study of multiple beaches in different geographic settings provided new insights on the contrasting influences of regional and local variables, and a broader-scale perspective, on significance of EC/ENT exceedances, bacterial sources, and pathogen occurrence.

  7. CARHTA GENE: multipopulation integrated genetic and radiation hybrid mapping.

    PubMed

    de Givry, Simon; Bouchez, Martin; Chabrier, Patrick; Milan, Denis; Schiex, Thomas

    2005-04-15

    CAR(H)(T)A GENE is an integrated genetic and radiation hybrid (RH) mapping tool which can deal with multiple populations, including mixtures of genetic and RH data. CAR(H)(T)A GENE performs multipoint maximum likelihood estimations with accelerated expectation-maximization algorithms for some pedigrees and has sophisticated algorithms for marker ordering. Dedicated heuristics for framework mapping are also included. CAR(H)(T)A GENE can be used as a C++ library, through a shell command and a graphical interface. The XML output for companion tools is integrated. The program is available free of charge from www.inra.fr/bia/T/CarthaGene for Linux, Windows and Solaris machines (with Open Source). Contact: tschiex@toulouse.inra.fr.

  8. CoNekT: an open-source framework for comparative genomic and transcriptomic network analyses.

    PubMed

    Proost, Sebastian; Mutwil, Marek

    2018-05-01

    The recent accumulation of gene expression data in the form of RNA sequencing creates unprecedented opportunities to study gene regulation and function. Furthermore, comparative analysis of the expression data from multiple species can elucidate which functional gene modules are conserved across species, allowing the study of the evolution of these modules. However, performing such comparative analyses on raw data is not feasible for many biologists. Here, we present CoNekT (Co-expression Network Toolkit), an open source web server, that contains user-friendly tools and interactive visualizations for comparative analyses of gene expression data and co-expression networks. These tools allow analysis and cross-species comparison of (i) gene expression profiles; (ii) co-expression networks; (iii) co-expressed clusters involved in specific biological processes; (iv) tissue-specific gene expression; and (v) expression profiles of gene families. To demonstrate these features, we constructed CoNekT-Plants for green alga, seed plants and flowering plants (Picea abies, Chlamydomonas reinhardtii, Vitis vinifera, Arabidopsis thaliana, Oryza sativa, Zea mays and Solanum lycopersicum) and thus provide a web-tool with the broadest available collection of plant phyla. CoNekT-Plants is freely available from http://conekt.plant.tools, while the CoNekT source code and documentation can be found at https://github.molgen.mpg.de/proost/CoNekT/.

  9. Transcriptional Profiling of the Iron Starvation Response in Bordetella pertussis Provides New Insights into Siderophore Utilization and Virulence Gene Expression

    PubMed Central

    Brickman, Timothy J.; Cummings, Craig A.; Liew, Sin-Yee; Relman, David A.; Armstrong, Sandra K.

    2011-01-01

    Serological studies of patients with pertussis and the identification of antigenic Bordetella pertussis proteins support the hypothesis that B. pertussis perceives an iron starvation cue and expresses multiple iron source utilization systems in its natural human host environment. Furthermore, previous studies using a murine respiratory tract infection model showed that several of these B. pertussis iron systems are required for colonization and persistence and are differentially expressed over the course of infection. The present study examined genome-wide changes in B. pertussis gene transcript abundance in response to iron starvation in vitro. In addition to known iron source utilization genes, we identified a previously uncharacterized iron-repressed cytoplasmic membrane transporter system, fbpABC, that is required for the utilization of multiple structurally distinct siderophores including alcaligin, enterobactin, ferrichrome, and desferrioxamine B. Expression of type III secretion system genes was also found to be upregulated during iron starvation in both B. pertussis strain Tohama I and Bordetella bronchiseptica strain RB50. In a survey of type III secretion system protein production by an assortment of B. pertussis laboratory-adapted and low-passage clinical isolate strains, iron limitation increased the production and secretion of the type III secretion system-specific translocation apparatus tip protein Bsp22 in all Bvg-proficient strains. These results indicate that iron starvation in the infected host is an important environmental cue influencing not only Bordetella iron transport gene expression but also the expression of other important virulence-associated genes. PMID:21742863

  10. SnipViz: a compact and lightweight web site widget for display and dissemination of multiple versions of gene and protein sequences.

    PubMed

    Jaschob, Daniel; Davis, Trisha N; Riffle, Michael

    2014-07-23

    As high throughput sequencing continues to grow more commonplace, the need to disseminate the resulting data via web applications continues to grow. Particularly, there is a need to disseminate multiple versions of related gene and protein sequences simultaneously--whether they represent alleles present in a single species, variations of the same gene among different strains, or homologs among separate species. Often this is accomplished by displaying all versions of the sequence at once in a manner that is not intuitive or space-efficient and does not facilitate human understanding of the data. Web-based applications needing to disseminate multiple versions of sequences would benefit from a drop-in module designed to effectively disseminate these data. SnipViz is a client-side software tool designed to disseminate multiple versions of related gene and protein sequences on web sites. SnipViz has a space-efficient, interactive, and dynamic interface for navigating, analyzing and visualizing sequence data. It is written using standard World Wide Web technologies (HTML, Javascript, and CSS) and is compatible with most web browsers. SnipViz is designed as a modular client-side web component and may be incorporated into virtually any web site and be implemented without any programming. SnipViz is a drop-in client-side module for web sites designed to efficiently visualize and disseminate gene and protein sequences. SnipViz is open source and is freely available at https://github.com/yeastrc/snipviz.

  11. Digital detection of multiple minority mutants and expression levels of multiple colorectal cancer-related genes using digital-PCR coupled with bead-array.

    PubMed

    Huang, Huan; Li, Shuo; Sun, Lizhou; Zhou, Guohua

    2015-01-01

    To simultaneously analyze mutations and expression levels of multiple genes on one detection platform, we proposed a method termed "multiplex ligation-dependent probe amplification-digital amplification coupled with hydrogel bead-array" (MLPA-DABA) and applied it to diagnose colorectal cancer (CRC). CRC cells and tissues were sampled to extract nucleic acid, perform MLPA with sequence-tagged probes, perform digital emulsion polymerase chain reaction (PCR), and produce a hydrogel bead-array to immobilize beads and form a single bead layer on the array. After hybridization with fluorescent probes, the number of colored beads, which reflects the abundance of expressed genes and the mutation rate, was counted for diagnosis. Only red or green beads occurred on the chips in the mixed samples, indicating the success of single-molecule PCR. When a one-source sample was analyzed using mixed MLPA probes, beads of only one color occurred, suggesting the high specificity of the method in analyzing CRC mutation and gene expression. In gene expression analysis of a CRC tissue from one CRC patient, the mutant percentage was 3.1%, and the expression levels of CRC-related genes were much higher than those of normal tissue. The highly sensitive MLPA-DABA succeeds in the relative quantification of mutations and gene expressions of exfoliated cells in stool samples of CRC patients on the same chip platform. MLPA-DABA coupled with hydrogel bead-array is a promising method in the non-invasive diagnosis of CRC.

  12. A fast and high performance multiple data integration algorithm for identifying human disease genes

    PubMed Central

    2015-01-01

    Background: Integrating multiple data sources is indispensable in improving disease gene identification. This is not only because disease genes associated with similar genetic diseases tend to lie close to each other in various biological networks, but also because gene-disease associations are complex. Although various algorithms have been proposed to identify disease genes, their prediction performance and computational time still need to be improved. Results: In this study, we propose a fast and high performance multiple data integration algorithm for identifying human disease genes. A posterior probability of each candidate gene associated with individual diseases is calculated by using a Bayesian analysis method and a binary logistic regression model. Two prior probability estimation strategies and two feature vector construction methods are developed to test the performance of the proposed algorithm. Conclusions: The proposed algorithm not only generates predictions with high AUC scores, but also runs very fast. When only a single PPI network is employed, the AUC score is 0.769 by using F2 as feature vectors. The average running time for each leave-one-out experiment is only around 1.5 seconds. When three biological networks are integrated, the AUC score using F3 as feature vectors increases to 0.830, and the average running time for each leave-one-out experiment is only about 12.54 seconds. It is better than many existing algorithms. PMID:26399620

  13. Use of Network Inference to Elucidate Common and Chemical-specific Effects on Steroidogenesis

    EPA Science Inventory

    Microarray data is a key source for modeling gene regulatory interactions. Regulatory network models based on multiple datasets are potentially more robust and can provide greater confidence. In this study, we used network modeling on microarray data generated by exposing the fat...

  14. Pathogenic diversity of Phytophthora sojae and breeding strategies to develop Phytophthora-resistant soybeans

    PubMed Central

    Sugimoto, Takuma; Kato, Masayasu; Yoshida, Shinya; Matsumoto, Isao; Kobayashi, Tamotsu; Kaga, Akito; Hajika, Makita; Yamamoto, Ryo; Watanabe, Kazuhiko; Aino, Masataka; Matoh, Toru; Walker, David R.; Biggs, Alan R.; Ishimoto, Masao

    2012-01-01

    Phytophthora stem and root rot, caused by Phytophthora sojae, is one of the most destructive diseases of soybean [Glycine max (L.) Merr.], and the incidence of this disease has been increasing in several soybean-producing areas around the world. This presents serious limitations for soybean production, with yield losses from 4 to 100%. The most effective method to reduce damage would be to grow Phytophthora-resistant soybean cultivars, and two types of host resistance have been described. Race-specific resistance conditioned by single dominant Rps (“resistance to Phytophthora sojae”) genes and quantitatively inherited partial resistance conferred by multiple genes could both provide protection from the pathogen. Molecular markers linked to Rps genes or quantitative trait loci (QTLs) underlying partial resistance have been identified on several molecular linkage groups corresponding to chromosomes. These markers can be used to screen for Phytophthora-resistant plants rapidly and efficiently, and to combine multiple resistance genes in the same background. This paper reviews what is currently known about pathogenic races of P. sojae in the USA and Japan, selection of sources of Rps genes or minor genes providing partial resistance, and the current state and future scope of breeding Phytophthora-resistant soybean cultivars. PMID:23136490

  15. Multiple environmental factors regulate the expression of the carbohydrate-selective OprB porin of Pseudomonas aeruginosa.

    PubMed

    Adewoye, L O; Worobec, E A

    1999-12-01

    In response to low extracellular glucose concentration, Pseudomonas aeruginosa induces the expression of the outer membrane carbohydrate-selective OprB porin. The promoter region of the oprB gene was cloned into a lacZ transcriptional fusion vector, and the construct was mobilized into P. aeruginosa OprB-deficient strain, WW100, to evaluate additional environmental factors that influence OprB porin gene expression. Growth temperature, pH of the growth medium, salicylate concentration, and carbohydrate source were found to differentially influence porin expression. This expression pattern was compared to those of whole-cell [14C]glucose uptake under conditions of high osmolarity, ionicity, variable pH, growth temperatures, and carbohydrate source. These studies revealed that the high-affinity glucose transport genes are down-regulated by salicylic acid, differentially regulated by pH and temperature, and are specifically responsive to exogenous glucose induction.

  16. SZDB: A Database for Schizophrenia Genetic Research

    PubMed Central

    Wu, Yong; Yao, Yong-Gang

    2017-01-01

    Schizophrenia (SZ) is a debilitating brain disorder with a complex genetic architecture. Genetic studies, especially recent genome-wide association studies (GWAS), have identified multiple variants (loci) conferring risk to SZ. However, how to efficiently extract meaningful biological information from bulk genetic findings of SZ remains a major challenge. There is a pressing need to integrate multiple layers of data from various sources, e.g., genetic findings from GWAS, copy number variations (CNVs), association and linkage studies, gene expression, protein–protein interaction (PPI), co-expression, expression quantitative trait loci (eQTL), and Encyclopedia of DNA Elements (ENCODE) data, to provide a comprehensive resource to facilitate the translation of genetic findings into SZ molecular diagnosis and mechanism study. Here we developed the SZDB database (http://www.szdb.org/), a comprehensive resource for SZ research. SZ genetic data, gene expression data, network-based data, brain eQTL data, and SNP function annotation information were systematically extracted, curated and deposited in SZDB. In-depth analyses and systematic integration were performed to identify top prioritized SZ genes and enriched pathways. Multiple types of data from various layers of SZ research were systematically integrated and deposited in SZDB. In-depth data analyses and integration identified top prioritized SZ genes and enriched pathways. We further showed that genes implicated in SZ are highly co-expressed in human brain and that proteins encoded by the prioritized SZ risk genes interact significantly. The user-friendly SZDB provides high-confidence candidate variants and genes for further functional characterization. More importantly, SZDB provides convenient online tools for data search and browsing, data integration, and customized data analyses. PMID:27451428

  17. Digital Detection of Multiple Minority Mutants and Expression Levels of Multiple Colorectal Cancer-Related Genes Using Digital-PCR Coupled with Bead-Array

    PubMed Central

    Huang, Huan; Li, Shuo; Sun, Lizhou; Zhou, Guohua

    2015-01-01

    To simultaneously analyze mutations and expression levels of multiple genes on one detection platform, we proposed a method termed “multiplex ligation-dependent probe amplification–digital amplification coupled with hydrogel bead-array” (MLPA–DABA) and applied it to diagnose colorectal cancer (CRC). CRC cells and tissues were sampled to extract nucleic acid, perform MLPA with sequence-tagged probes, perform digital emulsion polymerase chain reaction (PCR), and produce a hydrogel bead-array to immobilize beads and form a single bead layer on the array. After hybridization with fluorescent probes, the number of colored beads, which reflects the abundance of expressed genes and the mutation rate, was counted for diagnosis. Only red or green beads occurred on the chips in the mixed samples, indicating the success of single-molecule PCR. When a one-source sample was analyzed using mixed MLPA probes, beads of only one color occurred, suggesting the high specificity of the method in analyzing CRC mutation and gene expression. In gene expression analysis of a CRC tissue from one CRC patient, the mutant percentage was 3.1%, and the expression levels of CRC-related genes were much higher than those of normal tissue. The highly sensitive MLPA–DABA succeeds in the relative quantification of mutations and gene expressions of exfoliated cells in stool samples of CRC patients on the same chip platform. MLPA–DABA coupled with hydrogel bead-array is a promising method in the non-invasive diagnosis of CRC. PMID:25880764

  18. A two-step hierarchical hypothesis set testing framework, with applications to gene expression data on ordered categories

    PubMed Central

    2014-01-01

    Background: In complex large-scale experiments, in addition to simultaneously considering a large number of features, multiple hypotheses are often being tested for each feature. This leads to a problem of multi-dimensional multiple testing. For example, in gene expression studies over ordered categories (such as time-course or dose-response experiments), interest is often in testing differential expression across several categories for each gene. In this paper, we consider a framework for testing multiple sets of hypotheses, which can be applied to a wide range of problems. Results: We adopt the concept of the overall false discovery rate (OFDR) for controlling false discoveries on the hypothesis set level. Based on an existing procedure for identifying differentially expressed gene sets, we discuss a general two-step hierarchical hypothesis set testing procedure, which controls the overall false discovery rate under independence across hypothesis sets. In addition, we discuss the concept of the mixed-directional false discovery rate (mdFDR), and extend the general procedure to enable directional decisions for two-sided alternatives. We applied the framework to the case of microarray time-course/dose-response experiments, and proposed three procedures for testing differential expression and making multiple directional decisions for each gene. Simulation studies confirm the control of the OFDR and mdFDR by the proposed procedures under independence and positive correlations across genes. Simulation results also show that two of our new procedures achieve higher power than previous methods. Finally, the proposed methodology is applied to a microarray dose-response study, to identify 17β-estradiol-sensitive genes in breast cancer cells that are induced at low concentrations. Conclusions: The framework we discuss provides a platform for multiple testing procedures covering situations involving two (or potentially more) sources of multiplicity. The framework is easy to use and adaptable to various practical settings that frequently occur in large-scale experiments. Procedures generated from the framework are shown to maintain control of the OFDR and mdFDR, quantities that are especially relevant in the case of multiple hypothesis set testing. The procedures work well in both simulations and real datasets, and are shown to have better power than existing methods. PMID:24731138
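
A two-step hierarchical procedure of the general kind this abstract describes can be sketched as follows. This is a generic illustration, not the authors' exact procedures: the Bonferroni-combined screening p-value, the Bonferroni rule inside each selected set, the function names and the toy p-values are all assumptions. Step 1 screens hypothesis sets (here, genes) with Benjamini-Hochberg at level q. Step 2 tests the individual hypotheses inside each of the R selected sets at the reduced level R·q/m, where m is the number of sets.

```python
def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up: indices of p-values rejected at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k = rank  # largest rank meeting its threshold
    return set(order[:k])

def two_step(pvals_by_gene, q=0.05):
    """Return {gene: indices of hypotheses rejected within that gene}."""
    genes = list(pvals_by_gene)
    m = len(genes)
    # Step 1: one screening p-value per gene (assumed: Bonferroni combination).
    screen = [
        min(1.0, len(pvals_by_gene[g]) * min(pvals_by_gene[g])) for g in genes
    ]
    selected = bh_reject(screen, q)
    r = len(selected)
    # Step 2: within each selected gene, Bonferroni at the adjusted level r*q/m.
    level = r * q / m
    results = {}
    for idx in sorted(selected):
        ps = pvals_by_gene[genes[idx]]
        results[genes[idx]] = [j for j, p in enumerate(ps) if p <= level / len(ps)]
    return results

# Hypothetical p-values: three dose comparisons per gene.
example = {
    "geneA": [0.0001, 0.2, 0.0004],
    "geneB": [0.3, 0.6, 0.9],
    "geneC": [0.004, 0.5, 0.7],
    "geneD": [0.4, 0.8, 0.5],
}
print(two_step(example))
```

Tightening the within-set level by the factor R/m is what ties the second step to the outcome of the screening step.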

  19. The Evolution of Mobile DNAs: When Will Transposons Create Phylogenies That Look As If There Is a Master Gene?

    PubMed Central

    Brookfield, John F. Y.; Johnson, Louise J.

    2006-01-01

    Some families of mammalian interspersed repetitive DNA, such as the Alu SINE sequence, appear to have evolved by the serial replacement of one active sequence with another, consistent with there being a single source of transposition: the “master gene.” Alternative models, in which multiple source sequences are simultaneously active, have been called “transposon models.” Transposon models differ in the proportion of elements that are active and in whether inactivation occurs at the moment of transposition or later. Here we examine the predictions of various types of transposon model regarding the patterns of sequence variation expected at an equilibrium between transposition, inactivation, and deletion. Under the master gene model, all bifurcations in the true tree of elements occur in a single lineage. We show that this property will also hold approximately for transposon models in which most elements are inactive and where at least some of the inactivation events occur after transposition. Such tree shapes are therefore not conclusive evidence for a single source of transposition. PMID:16790583
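
The tree-shape property stated in this abstract, that all bifurcations occur in a single lineage, can be checked with a short sketch. The nested-tuple encoding and the function name are assumptions made for illustration, and the check concerns tree shape only. As the abstract itself notes, passing it is not conclusive evidence for a single source of transposition.

```python
def is_single_lineage(tree):
    """True if every bifurcation lies on one lineage (a 'caterpillar' tree).

    Leaves are strings; internal nodes are tuples of children (assumed encoding).
    """
    node = tree
    while isinstance(node, tuple):
        internal = [child for child in node if isinstance(child, tuple)]
        if len(internal) > 1:
            return False  # two sub-lineages both keep bifurcating
        node = internal[0] if internal else None
    return True

# Hypothetical trees: copies budding off one persistent lineage,
# versus two lineages that are active at the same time.
master_like = ((("a", "b"), "c"), "d")
multi_source = (("a", "b"), ("c", "d"))
print(is_single_lineage(master_like), is_single_lineage(multi_source))
```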

  20. Genetics of resistance in lettuce to races 1 and 2 of Verticillium dahliae from different host species

    USDA-ARS's Scientific Manuscript database

    Race 1 resistance against Verticillium dahliae in lettuce was originally shown in the cultivar La Brillante to be conditioned by a single dominant gene (Verticillium resistance 1, Vr1). Multiple, morphologically diverse sources of germplasm have been identified as resistant to race 1. In this study...

  1. Bi-level Multi-Source Learning for Heterogeneous Block-wise Missing Data

    PubMed Central

    Xiang, Shuo; Yuan, Lei; Fan, Wei; Wang, Yalin; Thompson, Paul M.; Ye, Jieping

    2013-01-01

    Bio-imaging technologies allow scientists to collect large amounts of high-dimensional data from multiple heterogeneous sources for many biomedical applications. In the study of Alzheimer's Disease (AD), neuroimaging data, gene/protein expression data, etc., are often analyzed together to improve predictive power. Joint learning from multiple complementary data sources is advantageous, but feature-pruning and data source selection are critical to learn interpretable models from high-dimensional data. Often, the data collected has block-wise missing entries. In the Alzheimer’s Disease Neuroimaging Initiative (ADNI), most subjects have MRI and genetic information, but only half have cerebrospinal fluid (CSF) measures, a different half has FDG-PET; only some have proteomic data. Here we propose how to effectively integrate information from multiple heterogeneous data sources when data is block-wise missing. We present a unified “bi-level” learning model for complete multi-source data, and extend it to incomplete data. Our major contributions are: (1) our proposed models unify feature-level and source-level analysis, including several existing feature learning approaches as special cases; (2) the model for incomplete data avoids imputing missing data and offers superior performance; it generalizes to other applications with block-wise missing data sources; (3) we present efficient optimization algorithms for modeling complete and incomplete data. We comprehensively evaluate the proposed models including all ADNI subjects with at least one of four data types at baseline: MRI, FDG-PET, CSF and proteomics. Our proposed models compare favorably with existing approaches. PMID:23988272

  2. Bi-level multi-source learning for heterogeneous block-wise missing data.

    PubMed

    Xiang, Shuo; Yuan, Lei; Fan, Wei; Wang, Yalin; Thompson, Paul M; Ye, Jieping

    2014-11-15

    Bio-imaging technologies allow scientists to collect large amounts of high-dimensional data from multiple heterogeneous sources for many biomedical applications. In the study of Alzheimer's Disease (AD), neuroimaging data, gene/protein expression data, etc., are often analyzed together to improve predictive power. Joint learning from multiple complementary data sources is advantageous, but feature-pruning and data source selection are critical to learn interpretable models from high-dimensional data. Often, the data collected has block-wise missing entries. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), most subjects have MRI and genetic information, but only half have cerebrospinal fluid (CSF) measures, a different half has FDG-PET; only some have proteomic data. Here we propose how to effectively integrate information from multiple heterogeneous data sources when data is block-wise missing. We present a unified "bi-level" learning model for complete multi-source data, and extend it to incomplete data. Our major contributions are: (1) our proposed models unify feature-level and source-level analysis, including several existing feature learning approaches as special cases; (2) the model for incomplete data avoids imputing missing data and offers superior performance; it generalizes to other applications with block-wise missing data sources; (3) we present efficient optimization algorithms for modeling complete and incomplete data. We comprehensively evaluate the proposed models including all ADNI subjects with at least one of four data types at baseline: MRI, FDG-PET, CSF and proteomics. Our proposed models compare favorably with existing approaches.

  3. Tetramer-organizing polyproline-rich peptides differ in CHO cell-expressed and plasma-derived human butyrylcholinesterase tetramers.

    PubMed

    Schopfer, Lawrence M; Lockridge, Oksana

    2016-06-01

    Tetrameric butyrylcholinesterase (BChE) in human plasma is the product of multiple genes, namely one BCHE gene on chromosome 3q26.1 and multiple genes that encode polyproline-rich peptides. The function of the polyproline-rich peptides is to assemble BChE into tetramers. CHO cells transfected with human BChE cDNA express BChE monomers and dimers, but only low quantities of tetramers. Our goal was to identify the polyproline-rich peptides in CHO-cell derived human BChE tetramers. CHO cell-produced human BChE tetramers were purified from serum-free culture medium. Peptides embedded in the tetramerization domain were released from BChE tetramers by boiling and identified by liquid chromatography-tandem mass spectrometry. A total of 270 proline-rich peptides were sequenced, ranging in size from 6-41 residues. The peptides originated from 60 different proteins that reside in multiple cell compartments including the nucleus, cytoplasm, and endoplasmic reticulum. No single protein was the source of the polyproline-rich peptides in CHO cell-expressed human BChE tetramers. In contrast, 70% of the tetramer-organizing peptides in plasma-derived BChE tetramers originate from lamellipodin. No protein source was identified for polyproline peptides containing up to 41 consecutive proline residues. In conclusion, the use of polyproline-rich peptides as a tetramerization motif is documented only for the cholinesterases, but is expected to serve other tetrameric proteins as well. The CHO cell data suggest that the BChE tetramer-organizing peptide can arise from a variety of proteins. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Alternative splicing and the evolution of phenotypic novelty.

    PubMed

    Bush, Stephen J; Chen, Lu; Tovar-Corona, Jaime M; Urrutia, Araxi O

    2017-02-05

    Alternative splicing, a mechanism of post-transcriptional RNA processing whereby a single gene can encode multiple distinct transcripts, has been proposed to underlie morphological innovations in multicellular organisms. Genes with developmental functions are enriched for alternative splicing events, suggestive of a contribution of alternative splicing to developmental programmes. The role of alternative splicing as a source of transcript diversification has previously been compared to that of gene duplication, with the relationship between the two extensively explored. Alternative splicing is reduced following gene duplication with the retention of duplicate copies higher for genes which were alternatively spliced prior to duplication. Furthermore, and unlike the case for overall gene number, the proportion of alternatively spliced genes has also increased in line with the evolutionary diversification of cell types, suggesting alternative splicing may contribute to the complexity of developmental programmes. Together these observations suggest a prominent role for alternative splicing as a source of functional innovation. However, it is unknown whether the proliferation of alternative splicing events indeed reflects a functional expansion of the transcriptome or instead results from weaker selection acting on larger species, which tend to have a higher number of cell types and lower population sizes. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'. © 2016 The Author(s).

  5. Alternative splicing and the evolution of phenotypic novelty

    PubMed Central

    Bush, Stephen J.; Chen, Lu; Tovar-Corona, Jaime M.

    2017-01-01

    Alternative splicing, a mechanism of post-transcriptional RNA processing whereby a single gene can encode multiple distinct transcripts, has been proposed to underlie morphological innovations in multicellular organisms. Genes with developmental functions are enriched for alternative splicing events, suggestive of a contribution of alternative splicing to developmental programmes. The role of alternative splicing as a source of transcript diversification has previously been compared to that of gene duplication, with the relationship between the two extensively explored. Alternative splicing is reduced following gene duplication with the retention of duplicate copies higher for genes which were alternatively spliced prior to duplication. Furthermore, and unlike the case for overall gene number, the proportion of alternatively spliced genes has also increased in line with the evolutionary diversification of cell types, suggesting alternative splicing may contribute to the complexity of developmental programmes. Together these observations suggest a prominent role for alternative splicing as a source of functional innovation. However, it is unknown whether the proliferation of alternative splicing events indeed reflects a functional expansion of the transcriptome or instead results from weaker selection acting on larger species, which tend to have a higher number of cell types and lower population sizes. This article is part of the themed issue ‘Evo-devo in the genomics era, and the origins of morphological diversity’. PMID:27994117

  6. Expression Atlas: gene and protein expression across multiple studies and organisms

    PubMed Central

    Tang, Y Amy; Bazant, Wojciech; Burke, Melissa; Fuentes, Alfonso Muñoz-Pomer; George, Nancy; Koskinen, Satu; Mohammed, Suhaib; Geniza, Matthew; Preece, Justin; Jarnuczak, Andrew F; Huber, Wolfgang; Stegle, Oliver; Brazma, Alvis; Petryszak, Robert

    2018-01-01

    Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions. PMID:29165655

  7. CoPub: a literature-based keyword enrichment tool for microarray data analysis.

    PubMed

    Frijters, Raoul; Heupers, Bart; van Beek, Pieter; Bouwhuis, Maurice; van Schaik, René; de Vlieg, Jacob; Polman, Jan; Alkema, Wynand

    2008-07-01

    Medline is a rich information source, from which links between genes and keywords describing biological processes, pathways, drugs, pathologies and diseases can be extracted. We developed a publicly available tool called CoPub that uses the information in the Medline database for the biological interpretation of microarray data. CoPub allows batch input of multiple human, mouse or rat genes and produces lists of keywords from several biomedical thesauri that are significantly correlated with the set of input genes. These lists link to Medline abstracts in which the co-occurring input genes and correlated keywords are highlighted. Furthermore, CoPub can graphically visualize differentially expressed genes and over-represented keywords in a network, providing detailed insight in the relationships between genes and keywords, and revealing the most influential genes as highly connected hubs. CoPub is freely accessible at http://services.nbic.nl/cgi-bin/copub/CoPub.pl.

  8. Culture adaptation of malaria parasites selects for convergent loss-of-function mutants.

    PubMed

    Claessens, Antoine; Affara, Muna; Assefa, Samuel A; Kwiatkowski, Dominic P; Conway, David J

    2017-01-24

    Cultured human pathogens may differ significantly from source populations. To investigate the genetic basis of laboratory adaptation in malaria parasites, clinical Plasmodium falciparum isolates were sampled from patients and cultured in vitro for up to three months. Genome sequence analysis was performed on multiple culture time point samples from six monoclonal isolates, and single nucleotide polymorphism (SNP) variants emerging over time were detected. Out of a total of five positively selected SNPs, four represented nonsense mutations resulting in stop codons, three of these in a single ApiAP2 transcription factor gene, and one in SRPK1. To survey further for nonsense mutants associated with culture, genome sequences of eleven long-term laboratory-adapted parasite strains were examined, revealing four independently acquired nonsense mutations in two other ApiAP2 genes, and five in Epac. No mutants of these genes exist in a large database of parasite sequences from uncultured clinical samples. This implicates putative master regulator genes in which multiple independent stop codon mutations have convergently led to culture adaptation, affecting most laboratory lines of P. falciparum. Understanding the adaptive processes should guide development of experimental models, which could include targeted gene disruption to adapt fastidious malaria parasite species to culture.

  9. Wide distribution of O157-antigen biosynthesis gene clusters in Escherichia coli.

    PubMed

    Iguchi, Atsushi; Shirai, Hiroki; Seto, Kazuko; Ooka, Tadasuke; Ogura, Yoshitoshi; Hayashi, Tetsuya; Osawa, Kayo; Osawa, Ro

    2011-01-01

    Most Escherichia coli O157-serogroup strains are classified as enterohemorrhagic E. coli (EHEC), which is known as an important food-borne pathogen for humans. They usually produce Shiga toxin (Stx) 1 and/or Stx2, and express H7-flagella antigen (or nonmotile). However, O157 strains that do not produce Stxs and express H antigens different from H7 are sometimes isolated from clinical and other sources. Multilocus sequence analysis revealed that these 21 O157:non-H7 strains tested in this study belong to multiple evolutionary lineages different from that of EHEC O157:H7 strains, suggesting a wide distribution of the gene set encoding the O157-antigen biosynthesis in multiple lineages. To gain insight into the gene organization and the sequence similarity of the O157-antigen biosynthesis gene clusters, we conducted genomic comparisons of the chromosomal regions (about 59 kb in each strain) covering the O-antigen gene cluster and its flanking regions between six O157:H7/non-H7 strains. Gene organization of the O157-antigen gene cluster was identical among O157:H7/non-H7 strains, but was divided into two distinct types at the nucleotide sequence level. Interestingly, distribution of the two types did not clearly follow the evolutionary lineages of the strains, suggesting that horizontal gene transfer of both types of O157-antigen gene clusters has occurred independently among E. coli strains. Additionally, detailed sequence comparison revealed that some positions of the repetitive extragenic palindromic (REP) sequences in the regions flanking the O-antigen gene clusters were coincident with possible recombination points. From these results, we conclude that the horizontal transfer of the O157-antigen gene clusters induced the emergence of multiple O157 lineages within E. coli and speculate that REP sequences may involve one of the driving forces for exchange and evolution of O-antigen loci.

  10. Refinement of light-responsive transcript lists using rice oligonucleotide arrays: evaluation of gene-redundancy.

    PubMed

    Jung, Ki-Hong; Dardick, Christopher; Bartley, Laura E; Cao, Peijian; Phetsom, Jirapa; Canlas, Patrick; Seo, Young-Su; Shultz, Michael; Ouyang, Shu; Yuan, Qiaoping; Frank, Bryan C; Ly, Eugene; Zheng, Li; Jia, Yi; Hsia, An-Ping; An, Kyungsook; Chou, Hui-Hsien; Rocke, David; Lee, Geun Cheol; Schnable, Patrick S; An, Gynheung; Buell, C Robin; Ronald, Pamela C

    2008-10-06

    Studies of gene function are often hampered by gene-redundancy, especially in organisms with large genomes such as rice (Oryza sativa). We present an approach for using transcriptomics data to focus functional studies and address redundancy. To this end, we have constructed and validated an inexpensive and publicly available rice oligonucleotide near-whole genome array, called the rice NSF45K array. We generated expression profiles for light- vs. dark-grown rice leaf tissue and validated the biological significance of the data by analyzing sources of variation and confirming expression trends with reverse transcription polymerase chain reaction. We examined trends in the data by evaluating enrichment of gene ontology terms at multiple false discovery rate thresholds. To compare data generated with the NSF45K array with published results, we developed publicly available, web-based tools (www.ricearray.org). The Oligo and EST Anatomy Viewer enables visualization of EST-based expression profiling data for all genes on the array. The Rice Multi-platform Microarray Search Tool facilitates comparison of gene expression profiles across multiple rice microarray platforms. Finally, we incorporated gene expression and biochemical pathway data to reduce the number of candidate gene products putatively participating in the eight steps of the photorespiration pathway from 52 to 10, based on expression levels of putatively functionally redundant genes. We confirmed the efficacy of this method to cope with redundancy by correctly predicting participation in photorespiration of a gene with five paralogs. Applying these methods will accelerate rice functional genomics.
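
    Evaluating enrichment at multiple false discovery rate thresholds can be sketched with the Benjamini-Hochberg step-up procedure. This is a generic illustration, not the authors' pipeline: the abstract does not say which FDR method was used, and the term names and p-values below are hypothetical.

```python
def benjamini_hochberg(p_values, fdr):
    """Return indices of hypotheses rejected at the given FDR level."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose p-value satisfies p_(k) <= k/m * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])

# Hypothetical enrichment p-values for five GO terms (illustrative only).
go_p_values = {
    "term_A": 0.001,
    "term_B": 0.012,
    "term_C": 0.025,
    "term_D": 0.20,
    "term_E": 0.65,
}
terms = list(go_p_values)
pvals = [go_p_values[t] for t in terms]

# Repeat the test at several FDR thresholds to see how the enriched set changes.
enriched_by_threshold = {
    fdr: [terms[i] for i in benjamini_hochberg(pvals, fdr)]
    for fdr in (0.01, 0.05, 0.10)
}
```

    Comparing the lists stored under each threshold shows which terms are enriched only under a looser FDR and which remain under a strict one.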

  11. gsSKAT: Rapid gene set analysis and multiple testing correction for rare-variant association studies using weighted linear kernels.

    PubMed

    Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

    2017-05-01

    Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.
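
    The idea of an effective number of tests under correlated test statistics can be sketched as follows. The abstract does not give the gsSKAT estimators, so this example substitutes the well-known eigenvalue-variance estimator of Nyholt (2004); the correlation matrices and function names are hypothetical.

```python
def effective_number_of_tests(corr):
    """Nyholt-style estimate of the effective number of independent tests.

    corr is a symmetric M x M correlation matrix (list of lists) between
    test statistics. For a correlation matrix the eigenvalues have mean 1
    and their squares sum to the sum of squared entries, so the eigenvalue
    variance can be computed without an eigendecomposition.
    """
    m = len(corr)
    if m == 1:
        return 1.0
    sum_sq = sum(r * r for row in corr for r in row)
    eig_var = (sum_sq - m) / (m - 1)  # sample variance of the eigenvalues
    return 1.0 + (m - 1) * (1.0 - eig_var / m)

def fwer_threshold(alpha, corr):
    """Bonferroni-type per-test threshold using the effective test count."""
    return alpha / effective_number_of_tests(corr)

# Three gene sets with no overlap, complete overlap, and partial overlap
# (hypothetical correlations between their test statistics).
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
identical = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
overlapping = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
```

    Independent sets keep the full test count, identical sets collapse to a single test, and partially overlapping sets fall in between, which relaxes the per-test threshold relative to a plain Bonferroni correction.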

  12. A high resolution atlas of gene expression in the domestic sheep (Ovis aries)

    PubMed Central

    Farquhar, Iseabail L.; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G.; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C. Bruce; Freeman, Tom C.; Archibald, Alan L.; Hume, David A.

    2017-01-01

    Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of ‘guilt by association’ was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages. PMID:28915238

  13. A high resolution atlas of gene expression in the domestic sheep (Ovis aries).

    PubMed

    Clark, Emily L; Bush, Stephen J; McCulloch, Mary E B; Farquhar, Iseabail L; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G; Wu, Chunlei; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C Bruce; Freeman, Tom C; Summers, Kim M; Archibald, Alan L; Hume, David A

    2017-09-01

    Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of 'guilt by association' was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages.

  14. Microbial genotype-phenotype mapping by class association rule mining.

    PubMed

    Tamura, Makio; D'haeseleer, Patrik

    2008-07-01

    Microbial phenotypes are typically due to the concerted action of multiple gene functions, yet the presence of each gene may have only a weak correlation with the observed phenotype. Hence, it may be more appropriate to examine co-occurrence between sets of genes and a phenotype (multiple-to-one) instead of pairwise relations between a single gene and the phenotype. Here, we propose an efficient class association rule mining algorithm, netCAR, in order to extract sets of COGs (clusters of orthologous groups of proteins) associated with a phenotype from COG phylogenetic profiles and a phenotype profile. netCAR takes into account the phylogenetic co-occurrence graph between COGs to restrict hypothesis space, and uses mutual information to evaluate the biconditional relation. We examined the mining capability of pairwise and multiple-to-one association by using netCAR to extract COGs relevant to six microbial phenotypes (aerobic, anaerobic, facultative, endospore, motility and Gram negative) from 11,969 unique COG profiles across 155 prokaryotic organisms. With the same level of false discovery rate, multiple-to-one association can extract about 10 times more relevant COGs than one-to-one association. We also reveal various topologies of association networks among COGs (modules) from extracted multiple-to-one correlation rules relevant with the six phenotypes; including a well-connected network for motility, a star-shaped network for aerobic and intermediate topologies for the other phenotypes. netCAR outperforms a standard CAR mining algorithm, CARapriori, while requiring several orders of magnitude less computational time for extracting 3-COG sets. Source code of the Java implementation is available as Supplementary Material at the Bioinformatics online website, or upon request to the author. Supplementary data are available at Bioinformatics online.
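
    The contrast between one-to-one and multiple-to-one association can be sketched with mutual information on presence/absence profiles. This is not the netCAR implementation: combining COGs by logical AND is an assumption made here for illustration, and the profiles are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi

# Hypothetical presence (1) / absence (0) profiles across eight organisms.
cog_a = [1, 1, 1, 1, 0, 0, 0, 0]
cog_b = [1, 1, 0, 0, 1, 1, 0, 0]
phenotype = [1, 1, 0, 0, 0, 0, 0, 0]  # present only when both COGs are present

# Multiple-to-one: treat the pair of COGs as a single joint predictor.
both = [a & b for a, b in zip(cog_a, cog_b)]

mi_single = mutual_information(cog_a, phenotype)
mi_pair = mutual_information(both, phenotype)
```

    In this toy case each COG alone is only weakly informative about the phenotype, while the pair determines it completely, so `mi_pair` exceeds `mi_single`.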

  15. Pre-breeding for diversification of primary gene pool and genetic enhancement of grain legumes

    PubMed Central

    Sharma, Shivali; Upadhyaya, H. D.; Varshney, R. K.; Gowda, C. L. L.

    2013-01-01

    The narrow genetic base of cultivars coupled with low utilization of genetic resources are the major factors limiting grain legume production and productivity globally. Exploitation of new and diverse sources of variation is needed for the genetic enhancement of grain legumes. Wild relatives with enhanced levels of resistance/tolerance to multiple stresses provide important sources of genetic diversity for crop improvement. However, their exploitation for cultivar improvement is limited by cross-incompatibility barriers and linkage drags. Pre-breeding provides a unique opportunity, through the introgression of desirable genes from wild germplasm into genetic backgrounds readily used by the breeders with minimum linkage drag, to overcome this. Pre-breeding activities using promising landraces, wild relatives, and popular cultivars have been initiated at International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) to develop new gene pools in chickpea, pigeonpea, and groundnut with a high frequency of useful genes, wider adaptability, and a broad genetic base. The availability of molecular markers will greatly assist in reducing linkage drags and increasing the efficiency of introgression in pre-breeding programs. PMID:23970889

  16. Marker-assisted combination of major genes for pathogen resistance in potato.

    PubMed

    Gebhardt, C; Bellin, D; Henselewski, H; Lehmann, W; Schwarzfischer, J; Valkonen, J P T

    2006-05-01

    Closely linked PCR-based markers facilitate the tracing and combining of resistance factors that have been introgressed previously into cultivated potato from different sources. Crosses were performed to combine the Ry(adg) gene for extreme resistance to Potato virus Y (PVY) with the Gro1 gene for resistance to the root cyst nematode Globodera rostochiensis and the Rx1 gene for extreme resistance to Potato virus X (PVX), or with resistance to potato wart (Synchytrium endobioticum). Marker-assisted selection (MAS) using four PCR-based diagnostic assays was applied to 110 F1 hybrids resulting from four 2x by 4x cross-combinations. Thirty tetraploid plants having the appropriate marker combinations were selected and tested for presence of the corresponding resistance traits. All plants tested showed the expected resistant phenotype. Unexpectedly, the plants segregated for additional resistance to pathotypes 1, 2 and 6 of S. endobioticum, which was subsequently shown to be inherited from the PVY resistant parents of the crosses. The selected plants can be used as sources of multiple resistance traits in pedigree breeding and are available from a potato germplasm bank.

  17. Core clock, SUB1, and ABAR genes mediate flooding and drought responses via alternative splicing in soybean.

    PubMed

    Syed, Naeem H; Prince, Silvas J; Mutava, Raymond N; Patil, Gunvant; Li, Song; Chen, Wei; Babu, Valliyodan; Joshi, Trupti; Khan, Saad; Nguyen, Henry T

    2015-12-01

    Circadian clocks are a great evolutionary innovation and provide competitive advantage during the day/night cycle and under changing environmental conditions. The circadian clock mediates expression of a large proportion of genes in plants, achieving a harmonious relationship between energy metabolism, photosynthesis, and biotic and abiotic stress responses. Here it is shown that multiple paralogues of clock genes are present in soybean (Glycine max) and mediate flooding and drought responses. Differential expression of many clock and SUB1 genes was found under flooding and drought conditions. Furthermore, natural variation in the amplitude and phase shifts in PRR7 and TOC1 genes was also discovered under drought and flooding conditions, respectively. PRR3 exhibited flooding- and drought-specific splicing patterns and may work in concert with PRR7 and TOC1 to achieve energy homeostasis under flooding and drought conditions. Higher expression of TOC1 also coincides with elevated levels of abscisic acid (ABA) and variation in glucose levels in the morning and afternoon, indicating that this response to abiotic stress is mediated by ABA, endogenous sugar levels, and the circadian clock to fine-tune photosynthesis and energy utilization under stress conditions. It is proposed that the presence of multiple clock gene paralogues with variation in DNA sequence, phase, and period could be used to screen exotic germplasm to find sources for drought and flooding tolerance. Furthermore, fine tuning of multiple clock gene paralogues (via a genetic engineering approach) should also facilitate the development of flooding- and drought-tolerant soybean varieties. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  18. Gene expression profiling in multiple myeloma--reporting of entities, risk, and targets in clinical routine.

    PubMed

    Meissner, Tobias; Seckinger, Anja; Rème, Thierry; Hielscher, Thomas; Möhler, Thomas; Neben, Kai; Goldschmidt, Hartmut; Klein, Bernard; Hose, Dirk

    2011-12-01

    Multiple myeloma is an incurable malignant plasma cell disease characterized by survival ranging from several months to more than 15 years. Assessment of risk and underlying molecular heterogeneity can be excellently done by gene expression profiling (GEP), but its way into clinical routine is hampered by the lack of an appropriate reporting tool and the integration with other prognostic factors into a single "meta" risk stratification. The GEP-report (GEP-R) was built as an open-source software developed in R for gene expression reporting in clinical practice using Affymetrix microarrays. GEP-R processes new samples by applying a documentation-by-value strategy to the raw data to be able to assign thresholds and grouping algorithms defined on a reference cohort of 262 patients with multiple myeloma. Furthermore, we integrated expression-based and conventional prognostic factors within one risk stratification (HM-metascore). The GEP-R comprises (i) quality control, (ii) sample identity control, (iii) biologic classification, (iv) risk stratification, and (v) assessment of target genes. The resulting HM-metascore is defined as the sum over the weighted factors gene expression-based risk-assessment (UAMS-, IFM-score), proliferation, International Staging System (ISS) stage, t(4;14), and expression of prognostic target genes (AURKA, IGF1R) for which clinical grade inhibitors exist. The HM-score delineates three significantly different groups of 13.1%, 72.1%, and 14.7% of patients with a 6-year survival rate of 89.3%, 60.6%, and 18.6%, respectively. GEP reporting allows prospective assessment of risk and target gene expression and integration of current prognostic factors in clinical routine, being customizable about novel parameters or other cancer entities. ©2011 AACR.

  19. Infection by Rhodococcus fascians maintains cotyledons as a sink tissue for the pathogen

    PubMed Central

    Dhandapani, Pragatheswari; Song, Jiancheng; Novak, Ondrej

    2017-01-01

    Background and Aims: Pisum sativum L. (pea) seed is a source of carbohydrate and protein for the developing plant. By studying pea seeds inoculated by the cytokinin-producing bacterium, Rhodococcus fascians, we sought to determine the impact of both an epiphytic (avirulent) strain and a pathogenic strain on source–sink activity within the cotyledons during and following germination. Methods: Bacterial spread was monitored microscopically, and real-time reverse transcription–quantitative PCR was used to determine the expression of cytokinin biosynthesis, degradation and response regulator gene family members, along with expression of family members of SWEET, SUT, CWINV and AAP genes – gene families identified initially in pea by transcriptomic analysis. The endogenous cytokinin content was also determined. Key Results: The cotyledons infected by the virulent strain remained intact and turned green, while multiple shoots were formed and root growth was reduced. The epiphytic strain had no such marked impact. Isopentenyl adenine was elevated in the cotyledons infected by the virulent strain. Strong expression of RfIPT, RfLOG and RfCKX was detected in the cotyledons infected by the virulent strain throughout the experiment, with elevated expression also observed for PsSWEET, PsSUT and PsINV gene family members. The epiphytic strain had some impact on the expression of these genes, especially at the later stages of reserve mobilization from the cotyledons. Conclusions: The pathogenic strain retained the cotyledons as a sink tissue for the pathogen rather than the cotyledon converting completely to a source tissue for the germinating plant. We suggest that the interaction of cytokinins, CWINVs and SWEETs may lead to the loss of apical dominance and the appearance of multiple shoots. PMID:27864224

  20. CRISPR/Cas9-mediated heterozygous knockout of the autism gene CHD8 and characterization of its transcriptional networks in neurodevelopment.

    PubMed

    Wang, Ping; Lin, Mingyan; Pedrosa, Erika; Hrabovsky, Anastasia; Zhang, Zheng; Guo, Wenjun; Lachman, Herbert M; Zheng, Deyou

    2015-01-01

    Disruptive mutation in the CHD8 gene is one of the top genetic risk factors in autism spectrum disorders (ASDs). Previous analyses of genome-wide CHD8 occupancy and reduced expression of CHD8 by shRNA knockdown in committed neural cells showed that CHD8 regulates multiple cell processes critical for neural functions, and its targets are enriched with ASD-associated genes. To further understand the molecular links between CHD8 functions and ASD, we have applied the CRISPR/Cas9 technology to knock out one copy of CHD8 in induced pluripotent stem cells (iPSCs) to better mimic the loss-of-function status that would exist in the developing human embryo prior to neuronal differentiation. We then carried out transcriptomic and bioinformatic analyses of neural progenitors and neurons derived from the CHD8 mutant iPSCs. Transcriptome profiling revealed that CHD8 hemizygosity (CHD8 (+/-)) affected the expression of several thousand genes in neural progenitors and early differentiating neurons. The differentially expressed genes were enriched for functions of neural development, β-catenin/Wnt signaling, extracellular matrix, and skeletal system development. They also exhibited significant overlap with genes previously associated with autism and schizophrenia, as well as the downstream transcriptional targets of multiple genes implicated in autism. Providing important insight into how CHD8 mutations might give rise to macrocephaly, we found that seven of the twelve genes associated with human brain volume or head size by genome-wide association studies (e.g., HMGA2) were dysregulated in CHD8 (+/-) neural progenitors or neurons. We have established a renewable source of CHD8 (+/-) iPSC lines that would be valuable for investigating the molecular and cellular functions of CHD8. Transcriptomic profiling showed that CHD8 regulates multiple genes implicated in ASD pathogenesis and genes associated with brain volume.

  1. Gene order in rosid phylogeny, inferred from pairwise syntenies among extant genomes

    PubMed Central

    2012-01-01

    Background Ancestral gene order reconstruction for flowering plants has lagged behind developments in yeasts, insects and higher animals, because of the recency of widespread plant genome sequencing, sequencers' embargoes on public data use, paralogies due to whole genome duplication (WGD) and fractionation of undeleted duplicates, extensive paralogy from other sources, and the computational cost of existing methods. Results We address these problems, using the gene order of four core eudicot genomes (cacao, castor bean, papaya and grapevine) that have escaped any recent WGD events, and two others (poplar and cucumber) that descend from independent WGDs, in inferring the ancestral gene order of the rosid clade and those of its main subgroups, the fabids and malvids. We improve and adapt techniques including the OMG method for extracting large, paralogy-free, multiple orthologies from conflated pairwise synteny data among the six genomes and the PATHGROUPS approach for ancestral gene order reconstruction in a given phylogeny, where some genomes may be descendants of WGD events. We use the gene order evidence to evaluate the hypothesis that the order Malpighiales belongs to the malvids rather than as traditionally assigned to the fabids. Conclusions Gene orders of ancestral eudicot species, involving 10,000 or more genes can be reconstructed in an efficient, parsimonious and consistent way, despite paralogies due to WGD and other processes. Pairwise genomic syntenies provide appropriate input to a parameter-free procedure of multiple ortholog identification followed by gene-order reconstruction in solving instances of the "small phylogeny" problem. PMID:22759433

  2. Identification of blood meal sources of Lutzomyia longipalpis using polymerase chain reaction-restriction fragment length polymorphism analysis of the cytochrome B gene

    PubMed Central

    Soares, Vítor Yamashiro Rocha; da Silva, Jailthon Carlos; da Silva, Kleverton Ribeiro; Cruz, Maria do Socorro Pires e; Santos, Marcos Pérsio Dantas; Ribolla, Paulo Eduardo Martins; Alonso, Diego Peres; Coelho, Luiz Felipe Leomil; Costa, Dorcas Lamounier; Costa, Carlos Henrique Nery

    2014-01-01

    An analysis of the dietary content of haematophagous insects can provide important information about the transmission networks of certain zoonoses. The present study evaluated the potential of polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis of the mitochondrial cytochrome B (cytb) gene to differentiate between vertebrate species that were identified as possible sources of sandfly meals. The complete cytb gene sequences of 11 vertebrate species available in the National Center for Biotechnology Information database were digested with Aci I, Alu I, Hae III and Rsa I restriction enzymes in silico using Restriction Mapper software. The cytb gene fragment (358 bp) was amplified from tissue samples of vertebrate species and the dietary contents of sandflies and digested with restriction enzymes. Vertebrate species presented a restriction fragment profile that differed from that of other species, with the exception of Canis familiaris and Cerdocyon thous. The 358 bp fragment was identified in 76 sandflies. Of these, 10 were evaluated using the restriction enzymes and the food sources were predicted for four: Homo sapiens (1), Bos taurus (1) and Equus caballus (2). Thus, the PCR-RFLP technique could be a potential method for identifying the food sources of arthropods. However, some points must be clarified regarding the applicability of the method, such as the extent of DNA degradation through intestinal digestion, the potential for multiple sources of blood meals and the need for greater knowledge regarding intraspecific variations in mtDNA. PMID:24821056
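    The in silico digestion step described in this abstract can be illustrated with a minimal Python sketch. It is not the Restriction Mapper software used by the authors. It assumes a linear sequence and a complete digest, and it covers only the three palindromic, blunt-cutting enzymes (Alu I, Hae III, Rsa I); Aci I is left out because its recognition site is not palindromic and would need both strands searched. The toy sequences are hypothetical and are not cytb data.

```python
# Illustrative in silico restriction digest (a sketch, not Restriction Mapper).
# Each entry maps an enzyme to (recognition site, cut offset within the site).
ENZYMES = {
    "AluI": ("AGCT", 2),    # AG^CT
    "HaeIII": ("GGCC", 2),  # GG^CC
    "RsaI": ("GTAC", 2),    # GT^AC
}


def cut_positions(seq, site, offset):
    """Return the 0-based positions at which the enzyme cuts seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzyme):
    """Fragment lengths (longest first) after a complete digest of a linear sequence."""
    site, offset = ENZYMES[enzyme]
    cuts = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return sorted((b - a for a, b in zip(cuts, cuts[1:])), reverse=True)


def profile(seq):
    """Restriction fragment profile of seq across all enzymes in ENZYMES."""
    return {name: digest(seq, name) for name in ENZYMES}


# Hypothetical toy sequences: toy_b lacks the AluI site present in toy_a.
toy_a = "ATGAGCTTTGGCCAAGTACTT"
toy_b = "ATGAGATTTGGCCAAGTACTT"

if __name__ == "__main__":
    print(profile(toy_a))
    print(profile(toy_b))
```

    Two species can be told apart by this approach only if their profiles differ for at least one enzyme, which is why the abstract reports Canis familiaris and Cerdocyon thous as an exception.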

  3. Neurocarta: aggregating and sharing disease-gene relations for the neurosciences.

    PubMed

    Portales-Casamar, Elodie; Ch'ng, Carolyn; Lui, Frances; St-Georges, Nicolas; Zoubarev, Anton; Lai, Artemis Y; Lee, Mark; Kwok, Cathy; Kwok, Willie; Tseng, Luchia; Pavlidis, Paul

    2013-02-26

    Understanding the genetic basis of diseases is key to the development of better diagnoses and treatments. Unfortunately, only a small fraction of the existing data linking genes to phenotypes is available through online public resources and, when available, it is scattered across multiple access tools. Neurocarta is a knowledgebase that consolidates information on genes and phenotypes across multiple resources and allows tracking and exploring of the associations. The system enables automatic and manual curation of evidence supporting each association, as well as user-enabled entry of their own annotations. Phenotypes are recorded using controlled vocabularies such as the Disease Ontology to facilitate computational inference and linking to external data sources. The gene-to-phenotype associations are filtered by stringent criteria to focus on the annotations most likely to be relevant. Neurocarta is constantly growing and currently holds more than 30,000 lines of evidence linking over 7,000 genes to 2,000 different phenotypes. Neurocarta is a one-stop shop for researchers looking for candidate genes for any disorder of interest. In Neurocarta, they can review the evidence linking genes to phenotypes and filter out the evidence they're not interested in. In addition, researchers can enter their own annotations from their experiments and analyze them in the context of existing public annotations. Neurocarta's in-depth annotation of neurodevelopmental disorders makes it a unique resource for neuroscientists working on brain development.

  4. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity

    PubMed Central

    Wang, Yupeng; Tang, Haibao; DeBarry, Jeremy D.; Tan, Xu; Li, Jingping; Wang, Xiyin; Lee, Tae-ho; Jin, Huizhe; Marler, Barry; Guo, Hui; Kissinger, Jessica C.; Paterson, Andrew H.

    2012-01-01

    MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/. PMID:22217600

  5. Identification and sequence analyses of novel lipase encoding novel thermophillic bacilli isolated from Armenian geothermal springs.

    PubMed

    Shahinyan, Grigor; Margaryan, Armine; Panosyan, Hovik; Trchounian, Armen

    2017-05-02

    Among the huge diversity of thermophilic bacteria, mainly bacilli have been reported as active thermostable lipase producers. Geothermal springs serve as the main source for isolation of thermostable lipase-producing bacilli. Thermostable lipolytic enzymes, functioning in harsh conditions, have promising applications in processing of organic chemicals, detergent formulation, synthesis of biosurfactants, pharmaceutical processing etc. In order to study the distribution of lipase-producing thermophilic bacilli and their specific lipase protein primary structures, three lipase producers from different genera were isolated from mesothermal (27.5-70 °C) springs distributed on the territory of Armenia and Nagorno Karabakh. Based on phenotypic characteristics and 16S rRNA gene sequencing, the isolates were identified as Geobacillus sp., Bacillus licheniformis and Anoxibacillus flavithermus strains. The lipase genes of the isolates were sequenced using initially designed primer sets. Multiple alignments generated from primary structures of the lipase proteins and annotated lipase protein sequences, conserved regions analysis and amino acid composition have illustrated the similarity (98-99%) of the lipases with true lipases (family I) and the GDSL esterase family (family II). A conserved sequence block that determines the thermostability has been identified in the multiple alignments of the lipase proteins. The results shed light on the distribution of lipase-producing bacilli in geothermal springs in Armenia and Nagorno Karabakh. Newly isolated bacilli strains could be a prospective source of thermostable lipases and their genes.

  6. Biomine: predicting links between biological entities using network models of heterogeneous databases.

    PubMed

    Eronen, Lauri; Toivonen, Hannu

    2012-06-06

    Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable conditions, Biomine can also perform well when no such information is available. The Biomine system is a proof of concept. Its current version contains 1.1 million entities and 8.1 million relations between them, with focus on human genetics. Some of its functionalities are available in a public query interface at http://biomine.cs.helsinki.fi, allowing searching for and visualizing connections between given biological entities.
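    The idea of ranking candidate links by a proximity measure on a weighted graph can be sketched as follows. The abstract does not say which measures were used, so this example assumes one plausible choice: the probability of the best path, taken as the largest product of edge weights along any path between two nodes. The graph, node names and weights are hypothetical, and the function is not part of the Biomine system.

```python
import heapq
import math


def best_path_probability(graph, source, target):
    """Largest product of edge weights over any path from source to target.

    graph maps each node to a list of (neighbor, weight) pairs with
    0 < weight <= 1. Returns 0.0 if target is unreachable.
    """
    # Dijkstra on costs -log(weight): the cheapest path has the largest product.
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return math.exp(-cost)
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost - math.log(weight)
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return 0.0


def undirected(edges):
    """Build an adjacency dict from (node_a, node_b, weight) triples."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))
    return graph


# Hypothetical toy graph mixing edge types (interaction, annotation, association).
toy_graph = undirected([
    ("geneA", "proteinX", 0.9),
    ("proteinX", "diseaseD", 0.5),
    ("geneA", "goTerm", 0.4),
    ("goTerm", "diseaseD", 0.8),
    ("geneB", "goTerm", 0.3),
])

if __name__ == "__main__":
    # Rank candidate genes for diseaseD by proximity, highest first.
    scores = {g: best_path_probability(toy_graph, g, "diseaseD") for g in ("geneA", "geneB")}
    for gene in sorted(scores, key=scores.get, reverse=True):
        print(gene, round(scores[gene], 3))
```

    Weighting edge types differently, as the abstract describes, would amount to rescaling the weights per edge type before running the search.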

  7. Integron-Associated DfrB4, a Previously Uncharacterized Member of the Trimethoprim-Resistant Dihydrofolate Reductase B Family, Is a Clinically Identified Emergent Source of Antibiotic Resistance.

    PubMed

    Toulouse, Jacynthe L; Edens, Thaddeus J; Alejaldre, Lorea; Manges, Amee R; Pelletier, Joelle N

    2017-05-01

    Whole-genome sequencing of trimethoprim-resistant Escherichia coli clinical isolates identified a member of the trimethoprim-resistant type II dihydrofolate reductase gene family (dfrB). The dfrB4 gene was located within a class I integron flanked by multiple resistance genes. This arrangement was previously reported in a 130.6-kb multiresistance plasmid. The DfrB4 protein conferred a >2,000-fold increased trimethoprim resistance on overexpression in E. coli. Our results are consistent with the finding that dfrB4 contributes to clinical trimethoprim resistance. Copyright © 2017 American Society for Microbiology.

  8. Heterogeneous data fusion for brain tumor classification.

    PubMed

    Metsis, Vangelis; Huang, Heng; Andronesi, Ovidiu C; Makedon, Fillia; Tzika, Aria

    2012-10-01

    Current research in biomedical informatics involves analysis of multiple heterogeneous data sets. This includes patient demographics, clinical and pathology data, treatment history, patient outcomes as well as gene expression, DNA sequences and other information sources such as gene ontology. Analysis of these data sets could lead to better disease diagnosis, prognosis, treatment and drug discovery. In this report, we present a novel machine learning framework for brain tumor classification based on heterogeneous data fusion of metabolic and molecular datasets, including state-of-the-art high-resolution magic angle spinning (HRMAS) proton (1H) magnetic resonance spectroscopy and gene transcriptome profiling, obtained from intact brain tumor biopsies. Our experimental results show that our novel framework outperforms any analysis using an individual dataset.

  9. Semantic integration of data on transcriptional regulation

    PubMed Central

    Baitaluk, Michael; Ponomarenko, Julia

    2010-01-01

    Motivation: Experimental and predicted data concerning gene transcriptional regulation are distributed among many heterogeneous sources. However, there are no resources to integrate these data automatically or to provide a ‘one-stop shop’ experience for users seeking information essential for deciphering and modeling gene regulatory networks. Results: IntegromeDB, a semantic graph-based ‘deep-web’ data integration system that automatically captures, integrates and manages publicly available data concerning transcriptional regulation, as well as other relevant biological information, is proposed in this article. The problems associated with data integration are addressed by ontology-driven data mapping, multiple data annotation and heterogeneous data querying, also enabling integration of the user's data. IntegromeDB integrates over 100 experimental and computational data sources relating to genomics, transcriptomics, genetics, and functional and interaction data concerning gene transcriptional regulation in eukaryotes and prokaryotes. Availability: IntegromeDB is accessible through the integrated research environment BiologicalNetworks at http://www.BiologicalNetworks.org Contact: baitaluk@sdsc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20427517

  10. Semantic integration of data on transcriptional regulation.

    PubMed

    Baitaluk, Michael; Ponomarenko, Julia

    2010-07-01

    Experimental and predicted data concerning gene transcriptional regulation are distributed among many heterogeneous sources. However, there are no resources to integrate these data automatically or to provide a 'one-stop shop' experience for users seeking information essential for deciphering and modeling gene regulatory networks. IntegromeDB, a semantic graph-based 'deep-web' data integration system that automatically captures, integrates and manages publicly available data concerning transcriptional regulation, as well as other relevant biological information, is proposed in this article. The problems associated with data integration are addressed by ontology-driven data mapping, multiple data annotation and heterogeneous data querying, also enabling integration of the user's data. IntegromeDB integrates over 100 experimental and computational data sources relating to genomics, transcriptomics, genetics, and functional and interaction data concerning gene transcriptional regulation in eukaryotes and prokaryotes. IntegromeDB is accessible through the integrated research environment BiologicalNetworks at http://www.BiologicalNetworks.org. Contact: baitaluk@sdsc.edu. Supplementary data are available at Bioinformatics online.

  11. miRNA-Mediated Relationships between Cis-SNP Genotypes and Transcript Intensities in Lymphocyte Cell Lines

    PubMed Central

    Zhang, Wensheng; Edwards, Andrea; Zhu, Dongxiao; Flemington, Erik K.; Deininger, Prescott; Zhang, Kun

    2012-01-01

    In metazoans, miRNAs regulate gene expression primarily through binding to target sites in the 3′ UTRs (untranslated regions) of messenger RNAs (mRNAs). Cis-acting variants within, or close to, a gene are crucial in explaining the variability of gene expression measures. Single nucleotide polymorphisms (SNPs) in the 3′ UTRs of genes can affect the base-pairing between miRNAs and mRNAs, and hence disrupt existing target sites (in the reference sequence) or create novel target sites, suggesting a possible mechanism for cis regulation of gene expression. Moreover, because the alleles of different SNPs within a DNA sequence of limited length tend to be in strong linkage disequilibrium (LD), we hypothesize the variants of miRNA target sites caused by SNPs potentially function as bridges linking the documented cis-SNP markers to the expression of the associated genes. A large-scale analysis was herein performed to test this hypothesis. By systematically integrating multiple latest information sources, we found 21 significant gene-level SNP-involved miRNA-mediated post-transcriptional regulation modules (SNP-MPRMs) in the form of SNP-miRNA-mRNA triplets in lymphocyte cell lines for the CEU and YRI populations. Among the cognate genes, six including ALG8, DGKE, GNA12, KLF11, LRPAP1, and MMAB are related to multiple genetic diseases such as depressive disorder and Type-II diabetes. Furthermore, we found that ∼35% of the documented transcript intensity-related cis-SNPs (∼950) in a recent publication are identical to, or in significant linkage disequilibrium (LD) (p<0.01) with, one or multiple SNPs located in miRNA target sites. Based on these associations (or identities), 69 significant exon-level SNP-MPRMs and 12 disease genes were further determined for two populations. These results provide concrete in silico evidence for the proposed hypothesis. The discovered modules warrant additional follow-up in independent laboratory studies. PMID:22348086

  12. Cluster Analysis of Campylobacter jejuni Genotypes Isolated from Small and Medium-Sized Mammalian Wildlife and Bovine Livestock from Ontario Farms.

    PubMed

    Viswanathan, M; Pearl, D L; Taboada, E N; Parmley, E J; Mutschall, S K; Jardine, C M

    2017-05-01

    Using data collected from a cross-sectional study of 25 farms (eight beef, eight swine and nine dairy) in 2010, we assessed clustering of molecular subtypes of C. jejuni based on a Campylobacter-specific 40 gene comparative genomic fingerprinting assay (CGF40) subtypes, using unweighted pair-group method with arithmetic mean (UPGMA) analysis, and multiple correspondence analysis. Exact logistic regression was used to determine which genes differentiate wildlife and livestock subtypes in our study population. A total of 33 bovine livestock (17 beef and 16 dairy) and 26 wildlife (20 raccoon (Procyon lotor), five skunk (Mephitis mephitis) and one mouse (Peromyscus spp.)) C. jejuni isolates were subtyped using CGF40. Dendrogram analysis, based on UPGMA, showed distinct branches separating bovine livestock and mammalian wildlife isolates. Furthermore, two-dimensional multiple correspondence analysis was highly concordant with dendrogram analysis showing clear differentiation between livestock and wildlife CGF40 subtypes. Based on multilevel logistic regression models with a random intercept for farm of origin, we found that isolates in general, and raccoons more specifically, were significantly more likely to be part of the wildlife branch. Exact logistic regression conducted gene by gene revealed 15 genes that were predictive of whether an isolate was of wildlife or bovine livestock isolate origin. Both multiple correspondence analysis and exact logistic regression revealed that in most cases, the presence of a particular gene (13 of 15) was associated with an isolate being of livestock rather than wildlife origin. In conclusion, the evidence gained from dendrogram analysis, multiple correspondence analysis and exact logistic regression indicates that mammalian wildlife carry CGF40 subtypes of C. jejuni distinct from those carried by bovine livestock. Future studies focused on source attribution of C. jejuni in human infections will help determine whether wildlife transmit Campylobacter jejuni directly to humans. © 2016 Blackwell Verlag GmbH.
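    The UPGMA step named in this abstract can be sketched for binary presence/absence fingerprints. This is a minimal illustration, not the authors' analysis pipeline. It assumes the distance between two isolates is the fraction of genes whose presence differs, and it uses hypothetical 6-gene profiles rather than the 40 genes of the CGF40 assay. Isolate names and profiles are invented.

```python
from itertools import combinations


def hamming_fraction(a, b):
    """Fraction of positions at which two equal-length 0/1 profiles differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_distance(members_a, members_b, profiles):
    """Mean pairwise distance between the members of two clusters (UPGMA linkage)."""
    pairs = [(a, b) for a in members_a for b in members_b]
    return sum(hamming_fraction(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def upgma(profiles):
    """Agglomerate isolates; return the merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters; ties go to the lowest indices.
        dist, i, j = min(
            (average_distance(clusters[i], clusters[j], profiles), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical fingerprints (1 = gene present, 0 = absent).
toy_profiles = {
    "cow1": (1, 1, 1, 1, 0, 0),
    "cow2": (1, 1, 1, 0, 0, 0),
    "raccoon1": (0, 0, 0, 1, 1, 1),
    "raccoon2": (0, 0, 1, 1, 1, 1),
}

if __name__ == "__main__":
    for left, right, dist in upgma(toy_profiles):
        print(left, "+", right, "at distance", round(dist, 3))
```

    In this toy data the two livestock profiles and the two wildlife profiles each merge first, and the two groups join only at a much larger distance, which is the kind of branch separation the dendrogram analysis reports.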

  13. Gene flow in environmental Legionella pneumophila leads to genetic and pathogenic heterogeneity within a Legionnaires' disease outbreak.

    PubMed

    McAdam, Paul R; Vander Broek, Charles W; Lindsay, Diane S J; Ward, Melissa J; Hanson, Mary F; Gillies, Michael; Watson, Mick; Stevens, Joanne M; Edwards, Giles F; Fitzgerald, J Ross

    2014-01-01

    Legionnaires' disease is a severe form of pneumonia caused by the environmental bacterium Legionella pneumophila. Outbreaks commonly affect people with known risk factors, but the genetic and pathogenic complexity of L. pneumophila within an outbreak is not well understood. Here, we investigate the etiology of the major Legionnaires' disease outbreak that occurred in Edinburgh, UK, in 2012, by examining the evolutionary history, genome content, and virulence of L. pneumophila clinical isolates. Our high resolution genomic approach reveals that the outbreak was caused by multiple genetic subtypes of L. pneumophila, the majority of which had diversified from a single progenitor through mutation, recombination, and horizontal gene transfer within an environmental reservoir prior to release. In addition, we discover that some patients were infected with multiple L. pneumophila subtypes, a finding which can affect the certainty of source attribution. Importantly, variation in the complement of type IV secretion systems encoded by different genetic subtypes correlates with virulence in a Galleria mellonella model of infection, revealing variation in pathogenic potential among the outbreak source population of L. pneumophila. Taken together, our study indicates previously cryptic levels of pathogen heterogeneity within a Legionnaires' disease outbreak, a discovery that impacts on source attribution for future outbreak investigations. Furthermore, our data suggest that in addition to host immune status, pathogen diversity may be an important influence on the clinical outcome of individual outbreak infections.

  14. Genome sequence of a novel multiple antibiotic resistant member of Erysipelotrichaceae family isolated from a swine manure storage pit

    USDA-ARS's Scientific Manuscript database

    The swine gastro intestinal (GI) tract and stored manure may serve as reservoirs of antibiotic resistance genes as well as sources of novel bacteria. We report the draft genome sequence of “Cottaibacterium suis” strain MTC7, a novel antibiotic resistant bacterium. The strain was isolated from a swin...

  15. Quantitative trait locus mapping of drought and salt tolerance in an introgressed recombinant inbred line population of upland cotton under the greenhouse and field conditions

    USDA-ARS's Scientific Manuscript database

    Drought and salt tolerances are complex traits and controlled by multiple genes, environmental factors and their interactions. Drought and salt stresses can result in more than 50% yield loss in Upland cotton (Gossypium hirsutum L.). G. barbadense L. (the source of Pima cotton) carries desirable tra...

  16. GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis.

    PubMed

    Zheng, Qi; Wang, Xiu-Jie

    2008-07-01

    Gene Ontology (GO) analysis has become a commonly used approach for functional studies of large-scale genomic or transcriptomic data. Although there have been a lot of software with GO-related analysis functions, new tools are still needed to meet the requirements for data generated by newly developed technologies or for advanced analysis purpose. Here, we present a Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), therefore, provides better understanding of the correlations among enriched GO terms; (ii) GOEAST supports analysis for data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) One unique feature of GOEAST is to allow cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them. GOEAST also provides rigorous statistical tests to enhance the reliability of analysis results. GOEAST is freely accessible at http://omicslab.genetics.ac.cn/GOEAST/
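    Identifying statistically overrepresented GO terms is commonly done with a one-sided hypergeometric test, which the following sketch illustrates. The abstract does not specify which tests GOEAST applies, so the choice of test here is an assumption. The function name and all counts are hypothetical, and no multiple-testing correction is shown.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided enrichment p-value P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     number of genes in the query set
    hits:       query genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 20,000 background genes, 200 carry the term,
    # and 10 of a 100-gene query set carry it (1 would be expected by chance).
    p = hypergeom_enrichment_p(20000, 200, 100, 10)
    print(f"enrichment p-value: {p:.3g}")
```

    In practice each GO term is tested separately, so the resulting p-values need correction for multiple testing before terms are reported as enriched.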

  17. Phenome-driven disease genetics prediction toward drug discovery.

    PubMed

    Chen, Yang; Li, Li; Zhang, Guo-Qiang; Xu, Rong

    2015-06-15

    Discerning genetic contributions to diseases not only enhances our understanding of disease mechanisms, but also leads to translational opportunities for drug discovery. Recent computational approaches incorporate disease phenotypic similarities to improve the prediction power of disease gene discovery. However, most current studies used only one data source of human disease phenotype. We present an innovative and generic strategy for combining multiple different data sources of human disease phenotype and predicting disease-associated genes from integrated phenotypic and genomic data. To demonstrate our approach, we explored a new phenotype database from biomedical ontologies and constructed Disease Manifestation Network (DMN). We combined DMN with mimMiner, which was a widely used phenotype database in disease gene prediction studies. Our approach achieved significantly improved performance over a baseline method, which used only one phenotype data source. In the leave-one-out cross-validation and de novo gene prediction analysis, our approach achieved the area under the curves of 90.7% and 90.3%, which are significantly higher than 84.2% (P < e(-4)) and 81.3% (P < e(-12)) for the baseline approach. We further demonstrated that our predicted genes have the translational potential in drug discovery. We used Crohn's disease as an example and ranked the candidate drugs based on the rank of drug targets. Our gene prediction approach prioritized druggable genes that are likely to be associated with Crohn's disease pathogenesis, and our rank of candidate drugs successfully prioritized the Food and Drug Administration-approved drugs for Crohn's disease. We also found literature evidence to support a number of drugs among the top 200 candidates. In summary, we demonstrated that a novel strategy combining unique disease phenotype data with system approaches can lead to rapid drug discovery. nlp. edu/public/data/DMN © The Author 2015. Published by Oxford University Press.

  18. Molecular Diversity of Bacteroidales in Fecal and Environmental Samples and Swine-Associated Subpopulations

    PubMed Central

    Lamendella, Regina; Li, Kent C.; Oerther, Daniel

    2013-01-01

    Several swine-specific microbial source tracking methods are based on PCR assays targeting Bacteroidales 16S rRNA gene sequences. The limited application of these assays can be explained by the poor understanding of their molecular diversity in fecal sources and environmental waters. In order to address this, we studied the diversity of 9,340 partial (>600 bp in length) Bacteroidales 16S rRNA gene sequences from 13 fecal sources and nine feces-contaminated watersheds. The compositions of major Bacteroidales populations were analyzed to determine which host and environmental sequences were contributing to each group. This information allowed us to identify populations which were both exclusive to swine fecal sources and detected in swine-contaminated waters. Phylogenetic and diversity analyses revealed that some markers previously believed to be highly specific to swine populations are shared by multiple hosts, potentially explaining the cross-amplification signals obtained with nontargeted hosts. These data suggest that while many Bacteroidales populations are cosmopolitan, others exhibit a preferential host distribution and may be able to survive different environmental conditions. This study further demonstrates the importance of elucidating the diversity patterns of targeted bacterial groups to develop more inclusive fecal source tracking applications. PMID:23160126

  19. Systematic gene deletions evidences that laccases are involved in several stages of wood degradation in the filamentous fungus Podospora anserina.

    PubMed

    Xie, Ning; Chapeland-Leclerc, Florence; Silar, Philippe; Ruprich-Robert, Gwenaël

    2014-01-01

    Transformation of plant biomass into biofuels may supply environmentally friendly alternative biological sources of energy. Laccases are supposed to be involved in the lysis of lignin, a prerequisite step for efficient breakdown of cellulose into fermentable sugars. The role in development and plant biomass degradation of the nine canonical laccases belonging to three different subfamilies and one related multicopper oxidase of the Ascomycota fungus Podospora anserina was investigated by targeted gene deletion. The 10 genes were inactivated singly, and multiple mutants were constructed by genetic crosses. lac6(Δ), lac8(Δ) and mco(Δ) mutants were significantly reduced in their ability to grow on lignin-containing materials, but also on cellulose and plastic. Furthermore, lac8(Δ), lac7(Δ), mco(Δ) and lac6(Δ) mutants were defective towards resistance to phenolic substrates and H2O2, which may also impact lignocellulose breakdown. Double and multiple mutants were generally more affected than single mutants, evidencing redundancy of function among laccases. Our study provides the first genetic evidences that laccases are major actors of wood utilization in a fungus and that they have multiple roles during this process apart from participation in lignin lysis. © 2013 Society for Applied Microbiology and John Wiley & Sons Ltd.

  20. Origin and Reticulate Evolutionary Process of Wheatgrass Elymus trachycaulus (Triticeae: Poaceae)

    PubMed Central

    Zuo, Hongwei; Wu, Panpan; Wu, Dexiang; Sun, Genlou

    2015-01-01

    To study the origin and evolutionary dynamics of tetraploid Elymus trachycaulus, which has been cytologically defined as containing StH genomes, thirteen accessions of E. trachycaulus were analyzed using two low-copy nuclear genes, Pepc (phosphoenolpyruvate carboxylase) and Rpb2 (the second largest subunit of RNA polymerase II), and one chloroplast region, trnL–trnF (spacer between the tRNA-Leu (UAA) gene and the tRNA-Phe (GAA) gene). Our chloroplast data indicated that Pseudoroegneria (St genome) was the maternal donor of E. trachycaulus. Rpb2 data indicated that the St genome in E. trachycaulus originated from either P. strigosa, P. stipifolia, P. spicata or P. geniculata. The Hordeum (H genome)-like sequences of E. trachycaulus are polyphyletic in the Pepc tree, suggesting that the H genome in E. trachycaulus was contributed by multiple sources, whether due to multiple origins or introgression resulting from subsequent hybridization. Failure to recover the St copy of the Pepc sequence in most accessions of E. trachycaulus might be caused by genome convergent evolution in allopolyploids. Multiple copies of the H-like Pepc sequence from each accession with relatively large deletions and insertions might be caused by either instability of the Pepc sequence in the H genome or incomplete concerted evolution. Our results highlight the complex evolutionary history of E. trachycaulus. PMID:25946188

  1. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    PubMed

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
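
    The abstract names MOSupport and MOConfidence but does not define them. As a point of reference only, the following is a minimal sketch of the standard support and confidence measures from classical association rule mining, which such measures presumably build on. The annotation sets and term labels are invented placeholders, not real GO data, and nothing here reproduces MOAL itself.

```python
from typing import FrozenSet, List

# Hypothetical toy data: each entry is the set of terms annotated to one gene product.
# The labels are placeholders, not real GO identifiers.
ANNOTATIONS: List[FrozenSet[str]] = [
    frozenset({"MF:kinase", "BP:phosphorylation", "CC:cytoplasm"}),
    frozenset({"MF:kinase", "BP:phosphorylation"}),
    frozenset({"MF:kinase", "CC:nucleus"}),
    frozenset({"BP:phosphorylation", "CC:cytoplasm"}),
]


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every term in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Classical confidence of the rule antecedent -> consequent."""
    denom = support(antecedent, transactions)
    if denom == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / denom


rule_lhs = frozenset({"MF:kinase"})
rule_rhs = frozenset({"BP:phosphorylation"})
print(support(rule_lhs | rule_rhs, ANNOTATIONS))
print(confidence(rule_lhs, rule_rhs, ANNOTATIONS))
```

    A cross-ontology rule in this toy setting pairs a term from one sub-ontology (the antecedent) with a term from another (the consequent); how MOAL handles multiple levels of the hierarchy is not covered by this sketch.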

  2. Scuba: scalable kernel-based gene prioritization.

    PubMed

    Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio

    2018-01-25

    The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge, however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large scale predictions are required. Importantly, it is able to efficiently deal both with a large amount of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba integrates also a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba .
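
    To illustrate the general idea of integrating several data sources through kernels, the sketch below combines precomputed kernel (similarity) matrices with fixed weights and ranks candidate genes by their mean similarity to known disease genes. This is a deliberately simplified stand-in: Scuba learns the kernel weights by optimizing the margin distribution, which is not implemented here. All matrices, weights and function names are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def combine_kernels(kernels: List[Matrix], weights: List[float]) -> Matrix:
    """Weighted sum of square kernel matrices of identical size."""
    if len(kernels) != len(weights):
        raise ValueError("one weight per kernel is required")
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, w in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * kernel[i][j]
    return combined


def rank_candidates(kernel: Matrix, positives: List[int], candidates: List[int]) -> List[int]:
    """Order candidates by mean kernel similarity to the known positives, highest first."""
    scores: Dict[int, float] = {
        c: sum(kernel[c][p] for p in positives) / len(positives) for c in candidates
    }
    return sorted(candidates, key=lambda c: scores[c], reverse=True)


# Two invented 4x4 kernels, e.g. one per data source (values are arbitrary).
K_SOURCE_A: Matrix = [[1.0, 0.8, 0.1, 0.2], [0.8, 1.0, 0.2, 0.1],
                      [0.1, 0.2, 1.0, 0.3], [0.2, 0.1, 0.3, 1.0]]
K_SOURCE_B: Matrix = [[1.0, 0.6, 0.5, 0.0], [0.6, 1.0, 0.4, 0.1],
                      [0.5, 0.4, 1.0, 0.2], [0.0, 0.1, 0.2, 1.0]]

K_COMBINED = combine_kernels([K_SOURCE_A, K_SOURCE_B], [0.5, 0.5])
# Gene 0 is the only known disease gene; genes 1-3 are candidates.
print(rank_candidates(K_COMBINED, positives=[0], candidates=[1, 2, 3]))
```

    Fixed equal weights are used purely for illustration; choosing the weights from the data is exactly the part that a multiple kernel learning method addresses.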

  3. Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data.

    PubMed

    Paisitkriangkrai, Sakrapee; Quek, Kelly; Nievergall, Eva; Jabbour, Anissa; Zannettino, Andrew; Kok, Chung Hoow

    2018-06-07

    Recurrent oncogenic fusion genes play a critical role in the development of various cancers and diseases and provide, in some cases, excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new and easy to use software tool that enables biologists to merge RNA-seq information, allowing them to identify recurrent fusion genes, without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis which enables the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify 2 distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a multiple myeloma (MM) samples-enriched cluster and an acute myeloid leukemia (AML) samples-enriched cluster. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM, when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of yet unknown driver fusion genes through cohort comparisons. Additionally, using a 272 primary glioma sample RNA-seq dataset, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool to identify recurrent fusion genes. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets, and may lead to the discovery of new disease subgroups and potentially new driver genes, for which, targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse .
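
    The notion of a recurrent fusion gene can be made concrete with a small counting sketch: a fusion is treated as recurrent if it is called in at least a chosen number of samples. This is only an illustration of the concept, not Co-fuse's pattern-mining or statistical procedure, and it is written in Python rather than R. The sample names, fusion labels and the `min_samples` threshold are invented placeholders.

```python
from collections import Counter
from typing import Dict, Set

# Hypothetical per-sample fusion calls (placeholder names, not real results).
FUSION_CALLS: Dict[str, Set[str]] = {
    "sample_1": {"GENE_A-GENE_B", "GENE_C-GENE_D"},
    "sample_2": {"GENE_C-GENE_D"},
    "sample_3": {"GENE_E-GENE_F"},
    "sample_4": {"GENE_A-GENE_B"},
}


def recurrent_fusions(calls: Dict[str, Set[str]], min_samples: int = 2) -> Dict[str, int]:
    """Return fusions seen in at least `min_samples` samples, with their sample counts."""
    counts: Counter = Counter()
    for fusions in calls.values():
        counts.update(set(fusions))  # count each fusion once per sample
    return {fusion: n for fusion, n in counts.items() if n >= min_samples}


print(recurrent_fusions(FUSION_CALLS))
```

    Comparing such counts between cohorts (for example, two groups of cell lines) is the kind of question the abstract describes, although the tool's actual comparison method is not specified there.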

  4. Screening of the Enterocin-Encoding Genes and Antimicrobial Activity in Enterococcus Species.

    PubMed

    Ogaki, Mayara Baptistucci; Rocha, Katia Real; Terra, Márcia Regina; Furlaneto, Márcia Cristina; Maia, Luciana Furlaneto

    2016-06-28

    In the current study, a total of 135 enterococci strains from different sources were screened for the presence of the enterocin-encoding genes entA, entP, entB, entL50A, and entL50B. The enterocin genes were present at different frequencies, with entA occurring the most frequently, followed by entP and entB; entL50A and entL50B were not detected. The occurrence of single enterocin genes was higher than the occurrence of multiple enterocin gene combinations. The 80 isolates that harbor at least one enterocin-encoding gene (denoted "Gene(+) strains") were screened for antimicrobial activity. A total of 82.5% of the Gene(+) strains inhibited at least one of the indicator strains, and the isolates harboring multiple enterocin-encoding genes inhibited a larger number of indicator strains than isolates harboring a single gene. The indicator strains that exhibited growth inhibition included Listeria innocua strain CLIP 12612 (ATCC BAA-680), Listeria monocytogenes strain CDC 4555, Enterococcus faecalis ATCC 29212, Staphylococcus aureus ATCC 25923, S. aureus ATCC 29213, S. aureus ATCC 6538, Salmonella enteritidis ATCC 13076, Salmonella typhimurium strain UK-1 (ATCC 68169), and Escherichia coli BAC 49LT ETEC. Inhibition due to either bacteriophage lysis or cytolysin activity was excluded. The growth inhibition of antilisterial Gene(+) strains was further tested under different culture conditions. Among the culture media formulations, the MRS agar medium supplemented with 2% (w/v) yeast extract was the best solidified medium for enterocin production. Our findings extend the current knowledge of enterocin-producing enterococci, which may have potential applications as biopreservatives in the food industry due to their capability of controlling food spoilage pathogens.

  5. Genomics of Natural Populations: How Differentially Expressed Genes Shape the Evolution of Chromosomal Inversions in Drosophila pseudoobscura

    PubMed Central

    Fuller, Zachary L.; Haynes, Gwilym D.; Richards, Stephen; Schaeffer, Stephen W.

    2016-01-01

    Chromosomal rearrangements can shape the structure of genetic variation in the genome directly through alteration of genes at breakpoints or indirectly by holding combinations of genetic variants together due to reduced recombination. The third chromosome of Drosophila pseudoobscura is a model system to test hypotheses about how rearrangements are established in populations because its third chromosome is polymorphic for >30 gene arrangements that were generated by a series of overlapping inversion mutations. Circumstantial evidence has suggested that these gene arrangements are selected. Despite the expected homogenizing effects of extensive gene flow, the frequencies of arrangements form gradients or clines in nature, which have been stable since the system was first described >80 years ago. Furthermore, multiple arrangements exist at appreciable frequencies across several ecological niches providing the opportunity for heterokaryotypes to form. In this study, we tested whether genes are differentially expressed among chromosome arrangements in first instar larvae, adult females and males. In addition, we asked whether transcriptional patterns in heterokaryotypes are dominant, semidominant, overdominant, or underdominant. We find evidence for a significant abundance of differentially expressed genes across the inverted regions of the third chromosome, including an enrichment of genes involved in sensory perception for males. We find the majority of loci show additivity in heterokaryotypes. Our results suggest that multiple genes have expression differences among arrangements that were either captured by the original inversion mutation or accumulated after it reached polymorphic frequencies, providing a potential source of genetic variation for selection to act upon. 
    These data suggest that the inversions are favored because of their indirect effect of recombination suppression that has held different combinations of differentially expressed genes together in the various gene arrangement backgrounds. PMID:27401754

  6. Assessment of fecal pollution sources in a small northern-plains watershed using PCR and phylogenetic analyses of Bacteroidetes 16S rRNA gene

    USGS Publications Warehouse

    Lamendella, R.; Domingo, J.W.S.; Oerther, D.B.; Vogel, J.R.; Stoeckel, D.M.

    2007-01-01

    We evaluated the efficacy, sensitivity, host-specificity, and spatial/temporal dynamics of human- and ruminant-specific 16S rRNA gene Bacteroidetes markers used to assess the sources of fecal pollution in a fecally impacted watershed. Phylogenetic analyses of 1271 fecal and environmental 16S rRNA gene clones were also performed to study the diversity of Bacteroidetes in this watershed. The host-specific assays indicated that ruminant feces were present in 28-54% of the water samples and in all sampling seasons, with increasing frequency in downstream sites. The human-targeted assays indicated that only 3-5% of the water samples were positive for human fecal signals, although a higher percentage of human-associated signals (19-24%) were detected in sediment samples. Phylogenetic analysis indicated that 57% of all water clones clustered with yet-to-be-cultured Bacteroidetes species associated with sequences obtained from ruminant feces, further supporting the prevalence of ruminant contamination in this watershed. However, since several clusters contained sequences from multiple sources, future studies need to consider the potential cosmopolitan nature of these bacterial populations when assessing fecal pollution sources using Bacteroidetes markers. Moreover, additional data is needed in order to understand the distribution of Bacteroidetes host-specific markers and their relationship to water quality regulatory standards. © 2006 Federation of European Microbiological Societies.

  7. Patterns of gene flow and selection across multiple species of Acrocephalus warblers: footprints of parallel selection on the Z chromosome.

    PubMed

    Reifová, Radka; Majerová, Veronika; Reif, Jiří; Ahola, Markus; Lindholm, Antero; Procházka, Petr

    2016-06-16

    Understanding the mechanisms and selective forces leading to adaptive radiations and origin of biodiversity is a major goal of evolutionary biology. Acrocephalus warblers are small passerines that underwent an adaptive radiation in the last approximately 10 million years that gave rise to 37 extant species, many of which still hybridize in nature. Acrocephalus warblers have served as model organisms for a wide variety of ecological and behavioral studies, yet our knowledge of mechanisms and selective forces driving their radiation is limited. Here we studied patterns of interspecific gene flow and selection across three European Acrocephalus warblers to get a first insight into mechanisms of radiation of this avian group. We analyzed nucleotide variation at eight nuclear loci in three hybridizing Acrocephalus species with overlapping breeding ranges in Europe. Using an isolation-with-migration model for multiple populations, we found evidence for unidirectional gene flow from A. scirpaceus to A. palustris and from A. palustris to A. dumetorum. Gene flow was higher between genetically more closely related A. scirpaceus and A. palustris than between ecologically more similar A. palustris and A. dumetorum, suggesting that gradual accumulation of intrinsic barriers rather than divergent ecological selection are more efficient in restricting interspecific gene flow in Acrocephalus warblers. Although levels of genetic differentiation between different species pairs were in general not correlated, we found signatures of apparently independent instances of positive selection at the same two Z-linked loci in multiple species. Our study brings the first evidence that gene flow occurred during Acrocephalus radiation and not only between sister species. 
    Interspecific gene flow could thus be an important source of genetic variation in individual Acrocephalus species and could have accelerated adaptive evolution and speciation rate in this avian group by creating novel genetic combinations and new phenotypes. Independent instances of positive selection at the same loci in multiple species indicate an interesting possibility that the same loci might have contributed to reproductive isolation in several speciation events.

  8. Fructose metabolism in the cerebellum.

    PubMed

    Funari, Vincent A; Crandall, James E; Tolan, Dean R

    2007-01-01

    Under normal physiological conditions, the brain utilizes only a small number of carbon sources for energy. Recently, there is growing molecular and biochemical evidence that other carbon sources, including fructose, may play a role in neuro-energetics. Fructose is the number one commercial sweetener in Western civilization with large amounts of fructose being toxic, yet fructose metabolism remains relatively poorly characterized. Fructose is purportedly metabolized via either of two pathways, the fructose-1-phosphate pathway and/or the fructose-6-phosphate pathway. Many early metabolic studies could not clearly discriminate which of these two pathways predominates, nor could they distinguish which cell types in various tissues are capable of fructose metabolism. In addition, the lack of good physiological models, the diet-induced changes in gene expression in many tissues, the involvement of multiple genes in multiple pathways involved in fructose metabolism, and the lack of characterization of some genes involved in fructose metabolism have complicated our understanding of the physiological role of fructose in neuro-energetics. A recent neuro-metabolism study of the cerebellum demonstrated fructose metabolism and co-expression of the genes specific for the fructose-1-phosphate pathway, GLUT5 (glut5) and ketohexokinase (khk), in Purkinje cells, suggesting this as an active pathway in specific neurons. Meanwhile, concern over the rapid increase in dietary fructose, particularly among children, has increased awareness about how fructose is metabolized in vivo and what effects a high fructose diet might have. In this regard, establishment of cellular and molecular studies and physiological characterization of the important and/or deleterious roles fructose plays in the brain is critical. This review will discuss the status of fructose metabolism in the brain with special reference to the cerebellum and the physiological roles of the different pathways.

  9. GeneSCF: a real-time based functional enrichment tool with support for multiple organisms.

    PubMed

    Subhash, Santhilal; Kanduri, Chandrasekhar

    2016-09-13

    High-throughput technologies such as ChIP-sequencing, RNA-sequencing, DNA sequencing and quantitative metabolomics generate a huge volume of data. Researchers often rely on functional enrichment tools to interpret the biological significance of the affected genes from these high-throughput studies. However, currently available functional enrichment tools need to be updated frequently to adapt to new entries from the functional database repositories. Hence there is a need for a simplified tool that can perform functional enrichment analysis by using updated information directly from source databases such as KEGG, Reactome or Gene Ontology. In this study, we focused on designing a command-line tool called GeneSCF (Gene Set Clustering based on Functional annotations) that can predict the functionally relevant biological information for a set of genes in a real-time updated manner. It is designed to handle information from more than 4000 organisms from freely available prominent functional databases like KEGG, Reactome and Gene Ontology. We successfully employed our tool on two published datasets to predict the biologically relevant functional information. The core features of this tool were tested on Linux machines without the need to install additional dependencies. GeneSCF is more reliable compared to other enrichment tools because of its ability to use reference functional databases in real-time to perform enrichment analysis. It is easy to integrate with other pipelines available for downstream analysis of high-throughput data. More importantly, GeneSCF can run multiple gene lists simultaneously on different organisms, thereby saving time for the users. Since the tool is designed to be ready-to-use, there is no need for any complex compilation and installation procedures.
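
    Functional enrichment analysis, as discussed above, usually asks whether a gene list contains more members of an annotated set than expected by chance. The abstract does not say which statistic GeneSCF uses; the sketch below shows one common choice, the one-sided hypergeometric test, purely as an illustration. The function name and the example numbers are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, drawn: int, overlap: int) -> float:
    """One-sided p-value P(X >= overlap) for X ~ Hypergeometric.

    population: number of genes in the background
    annotated:  background genes belonging to the functional term
    drawn:      size of the query gene list
    overlap:    query genes that belong to the term
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Invented example: 10 background genes, 3 in the term, a query list of 3 genes, all 3 in the term.
print(hypergeom_enrichment_p(population=10, annotated=3, drawn=3, overlap=3))
```

    In practice a p-value like this is computed for every term in the reference database and then corrected for multiple testing, a step omitted here.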

  10. A genomic island integrated into recA of Vibrio cholerae contains a divergent recA and provides multi-pathway protection from DNA damage.

    PubMed

    Rapa, Rita A; Islam, Atiqul; Monahan, Leigh G; Mutreja, Ankur; Thomson, Nicholas; Charles, Ian G; Stokes, Harold W; Labbate, Maurizio

    2015-04-01

    Lateral gene transfer (LGT) has been crucial in the evolution of the cholera pathogen, Vibrio cholerae. The two major virulence factors are present on two different mobile genetic elements, a bacteriophage containing the cholera toxin genes and a genomic island (GI) containing the intestinal adhesin genes. Non-toxigenic V. cholerae in the aquatic environment are a major source of novel DNA that allows the pathogen to morph via LGT. In this study, we report a novel GI from a non-toxigenic V. cholerae strain containing multiple genes involved in DNA repair including the recombination repair gene recA that is 23% divergent from the indigenous recA and genes involved in the translesion synthesis pathway. This is the first report of a GI containing the critical gene recA and the first report of a GI that targets insertion into a specific site within recA. We show that possession of the island in Escherichia coli is protective against DNA damage induced by UV-irradiation and DNA targeting antibiotics. This study highlights the importance of genetic elements such as GIs in the evolution of V. cholerae and emphasizes the importance of environmental strains as a source of novel DNA that can influence the pathogenicity of toxigenic strains. © 2014 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  11. Warehousing re-annotated cancer genes for biomarker meta-analysis.

    PubMed

    Orsini, M; Travaglione, A; Capobianco, E

    2013-07-01

    Translational research in cancer genomics assigns a fundamental role to bioinformatics in support of candidate gene prioritization with regard to both biomarker discovery and target identification for drug development. Efforts in both such directions rely on the existence and constant update of large repositories of gene expression data and omics records obtained from a variety of experiments. Users who interactively interrogate such repositories may have problems in retrieving sample fields that present limited associated information, due for instance to incomplete entries or sometimes unusable files. Cancer-specific data sources present similar problems. Given that source integration usually improves data quality, one of the objectives is keeping the computational complexity sufficiently low to allow an optimal assimilation and mining of all the information. In particular, the scope of integrating intraomics data can be to improve the exploration of gene co-expression landscapes, while the scope of integrating interomics sources can be that of establishing genotype-phenotype associations. Both integrations are relevant to cancer biomarker meta-analysis, as the proposed study demonstrates. Our approach is based on re-annotating cancer-specific data available at the EBI's ArrayExpress repository and building a data warehouse aimed at biomarker discovery and validation studies. Cancer genes are organized by tissue, with biomedical and clinical evidence combined to increase reproducibility and consistency of results. For better comparative evaluation, multiple queries have been designed to efficiently address all types of experiments and platforms, and allow for retrieval of sample-related information, such as cell line, disease state and clinical aspects. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  12. Correlated noise-based switches and stochastic resonance in a bistable genetic regulation system

    NASA Astrophysics Data System (ADS)

    Wang, Can-Jun; Yang, Ke-Li

    2016-07-01

    The correlated noise-based switches and stochastic resonance are investigated in a bistable single gene switching system driven by an additive noise (environmental fluctuations) and a multiplicative noise (fluctuations of the degradation rate). The correlation between the two noise sources originates from the lysis-lysogeny pathway system of the λ phage. The steady state probability distribution is obtained by solving the time-independent Fokker-Planck equation, and the effects of the noises are analyzed. The effects of the noises on the switching time between the two stable states (mean first passage time) are investigated by numerical simulation. The stochastic resonance phenomenon is analyzed by the power amplification factor. The results show that the multiplicative noise can induce the switching from "on" → "off" of the protein production, while the additive noise and the correlation between the noise sources can induce the inverse switching "off" → "on". A nonmonotonic behaviour of the average switching time versus the multiplicative noise intensity, for different cross-correlation and additive noise intensities, is observed in the genetic system. There exist optimal values of the additive noise, multiplicative noise and cross-correlation intensities for which the weak signal can be optimally amplified.

  13. Gene: a gene-centered information resource at NCBI.

    PubMed

    Brown, Garth R; Hem, Vichet; Katz, Kenneth S; Ovetsky, Michael; Wallin, Craig; Ermolaeva, Olga; Tolstoy, Igor; Tatusova, Tatiana; Pruitt, Kim D; Maglott, Donna R; Murphy, Terence D

    2015-01-01

    The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
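
    Because Gene records are keyed by integer identifiers and exposed through the Entrez programming utilities, a request for record summaries can be expressed as a simple URL. The sketch below only builds an E-utilities ESummary URL and performs no network access; the helper name is hypothetical, the two identifiers are arbitrary example integers, and NCBI's current usage guidelines (rate limits, API keys) should be checked before sending real requests.

```python
from typing import Iterable
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esummary_url(gene_ids: Iterable[int], retmode: str = "json") -> str:
    """Build an ESummary request URL for one or more Gene identifiers."""
    params = {
        "db": "gene",                                  # query the Gene database
        "id": ",".join(str(g) for g in gene_ids),      # comma-separated integer IDs
        "retmode": retmode,
    }
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"


# Two example integer identifiers; urlencode percent-encodes the comma as %2C.
print(gene_esummary_url([672, 7157]))
```

    For large retrievals, the bulk FTP route mentioned in the abstract is generally a better fit than many individual requests.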

  14. Mapping of powdery mildew resistance gene Pm53 introgressed from Aegilops speltoides into soft red winter wheat.

    PubMed

    Petersen, Stine; Lyerly, Jeanette H; Worthington, Margaret L; Parks, Wesley R; Cowger, Christina; Marshall, David S; Brown-Guedira, Gina; Murphy, J Paul

    2015-02-01

    A powdery mildew resistance gene was introgressed from Aegilops speltoides into winter wheat and mapped to chromosome 5BL. Closely linked markers will permit marker-assisted selection for the resistance gene. Powdery mildew of wheat (Triticum aestivum L.) is a major fungal disease in many areas of the world, caused by Blumeria graminis f. sp. tritici (Bgt). Host plant resistance is the preferred form of disease prevention because it is both economical and environmentally sound. Identification of new resistance sources and closely linked markers enables breeders to utilize these new sources in marker-assisted selection as well as in gene pyramiding. Aegilops speltoides (2n = 2x = 14, genome SS) has been a valuable disease resistance donor. The powdery mildew resistant wheat germplasm line NC09BGTS16 (NC-S16) was developed by backcrossing an Ae. speltoides accession, TAU829, to the susceptible soft red winter wheat cultivar 'Saluda'. NC-S16 was crossed to the susceptible cultivar 'Coker 68-15' to develop F2:3 families for gene mapping. Greenhouse and field evaluations of these F2:3 families indicated that a single gene, designated Pm53, conferred resistance to powdery mildew. Bulked segregant analysis showed that multiple simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers specific to chromosome 5BL segregated with the resistance gene. The gene was flanked by markers Xgwm499, Xwmc759, IWA6024 (0.7 cM proximal) and IWA2454 (1.8 cM distal). Pm36, derived from a different wild wheat relative (T. turgidum var. dicoccoides), had previously been mapped to chromosome 5BL in a durum wheat line. Detached leaf tests revealed that NC-S16 and a genotype carrying Pm36 differed in their responses to each of three Bgt isolates. Pm53 therefore appears to be a new source of powdery mildew resistance.

  15. Functional Interaction Network Construction and Analysis for Disease Discovery.

    PubMed

    Wu, Guanming; Haw, Robin

    2017-01-01

    Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, therefore providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of data by using network modules and increasing the statistical power of the analysis. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of total human genes and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures on how this functional interaction network is constructed by integrating multiple external data sources, extracting functional interactions from human curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained Naïve Bayesian Classifier, and finally constructing the functional interaction database. We also provide an example on how to use ReactomeFIViz for performing network-based data analysis for a list of genes.
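
    The classifier step described above can be pictured with a generic naive Bayes calculation: each evidence source contributes a likelihood under the "functional interaction" and "no interaction" hypotheses, and the contributions are multiplied under an independence assumption. The feature names, probabilities and prior below are invented for illustration and are not taken from the Reactome FI network.

```python
from typing import Dict, Tuple

# Hypothetical likelihoods: feature -> (P(feature present | interaction),
#                                       P(feature present | no interaction)).
LIKELIHOODS: Dict[str, Tuple[float, float]] = {
    "coexpression": (0.8, 0.2),
    "shared_annotation": (0.8, 0.2),
}


def naive_bayes_posterior(prior: float,
                          likelihoods: Dict[str, Tuple[float, float]],
                          observed: Dict[str, bool]) -> float:
    """Posterior probability of an interaction given binary evidence features."""
    pos = prior          # running product for the "interaction" hypothesis
    neg = 1.0 - prior    # running product for the "no interaction" hypothesis
    for name, present in observed.items():
        p_pos, p_neg = likelihoods[name]
        pos *= p_pos if present else (1.0 - p_pos)
        neg *= p_neg if present else (1.0 - p_neg)
    return pos / (pos + neg)


# A protein pair supported by both evidence types, with an assumed prior of 0.5.
print(naive_bayes_posterior(0.5, LIKELIHOODS, {"coexpression": True, "shared_annotation": True}))
```

    In a real setting the likelihoods would be estimated from a training set of curated interactions and non-interacting pairs, and pairs scoring above a chosen cutoff would be added as predicted interactions.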

  16. Transparent mediation-based access to multiple yeast data sources using an ontology driven interface.

    PubMed

    Briache, Abdelaali; Marrakchi, Kamar; Kerzazi, Amine; Navas-Delgado, Ismael; Rossi Hassani, Badr D; Lairini, Khalid; Aldana-Montes, José F

    2012-01-25

    Saccharomyces cerevisiae is recognized as a model system representing a simple eukaryote whose genome can be easily manipulated. Information solicited by scientists on its biological entities (Proteins, Genes, RNAs...) is scattered within several data sources like SGD, Yeastract, CYGD-MIPS, BioGrid, PhosphoGrid, etc. Because of the heterogeneity of these sources, querying them separately and then manually combining the returned results is a complex and time-consuming task for biologists, most of whom are not bioinformatics experts. It also reduces and limits the use that can be made of the available data. To provide transparent and simultaneous access to yeast sources, we have developed YeastMed: an XML and mediator-based system. In this paper, we present our approach in developing this system, which takes advantage of SB-KOM to perform the query transformation needed and a set of Data Services to reach the integrated data sources. The system is composed of a set of modules that depend heavily on XML and Semantic Web technologies. User queries are expressed in terms of a domain ontology through a simple form-based web interface. YeastMed is the first mediation-based system specific for integrating yeast data sources. It was conceived mainly to help biologists find relevant data from multiple data sources simultaneously. Its biologist-friendly interface is easy to use. The system is available at http://www.khaos.uma.es/yeastmed/.

  17. Catabolic regulation analysis of Escherichia coli and its crp, mlc, mgsA, pgi and ptsG mutants

    PubMed Central

    2011-01-01

    Background Most bacteria can use various compounds as carbon sources. These carbon sources can be either co-metabolized or sequentially metabolized, where the latter phenomenon typically occurs as catabolite repression. From the practical application point of view of utilizing lignocellulose for the production of biofuels etc., it is strongly desirable to ferment all sugars obtained by hydrolysis from lignocellulosic materials, where simultaneous consumption of sugars would benefit the formation of bioproducts. However, most organisms consume glucose prior to consumption of other carbon sources, and exhibit diauxic growth. It has been shown by fermentation experiments that simultaneous consumption of sugars can be attained by ptsG, mgsA mutants etc., but its mechanism has not been well understood. It is strongly desirable to understand the mechanism of metabolic regulation for catabolite regulation to improve the performance of fermentation. Results In order to make clear the catabolic regulation mechanism, several continuous cultures were conducted at different dilution rates of 0.2, 0.4, 0.6 and 0.7 h-1 using wild type Escherichia coli. The result indicates that the transcript levels of global regulators such as crp, cra, mlc and rpoS decreased, while those of fadR, iclR, soxR/S increased as the dilution rate increased. These affected the metabolic pathway genes, which in turn affected fermentation result where the specific glucose uptake rate, the specific acetate formation rate, and the specific CO2 evolution rate (CER) were increased as the dilution rate was increased. This was confirmed by the 13C-flux analysis. In order to make clear the catabolite regulation, the effect of crp gene knockout (Δcrp) and crp enhancement (crp+) as well as mlc, mgsA, pgi and ptsG gene knockout on the metabolism was then investigated by the continuous culture at the dilution rate of 0.2 h-1 and by some batch cultures. 
In the case of the Δcrp (and also Δmlc) mutant, the TCA cycle and glyoxylate were repressed, which caused acetate accumulation. In the case of the crp+ mutant, glycolysis, the TCA cycle, and gluconeogenesis were activated, and simultaneous consumption of multiple carbon sources could be attained, but the glucose consumption rate decreased due to repression of ptsG and ptsH by the activation of Mlc. Simultaneous consumption of multiple carbon sources could be attained by mgsA, pgi, and ptsG mutants due to an increase in crp as well as cyaA, while the glucose consumption rate became lower. Conclusions The transcriptional catabolite regulation mechanism was made clear for the wild type E. coli, and its crp, mlc, ptsG, pgi, and mgsA gene knockout mutants. The results indicate that catabolite repression can be relaxed and crp as well as cyaA can be increased by crp+, mgsA, pgi, and ptsG mutants, and thus simultaneous consumption of multiple carbon sources including glucose can be achieved, whereas the glucose uptake rate became lower as compared to wild type due to inactivation of ptsG in all the mutants considered. PMID:21831320

  18. PGASO: A synthetic biology tool for engineering a cellulolytic yeast

    PubMed Central

    2012-01-01

    Background To achieve economical cellulosic ethanol production, a host that can do both cellulosic saccharification and ethanol fermentation is desirable. However, engineering a non-cellulolytic yeast to be such a host requires synthetic biology techniques to transform multiple enzyme genes into its genome. Results A technique, named Promoter-based Gene Assembly and Simultaneous Overexpression (PGASO), that employs overlapping oligonucleotides for recombinatorial assembly of gene cassettes with individual promoters, was developed. PGASO was applied to engineer Kluyveromyces marxianus KY3, which is a thermo- and toxin-tolerant yeast. We obtained a recombinant strain, called KR5, that is capable of simultaneously expressing exoglucanase and endoglucanase (both of Trichoderma reesei), a beta-glucosidase (from a cow rumen fungus), a neomycin phosphotransferase, and a green fluorescent protein. High transformation efficiency and accuracy were achieved, as ~63% of the transformants were confirmed to be correct. KR5 can utilize beta-glycan, cellobiose or CMC as the sole carbon source for growth and can directly convert cellobiose and beta-glycan to ethanol. Conclusions This study provides the first example of multi-gene assembly in a single step in a yeast species other than Saccharomyces cerevisiae. We successfully engineered a yeast host with a five-gene cassette assembly and the new host is capable of co-expressing three types of cellulase genes. Our study shows that PGASO is an efficient tool for simultaneous expression of multiple enzymes in the kefir yeast KY3 and that KY3 can serve as a host for developing synthetic biology tools. PMID:22839502

  19. pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

    PubMed

    Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

    2013-08-01

    With the advent of next-generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  20. Uptake, Results, and Outcomes of Germline Multiple-Gene Sequencing After Diagnosis of Breast Cancer.

    PubMed

    Kurian, Allison W; Ward, Kevin C; Hamilton, Ann S; Deapen, Dennis M; Abrahamse, Paul; Bondarenko, Irina; Li, Yun; Hawley, Sarah T; Morrow, Monica; Jagsi, Reshma; Katz, Steven J

    2018-05-10

    Low-cost sequencing of multiple genes is increasingly available for cancer risk assessment. Little is known about uptake or outcomes of multiple-gene sequencing after breast cancer diagnosis in community practice. To examine the effect of multiple-gene sequencing on the experience and treatment outcomes for patients with breast cancer. For this population-based retrospective cohort study, patients with breast cancer diagnosed from January 2013 to December 2015 and accrued from SEER registries across Georgia and in Los Angeles, California, were surveyed (n = 5080, response rate = 70%). Responses were merged with SEER data and results of clinical genetic tests, either BRCA1 and BRCA2 (BRCA1/2) sequencing only or sequencing that included additional genes (multiple-gene sequencing), provided by 4 laboratories. Type of testing (multiple-gene sequencing vs BRCA1/2-only sequencing), test results (negative, variant of unknown significance, or pathogenic variant), patient experiences with testing (timing of testing, who discussed results), and treatment (strength of patient consideration of, and surgeon recommendation for, prophylactic mastectomy), and prophylactic mastectomy receipt. We defined a patient subgroup with higher pretest risk of carrying a pathogenic variant according to practice guidelines. Among 5026 patients (mean [SD] age, 59.9 [10.7]), 1316 (26.2%) were linked to genetic results from any laboratory. Multiple-gene sequencing increasingly replaced BRCA1/2-only testing over time: in 2013, the rate of multiple-gene sequencing was 25.6% and BRCA1/2-only testing, 74.4%; in 2015, the rate of multiple-gene sequencing was 66.5% and BRCA1/2-only testing, 33.5%. Multiple-gene sequencing was more often ordered by genetic counselors (multiple-gene sequencing, 25.5% and BRCA1/2-only testing, 15.3%) and delayed until after surgery (multiple-gene sequencing, 32.5% and BRCA1/2-only testing, 19.9%). 
Multiple-gene sequencing substantially increased rate of detection of any pathogenic variant (multiple-gene sequencing: higher-risk patients, 12%; average-risk patients, 4.2% and BRCA1/2-only testing: higher-risk patients, 7.8%; average-risk patients, 2.2%) and variants of uncertain significance, especially in minorities (multiple-gene sequencing: white patients, 23.7%; black patients, 44.5%; and Asian patients, 50.9% and BRCA1/2-only testing: white patients, 2.2%; black patients, 5.6%; and Asian patients, 0%). Multiple-gene sequencing was not associated with an increase in the rate of prophylactic mastectomy use, which was highest with pathogenic variants in BRCA1/2 (BRCA1/2, 79.0%; other pathogenic variant, 37.6%; variant of uncertain significance, 30.2%; negative, 35.3%). Multiple-gene sequencing rapidly replaced BRCA1/2-only testing for patients with breast cancer in the community and enabled 2-fold higher detection of clinically relevant pathogenic variants without an associated increase in prophylactic mastectomy. However, important targets for improvement in the clinical utility of multiple-gene sequencing include postsurgical delay and racial/ethnic disparity in variants of uncertain significance.

  1. Integrated Analysis of Mutation Data from Various Sources Identifies Key Genes and Signaling Pathways in Hepatocellular Carcinoma

    PubMed Central

    Wei, Lin; Tang, Ruqi; Lian, Baofeng; Zhao, Yingjun; He, Xianghuo; Xie, Lu

    2014-01-01

    Background Recently, a number of studies have performed genome or exome sequencing of hepatocellular carcinoma (HCC) and identified hundreds or even thousands of mutations in protein-coding genes. However, these studies have only focused on a limited number of candidate genes, and many important mutation resources remain to be explored. Principal Findings In this study, we integrated mutation data obtained from various sources and performed pathway and network analysis. We identified 113 pathways that were significantly mutated in HCC samples and found that the mutated genes included in these pathways contained high percentages of known cancer genes, and damaging genes and also demonstrated high conservation scores, indicating their important roles in liver tumorigenesis. Five classes of pathways that were mutated most frequently included (a) proliferation and apoptosis related pathways, (b) tumor microenvironment related pathways, (c) neural signaling related pathways, (d) metabolic related pathways, and (e) circadian related pathways. Network analysis further revealed that the mutated genes with the highest betweenness coefficients, such as the well-known cancer genes TP53, CTNNB1 and recently identified novel mutated genes GNAL and the ADCY family, may play key roles in these significantly mutated pathways. Finally, we highlight several key genes (e.g., RPS6KA3 and PCLO) and pathways (e.g., axon guidance) in which the mutations were associated with clinical features. Conclusions Our workflow illustrates the increased statistical power of integrating multiple studies of the same subject, which can provide biological insights that would otherwise be masked under individual sample sets. This type of bioinformatics approach is consistent with the necessity of making the best use of the ever increasing data provided in valuable databases, such as TCGA, to enhance the speed of deciphering human cancers. PMID:24988079

  2. Integrated analysis of mutation data from various sources identifies key genes and signaling pathways in hepatocellular carcinoma.

    PubMed

    Zhang, Yuannv; Qiu, Zhaoping; Wei, Lin; Tang, Ruqi; Lian, Baofeng; Zhao, Yingjun; He, Xianghuo; Xie, Lu

    2014-01-01

    Recently, a number of studies have performed genome or exome sequencing of hepatocellular carcinoma (HCC) and identified hundreds or even thousands of mutations in protein-coding genes. However, these studies have only focused on a limited number of candidate genes, and many important mutation resources remain to be explored. In this study, we integrated mutation data obtained from various sources and performed pathway and network analysis. We identified 113 pathways that were significantly mutated in HCC samples and found that the mutated genes included in these pathways contained high percentages of known cancer genes, and damaging genes and also demonstrated high conservation scores, indicating their important roles in liver tumorigenesis. Five classes of pathways that were mutated most frequently included (a) proliferation and apoptosis related pathways, (b) tumor microenvironment related pathways, (c) neural signaling related pathways, (d) metabolic related pathways, and (e) circadian related pathways. Network analysis further revealed that the mutated genes with the highest betweenness coefficients, such as the well-known cancer genes TP53, CTNNB1 and recently identified novel mutated genes GNAL and the ADCY family, may play key roles in these significantly mutated pathways. Finally, we highlight several key genes (e.g., RPS6KA3 and PCLO) and pathways (e.g., axon guidance) in which the mutations were associated with clinical features. Our workflow illustrates the increased statistical power of integrating multiple studies of the same subject, which can provide biological insights that would otherwise be masked under individual sample sets. This type of bioinformatics approach is consistent with the necessity of making the best use of the ever increasing data provided in valuable databases, such as TCGA, to enhance the speed of deciphering human cancers.
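
The network analysis in this abstract ranks mutated genes by betweenness. As a minimal sketch of that measure, the block below implements Brandes' algorithm for an unweighted, undirected graph. The five-node toy network and its node labels are hypothetical and do not come from the study, and the abstract does not say which software or variant of the measure the authors used.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph (adjacency dict)."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search: count shortest paths from the source
        stack = []
        predecessors = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}
        n_paths[source] = 1
        dist = {v: -1 for v in graph}
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:            # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance
        dependency = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted once from each endpoint
    return {v: c / 2.0 for v, c in centrality.items()}

# Hypothetical toy network: HUB links A, B and C; C also links D
toy_network = {
    "HUB": ["A", "B", "C"],
    "A": ["HUB"],
    "B": ["HUB"],
    "C": ["HUB", "D"],
    "D": ["C"],
}
scores = betweenness_centrality(toy_network)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Nodes at the top of `ranked` sit on the most shortest paths between other nodes, which is the sense in which high-betweenness genes are described as potentially playing key roles in the mutated pathways.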

  3. Effects of enamel matrix genes on dental caries are moderated by fluoride exposures

    PubMed Central

    Shaffer, John R.; Carlson, Jenna C.; Stanley, Brooklyn O. C.; Feingold, Eleanor; Cooper, Margaret; Vanyukov, Michael M.; Maher, Brion S.; Slayton, Rebecca L.; Willing, Marcia C.; Reis, Steven E.; McNeil, Daniel W.; Crout, Richard J.; Weyant, Robert J.; Levy, Steven M.; Vieira, Alexandre R.; Marazita, Mary L.

    2014-01-01

    Dental caries (tooth decay) is the most common chronic disease worldwide, affecting most children and adults. Though dental caries is highly heritable, few caries-related genes have been discovered. We investigated whether 18 genetic variants in the group of nonamelogenin enamel matrix genes (AMBN, ENAM, TUFT1, and TFIP11) were associated with dental caries experience in 13 age- and race-stratified samples from six parent studies (N=3,600). Linear regression was used to model genetic associations and test gene-by-fluoride interaction effects for two sources of fluoride: daily tooth brushing and home water fluoride concentration. Meta-analysis was used to combine results across five child and eight adult samples. We observed a statistically significant association of rs2337359 upstream of TUFT1 with dental caries experience via meta-analysis across adult samples (p<0.002) and a suggestive association for multiple variants in TFIP11 across child samples (p<0.05). Moreover, we discovered two genetic variants (rs2337359 upstream of TUFT1 and missense rs7439186 in AMBN) involved in gene-by-fluoride interactions. For each interaction, participants with the risk allele/genotype exhibited greater dental caries experience only if they were not exposed to the source of fluoride. Altogether, these results confirm that variation in enamel matrix genes contributes to individual differences in dental caries liability, and demonstrate that the effects of these genes may be moderated by protective fluoride exposures. In short, genes may exert greater influence on dental caries in unprotected environments, or equivalently, the protective effects of fluoride may obviate the effects of genetic risk alleles. PMID:25373699
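
Combining per-sample regression results by meta-analysis can be sketched as follows. The abstract does not state which meta-analysis method was used, so this block assumes a fixed-effect, inverse-variance-weighted model as one common choice; the effect sizes and standard errors are made-up illustrative numbers, not values from the study.

```python
import math

def fixed_effect_meta(betas, std_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    Returns the pooled effect, its standard error, the z statistic and a
    two-sided p-value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled, pooled_se, z, p_value

# Hypothetical per-sample regression coefficients for one variant
sample_betas = [0.30, 0.10]
sample_ses = [0.10, 0.10]
pooled, pooled_se, z, p_value = fixed_effect_meta(sample_betas, sample_ses)
```

Samples with smaller standard errors receive larger weights, so a pooled estimate can reach significance even when no single stratified sample does.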

  4. Diversity of Antimicrobial Resistance and Virulence Determinants in Pseudomonas aeruginosa Associated with Fresh Vegetables

    PubMed Central

    Allydice-Francis, Kashina; Brown, Paul D.

    2012-01-01

    With the increased focus on healthy eating and consuming raw vegetables, this study assessed the extent of contamination of fresh vegetables by Pseudomonas aeruginosa in Jamaica and examined the antibiotic susceptibility profiles and the presence of various virulence-associated determinants of P. aeruginosa. Analyses indicated that vegetables from retail markets and supermarkets were widely contaminated by P. aeruginosa; produce from markets was more frequently contaminated, but the difference was not significant. Lettuce and carrots were the most frequently contaminated vegetables, while tomatoes were the least. Pigment production (pyoverdine, pyocyanin, pyomelanin and pyorubin), fluorescein and alginate were common in these isolates. Imipenem, gentamicin and ciprofloxacin were the most inhibitory antimicrobial agents. However, isolates were resistant or showed reduced susceptibility to ampicillin, chloramphenicol, sulphamethoxazole/trimethoprim and aztreonam, and up to 35% of the isolates were resistant to four antimicrobial agents. As many as 30% of the isolates were positive for the fpv1 gene, and 13% had multiple genes. Sixty-four percent of the isolates harboured an exoenzyme gene (exoS, exoT, exoU or exoY), and multiple exo genes were common. We conclude that P. aeruginosa is a major contaminant of fresh vegetables, which might be a source of infection for susceptible persons within the community. PMID:23213336

  5. Biological data warehousing system for identifying transcriptional regulatory sites from gene expressions of microarray data.

    PubMed

    Tsou, Ann-Ping; Sun, Yi-Ming; Liu, Chia-Lin; Huang, Hsien-Da; Horng, Jorng-Tzong; Tsai, Meng-Feng; Liu, Baw-Juine

    2006-07-01

    Identification of transcriptional regulatory sites plays an important role in the investigation of gene regulation. For this purpose, we designed and implemented a data warehouse to integrate multiple heterogeneous biological data sources with data types such as text-file, XML, image, MySQL database model, and Oracle database model. The utility of the biological data warehouse in predicting transcriptional regulatory sites of coregulated genes was explored using a synexpression group derived from a microarray study. Both the binding sites of known transcription factors and predicted over-represented (OR) oligonucleotides were demonstrated for the gene group. The potential biological roles of both known nucleotides and one OR nucleotide were demonstrated using bioassays. Therefore, the results from the wet-lab experiments reinforce the power and utility of the data warehouse as an approach to the genome-wide search for important transcription regulatory elements that are the key to many complex biological systems.

  6. Identification of Single- and Multiple-Class Specific Signature Genes from Gene Expression Profiles by Group Marker Index

    PubMed Central

    Tsai, Yu-Shuen; Aguan, Kripamoy; Pal, Nikhil R.; Chung, I-Fang

    2011-01-01

    Informative genes from microarray data can be used to construct prediction model and investigate biological mechanisms. Differentially expressed genes, the main targets of most gene selection methods, can be classified as single- and multiple-class specific signature genes. Here, we present a novel gene selection algorithm based on a Group Marker Index (GMI), which is intuitive, of low-computational complexity, and efficient in identification of both types of genes. Most gene selection methods identify only single-class specific signature genes and cannot identify multiple-class specific signature genes easily. Our algorithm can detect de novo certain conditions of multiple-class specificity of a gene and makes use of a novel non-parametric indicator to assess the discrimination ability between classes. Our method is effective even when the sample size is small as well as when the class sizes are significantly different. To compare the effectiveness and robustness we formulate an intuitive template-based method and use four well-known datasets. We demonstrate that our algorithm outperforms the template-based method in difficult cases with unbalanced distribution. Moreover, the multiple-class specific genes are good biomarkers and play important roles in biological pathways. Our literature survey supports that the proposed method identifies unique multiple-class specific marker genes (not reported earlier to be related to cancer) in the Central Nervous System data. It also discovers unique biomarkers indicating the intrinsic difference between subtypes of lung cancer. We also associate the pathway information with the multiple-class specific signature genes and cross-reference to published studies. We find that the identified genes participate in the pathways directly involved in cancer development in leukemia data. 
Our method offers a promising way to find genes that may be involved in pathways of multiple diseases, and hence opens up the possibility of using an existing drug on other diseases as well as designing a single drug for multiple diseases. PMID:21909426

  7. Examination of the Source and Extended Virulence Genotypes of Escherichia coli Contaminating Retail Poultry Meat

    PubMed Central

    Johnson, Timothy J.; Logue, Catherine M.; Wannemuehler, Yvonne; Kariyawasam, Subhashinie; Doetkott, Curt; DebRoy, Chitrita; White, David G.

    2009-01-01

    Extraintestinal pathogenic Escherichia coli (ExPEC) are major players in human urinary tract infections, neonatal bacterial meningitis, and sepsis. Recently, it has been suggested that there might be a zoonotic component to these infections. To determine whether the E. coli contaminating retail poultry are possible extraintestinal pathogens, and to ascertain the source of these contaminants, they were assessed for their genetic similarities to E. coli incriminated in colibacillosis (avian pathogenic E. coli [APEC]), E. coli isolated from multiple locations of apparently healthy birds at slaughter, and human ExPEC. It was anticipated that the retail poultry isolates would most closely resemble avian fecal E. coli since only apparently healthy birds are slaughtered, and fecal contamination of carcasses is the presumed source of meat contamination. Surprisingly, this supposition proved incorrect, as the retail poultry isolates exhibited gene profiles more similar to APEC than to fecal isolates. These isolates contained a number of ExPEC-associated genes, including those associated with ColV virulence plasmids, and many belonged to the B2 phylogenetic group, known to be virulent in human hosts. Additionally, E. coli isolated from the crops and gizzards of apparently healthy birds at slaughter also contained a higher proportion of ExPEC-associated genes than did the avian fecal isolates examined. Such similarities suggest that the widely held beliefs about the sources of poultry contamination may need to be reassessed. Also, the presence of ExPEC-like clones on retail poultry meat means that we cannot yet rule out poultry as a source of ExPEC human disease. PMID:19580453

  8. An integrated network of Arabidopsis growth regulators and its use for gene prioritization.

    PubMed

    Sabaghian, Ehsan; Drebert, Zuzanna; Inzé, Dirk; Saeys, Yvan

    2015-12-01

    Elucidating the molecular mechanisms that govern plant growth has been an important topic in plant research, and current advances in large-scale data generation call for computational tools that efficiently combine these different data sources to generate novel hypotheses. In this work, we present a novel, integrated network that combines multiple large-scale data sources to characterize growth regulatory genes in Arabidopsis, one of the main plant model organisms. The contributions of this work are twofold: first, we characterized a set of carefully selected growth regulators with respect to their connectivity patterns in the integrated network, and, subsequently, we explored to which extent these connectivity patterns can be used to suggest new growth regulators. Using a large-scale comparative study, we designed new supervised machine learning methods to prioritize growth regulators. Our results show that these methods significantly improve current state-of-the-art prioritization techniques, and are able to suggest meaningful new growth regulators. In addition, the integrated network is made available to the scientific community, providing a rich data source that will be useful for many biological processes, not necessarily restricted to plant growth.

  9. Carbon catabolite regulation in Streptomyces: new insights and lessons learned.

    PubMed

    Romero-Rodríguez, Alba; Rocha, Diana; Ruiz-Villafán, Beatriz; Guzmán-Trampe, Silvia; Maldonado-Carmona, Nidia; Vázquez-Hernández, Melissa; Zelarayán, Augusto; Rodríguez-Sanoja, Romina; Sánchez, Sergio

    2017-09-01

    One of the most significant control mechanisms of the physiological processes in the genus Streptomyces is carbon catabolite repression (CCR). This mechanism controls the expression of genes involved in the uptake and utilization of alternative carbon sources in Streptomyces and is mostly independent of the phosphoenolpyruvate phosphotransferase system (PTS). CCR also affects morphological differentiation and the synthesis of secondary metabolites, although not all secondary metabolite genes are equally sensitive to control by the carbon source. Even when the outcome of CCR in different bacteria is the same, the underlying mechanisms can be rather different. Although glucose usually elicits this phenomenon, other rapidly metabolized carbon sources can also cause CCR. Multiple efforts have been made to understand the mechanism of CCR in this genus. However, a reasonable mechanism to explain the nature of this process in Streptomyces does not yet exist. Several examples of primary and secondary metabolites subject to CCR will be examined in this review. Additionally, recent advances regarding the metabolites and protein factors involved in Streptomyces CCR, as well as their mechanisms, will be described and discussed.

  10. Turning publicly available gene expression data into discoveries using gene set context analysis.

    PubMed

    Ji, Zhicheng; Vokes, Steven A; Dang, Chi V; Ji, Hongkai

    2016-01-08

    Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.

    PubMed

    Eernisse, D J

    1992-04-01

    DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. The gene maps provided, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data are compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
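
Translation with alternate genetic codes and ambiguous nucleotide symbols, as described in this abstract, can be sketched in a few lines. This is not the HyperCard implementation: the function name `translate`, the choice to emit "X" for any codon containing an ambiguous symbol, and the sample sequences are assumptions for illustration. The codon tables follow the standard code and the vertebrate mitochondrial code.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Vertebrate mitochondrial code: four codons differ from the standard code
VERTEBRATE_MITO_CODE = dict(STANDARD_CODE)
VERTEBRATE_MITO_CODE.update({"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"})

def translate(dna, code=STANDARD_CODE):
    """Translate a DNA string codon by codon; trailing partial codons are dropped.

    Codons containing symbols outside A, C, G, T (for example N) become 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        protein.append(code.get(dna[i:i + 3], "X"))
    return "".join(protein)

standard_protein = translate("ATGGCCTGA")                    # TGA read as stop
mito_protein = translate("ATGGCCTGA", VERTEBRATE_MITO_CODE)  # TGA read as Trp
ambiguous_protein = translate("ATGNNNTAA")                   # NNN is unresolved
```

Passing the code table as a parameter is what makes support for alternate genetic codes cheap: a mitochondrial or chloroplast table is just the standard dictionary with a few entries overridden.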

  12. Transcriptional response according to strength of calorie restriction in Saccharomyces cerevisiae.

    PubMed

    Lee, Yae-Lim; Lee, Cheol-Koo

    2008-09-30

    To characterize gene expression that is dependent on the strength of calorie restriction (CR), we obtained transcriptomes at different levels of glucose, which is a major energy and carbon source for budding yeast. To faithfully mimic mammalian CR in yeast culture, we reconstituted and grew seeding yeast cells in fresh 2% YPD media before inoculating into 2%, 1%, 0.5% and 0.25% YPD media to reflect different CR strengths. We collected and characterized 160 genes that responded to CR strength based on the rigorous statistical analyses of multiple test corrected ANOVA (adjusted p0.7). Based on the individual gene studies and the GO Term Finder analysis of the 160 genes, we found that CR dose-dependently and gradually increased mitochondrial function at the transcriptional level. Therefore, we suggest that these 160 genes are markers that respond to CR strength and that they might be useful in elucidating CR mechanisms, especially how stronger CR extends life span further.

  13. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  14. Phylogenetic classification and the universal tree.

    PubMed

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished.

  15. A Bayesian Supertree Model for Genome-Wide Species Tree Reconstruction

    PubMed Central

    De Oliveira Martins, Leonardo; Mallo, Diego; Posada, David

    2016-01-01

    Current phylogenomic data sets highlight the need for species tree methods able to deal with several sources of gene tree/species tree incongruence. At the same time, we need to make the most of all available data. Most species tree methods deal with single processes of phylogenetic discordance, namely, gene duplication and loss, incomplete lineage sorting (ILS) or horizontal gene transfer. In this manuscript, we address the problem of species tree inference from multilocus, genome-wide data sets regardless of the presence of gene duplication and loss and ILS, and therefore without the need to identify orthologs or to use a single individual per species. We do this by extending the idea of Maximum Likelihood (ML) supertrees to a hierarchical Bayesian model where several sources of gene tree/species tree disagreement can be accounted for in a modular manner. We implemented this model in a computer program called guenomu whose inputs are posterior distributions of unrooted gene tree topologies for multiple gene families, and whose output is the posterior distribution of rooted species tree topologies. We conducted extensive simulations to evaluate the performance of our approach in comparison with other species tree approaches able to deal with more than one leaf from the same species. Our method ranked best under simulated data sets, in spite of ignoring branch lengths, and performed well on empirical data, as well as being fast enough to analyze relatively large data sets. Our Bayesian supertree method was also very successful in obtaining better estimates of gene trees, by reducing the uncertainty in their distributions. In addition, our results show that under complex simulation scenarios, gene tree parsimony is also a competitive approach once we consider its speed, in contrast to more sophisticated models. PMID:25281847

  16. Phenome-driven disease genetics prediction toward drug discovery

    PubMed Central

    Chen, Yang; Li, Li; Zhang, Guo-Qiang; Xu, Rong

    2015-01-01

    Motivation: Discerning genetic contributions to diseases not only enhances our understanding of disease mechanisms, but also leads to translational opportunities for drug discovery. Recent computational approaches incorporate disease phenotypic similarities to improve the prediction power of disease gene discovery. However, most current studies used only one data source of human disease phenotype. We present an innovative and generic strategy for combining multiple different data sources of human disease phenotype and predicting disease-associated genes from integrated phenotypic and genomic data. Results: To demonstrate our approach, we explored a new phenotype database from biomedical ontologies and constructed Disease Manifestation Network (DMN). We combined DMN with mimMiner, which was a widely used phenotype database in disease gene prediction studies. Our approach achieved significantly improved performance over a baseline method, which used only one phenotype data source. In the leave-one-out cross-validation and de novo gene prediction analysis, our approach achieved the area under the curves of 90.7% and 90.3%, which are significantly higher than 84.2% (P < e−4) and 81.3% (P < e−12) for the baseline approach. We further demonstrated that our predicted genes have the translational potential in drug discovery. We used Crohn’s disease as an example and ranked the candidate drugs based on the rank of drug targets. Our gene prediction approach prioritized druggable genes that are likely to be associated with Crohn’s disease pathogenesis, and our rank of candidate drugs successfully prioritized the Food and Drug Administration-approved drugs for Crohn’s disease. We also found literature evidence to support a number of drugs among the top 200 candidates. In summary, we demonstrated that a novel strategy combining unique disease phenotype data with system approaches can lead to rapid drug discovery. 
    Availability and implementation: nlp.case.edu/public/data/DMN Contact: rxx@case.edu PMID:26072493

  17. Breeding Vegetables with Increased Content in Bioactive Phenolic Acids.

    PubMed

    Kaushik, Prashant; Andújar, Isabel; Vilanova, Santiago; Plazas, Mariola; Gramazio, Pietro; Herraiz, Francisco Javier; Brar, Navjot Singh; Prohens, Jaime

    2015-10-09

    Vegetables represent a major source of phenolic acids, powerful antioxidants characterized by an organic carboxylic acid function and which present multiple properties beneficial for human health. In consequence, developing new varieties with enhanced content in phenolic acids is an increasingly important breeding objective. Major phenolic acids present in vegetables are derivatives of cinnamic acid and to a lesser extent of benzoic acid. A large diversity in phenolic acids content has been found among cultivars and wild relatives of many vegetable crops. Identification of sources of variation for phenolic acids content can be accomplished by screening germplasm collections, but also through morphological characteristics and origin, as well as by evaluating mutations in key genes. Gene action estimates together with relatively high values for heritability indicate that selection for enhanced phenolic acids content will be efficient. Modern genomics and biotechnological strategies, such as QTL detection, candidate genes approaches and genetic transformation, are powerful tools for identification of genomic regions and genes with a key role in accumulation of phenolic acids in vegetables. However, genetically increasing the content in phenolic acids may also affect other traits important for the success of a variety. We anticipate that the combination of conventional and modern strategies will facilitate the development of a new generation of vegetable varieties with enhanced content in phenolic acids.

  18. The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation

    PubMed Central

    Roger, Andrew J; Hug, Laura A

    2006-01-01

    Determining the relationships among and divergence times for the major eukaryotic lineages remains one of the most important and controversial outstanding problems in evolutionary biology. The sequencing and phylogenetic analyses of ribosomal RNA (rRNA) genes led to the first nearly comprehensive phylogenies of eukaryotes in the late 1980s, and supported a view where cellular complexity was acquired during the divergence of extant unicellular eukaryote lineages. More recently, however, refinements in analytical methods coupled with the availability of many additional genes for phylogenetic analysis showed that much of the deep structure of early rRNA trees was artefactual. Recent phylogenetic analyses of multiple genes and the discovery of important molecular and ultrastructural phylogenetic characters have resolved eukaryotic diversity into six major hypothetical groups. Yet relationships among these groups remain poorly understood because of saturation of sequence changes on the billion-year time-scale, possible rapid radiations of major lineages, phylogenetic artefacts and endosymbiotic or lateral gene transfer among eukaryotes. Estimating the divergence dates between the major eukaryote lineages using molecular analyses is even more difficult than phylogenetic estimation. Error in such analyses comes from a myriad of sources including: (i) calibration fossil dates, (ii) the assumed phylogenetic tree, (iii) the nucleotide or amino acid substitution model, (iv) substitution number (branch length) estimates, (v) the model of how rates of evolution change over the tree, (vi) error inherent in the time estimates for a given model and (vii) how multiple gene data are treated. By reanalysing datasets from recently published molecular clock studies, we show that when errors from these various sources are properly accounted for, the confidence intervals on inferred dates can be very large. Furthermore, estimated dates of divergence vary hugely depending on the methods used and their assumptions. Accurate dating of divergence times among the major eukaryote lineages will require a robust tree of eukaryotes, a much richer Proterozoic fossil record of microbial eukaryotes assignable to extant groups for calibration, more sophisticated relaxed molecular clock methods and many more genes sampled from the full diversity of microbial eukaryotes. PMID:16754613

  19. Inductive matrix completion for predicting gene-disease associations.

    PubMed

    Natarajan, Nagarajan; Dhillon, Inderjit S

    2014-06-15

    Most existing methods for predicting causal disease genes rely on a specific type of evidence, and are therefore limited in terms of applicability. More often than not, the type of evidence available for diseases varies: for example, we may know linked genes, keywords associated with the disease obtained by mining text, or co-occurrence of disease symptoms in patients. Similarly, the type of evidence available for genes varies: for example, specific microarray probes convey information only for certain sets of genes. In this article, we apply a novel matrix-completion method called Inductive Matrix Completion to the problem of predicting gene-disease associations; it combines multiple types of evidence (features) for diseases and genes to learn latent factors that explain the observed gene-disease associations. We construct features from different biological sources such as microarray expression data and disease-related textual data. A crucial advantage of the method is that it is inductive; it can be applied to diseases not seen at training time, unlike traditional matrix-completion approaches and network-based inference methods that are transductive. Comparison with state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database shows that the proposed approach is substantially better: it has close to a one-in-four chance of recovering a true association in the top 100 predictions, compared to the recently proposed Catapult method (second best) that has <15% chance. We demonstrate that the inductive method is particularly effective for a query disease with no previously known gene associations, and for predicting novel genes, i.e. genes not previously linked to diseases. Thus the method is capable of predicting novel genes even for well-characterized diseases. We also validate the novelty of predictions by evaluating the method on recently reported OMIM associations and on associations recently reported in the literature. Source code and datasets can be downloaded from http://bigdata.ices.utexas.edu/project/gene-disease. © The Author 2014. Published by Oxford University Press.
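
    Why an inductive model can score a disease it never saw during training can be sketched in a few lines of Python. The sketch assumes the usual bilinear form of inductive matrix completion, score = x^T W H^T y, where x is a gene feature vector, y is a disease feature vector, and W and H are already-learned factor matrices. The abstract does not spell this formula out, so it is an assumption here. The function names `imc_score` and `rank_genes` and all numbers are hypothetical, and fitting W and H is not shown.

```python
def imc_score(x_gene, W, H, y_disease):
    """Bilinear score x^T W H^T y. W is (gene features x k), H is (disease features x k)."""
    k = len(W[0])
    # Project gene and disease features into the shared k-dimensional latent space.
    gene_latent = [sum(x_gene[a] * W[a][c] for a in range(len(x_gene))) for c in range(k)]
    disease_latent = [sum(y_disease[b] * H[b][c] for b in range(len(y_disease))) for c in range(k)]
    return sum(g * d for g, d in zip(gene_latent, disease_latent))


def rank_genes(gene_features, W, H, y_disease, top=100):
    """Return the `top` gene names with the highest scores for one disease."""
    scores = {name: imc_score(x, W, H, y_disease) for name, x in gene_features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Toy factors with 2 gene features, 2 disease features and k = 2 (illustrative only).
W = [[1.0, 0.0], [0.0, 1.0]]
H = [[1.0, 0.0], [0.0, 1.0]]
genes = {"geneA": [1.0, 0.0], "geneB": [0.0, 1.0]}
new_disease = [3.0, 4.0]  # features of a disease with no known gene associations
print(rank_genes(genes, W, H, new_disease, top=2))
```

    Because the score depends only on feature vectors, a new disease needs features but no previously known associations. That is the sense in which the abstract contrasts inductive with transductive methods.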

  20. Bayesian approach to transforming public gene expression repositories into disease diagnosis databases.

    PubMed

    Huang, Haiyan; Liu, Chun-Chi; Zhou, Xianghong Jasmine

    2010-04-13

    The rapid accumulation of gene expression data has offered unprecedented opportunities to study human diseases. The National Center for Biotechnology Information Gene Expression Omnibus is currently the largest database that systematically documents the genome-wide molecular basis of diseases. However, thus far, this resource has been far from fully utilized. This paper describes the first study to transform public gene expression repositories into an automated disease diagnosis database. Particularly, we have developed a systematic framework, including a two-stage Bayesian learning approach, to achieve the diagnosis of one or multiple diseases for a query expression profile along a hierarchical disease taxonomy. Our approach, including standardizing cross-platform gene expression data and heterogeneous disease annotations, allows analyzing both sources of information in a unified probabilistic system. A high level of overall diagnostic accuracy was shown by cross validation. It was also demonstrated that the power of our method can increase significantly with the continued growth of public gene expression repositories. Finally, we showed how our disease diagnosis system can be used to characterize complex phenotypes and to construct a disease-drug connectivity map.

  1. Genes associated to lactose metabolism illustrate the high diversity of Carnobacterium maltaromaticum.

    PubMed

    Iskandar, Christelle F; Cailliez-Grimal, Catherine; Rahman, Abdur; Rondags, Emmanuel; Remenant, Benoît; Zagorec, Monique; Leisner, Jorgen J; Borges, Frédéric; Revol-Junelles, Anne-Marie

    2016-09-01

    The dairy population of Carnobacterium maltaromaticum is characterized by a high diversity suggesting a high diversity of the genetic traits linked to the dairy process. As lactose is the main carbon source in milk, the genetics of lactose metabolism was investigated in this LAB. Comparative genomic analysis revealed that the species C. maltaromaticum exhibits genes related to the Leloir and the tagatose-6-phosphate (Tagatose-6P) pathways. More precisely, strains can bear genes related to one or both pathways and several strains apparently do not contain homologs related to these pathways. Analysis at the population scale revealed that the Tagatose-6P and the Leloir encoding genes are disseminated in multiple phylogenetic lineages of C. maltaromaticum: genes of the Tagatose-6P pathway are present in the lineages I, II and III, and genes of the Leloir pathway are present in the lineages I, III and IV. These data suggest that these genes evolved thanks to horizontal transfer, genetic duplication and translocation. We hypothesize that the lac and gal genes evolved in C. maltaromaticum according to a complex scenario that mirrors the high population diversity. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Searching for disease-susceptibility loci by testing for Hardy-Weinberg disequilibrium in a gene bank of affected individuals.

    PubMed

    Lee, Wen-Chung

    2003-09-01

    The future of genetic studies of complex human diseases will rely more and more on the epidemiologic association paradigm. The author proposes to scan the genome for disease-susceptibility gene(s) by testing for deviation from Hardy-Weinberg equilibrium in a gene bank of affected individuals. A power formula is presented, which is very accurate as revealed by Monte Carlo simulations. If the disease-susceptibility gene is recessive with an allele frequency of ≤ 0.5 or dominant with an allele frequency of ≥ 0.5, the number of subjects needed by the present method is smaller than that needed by using a case-parents design (using either the transmission/disequilibrium test or the 2-df likelihood ratio test). However, the method cannot detect genes with a multiplicative mode of inheritance, and the validity of the method relies on the assumption that the source population from which the cases arise is in Hardy-Weinberg equilibrium. Thus, it is prone to produce false positive and false negative results. Nevertheless, the method enables rapid gene hunting in an existing gene bank of affected individuals with no extra effort beyond simple calculations.
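
    The kind of "simple calculation" involved can be illustrated with the textbook 1-df chi-square goodness-of-fit test for Hardy-Weinberg proportions at a biallelic marker, applied to genotype counts from affected individuals. This is a generic sketch, not necessarily the exact statistic or the power formula used by the author. The function name `hwe_chi_square` and the counts are hypothetical, and both alleles are assumed to be observed, since an expected count of zero would otherwise cause a division by zero.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical cases with an excess of homozygotes at a marker.
chi2, p_value = hwe_chi_square(40, 20, 40)
print(chi2, p_value)
```

    A small p-value among cases only suggests disease association if the source population itself is in Hardy-Weinberg equilibrium, which is the caveat the abstract raises.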

  3. GeneNetFinder2: Improved Inference of Dynamic Gene Regulatory Relations with Multiple Regulators.

    PubMed

    Han, Kyungsook; Lee, Jeonghoon

    2016-01-01

    A gene involved in complex regulatory interactions may have multiple regulators since gene expression in such interactions is often controlled by more than one gene. Another thing that makes gene regulatory interactions complicated is that regulatory interactions are not static, but change over time during the cell cycle. Most research so far has focused on identifying gene regulatory relations between individual genes in a particular stage of the cell cycle. In this study we developed a method for identifying dynamic gene regulations of several types from the time-series gene expression data. The method can find gene regulations with multiple regulators that work in combination or individually as well as those with single regulators. The method has been implemented as the second version of GeneNetFinder (hereafter called GeneNetFinder2) and tested on several gene expression datasets. Experimental results with gene expression data revealed the existence of genes that are not regulated by individual genes but rather by a combination of several genes. Such gene regulatory relations cannot be found by conventional methods. Our method finds such regulatory relations as well as those with multiple, independent regulators or single regulators, and represents gene regulatory relations as a dynamic network in which different gene regulatory relations are shown in different stages of the cell cycle. GeneNetFinder2 is available at http://bclab.inha.ac.kr/GeneNetFinder and will be useful for modeling dynamic gene regulations with multiple regulators.

  4. The In Vitro Differentiation of GDNF Gene-Engineered Amniotic Fluid-Derived Stem Cells into Renal Tubular Epithelial-Like Cells.

    PubMed

    Lu, Ying; Wang, Zhuojun; Chen, Lu; Wang, Jia; Li, Shulin; Liu, Caixia; Sun, Dong

    2018-05-01

    Amniotic fluid is an alternative source of stem cells, and human amniotic fluid-derived stem cells (AFSCs) obtained from a small amount of amniotic fluid collected during the second trimester represent a novel source for use in regenerative medicine. These AFSCs are characterized by lower diversity, a higher proliferation rate, and a wider differentiation capability than adult mesenchymal stem cells. AFSCs are selected based on the cell surface marker c-kit receptor (CD117) using immunomagnetic sorting. Glial cell line-derived neurotrophic factor (GDNF) is expressed during early kidney development and regulates the proliferation and differentiation of stem cells in vitro. In this study, c-kit-sorted AFSCs were induced toward osteogenic or adipogenic differentiation. AFSCs engineered via the insertion of GDNF were cocultured with mouse renal tubular epithelial cells (mRTECs), which were preconditioned by hypoxia-reoxygenation in vitro. After coculture for 8 days, AFSC differentiation into epithelial-like cells was evaluated by performing immunofluorescence, flow cytometry, and quantitative real-time polymerase chain reaction to identify cells expressing the renal epithelial markers cytokeratin 18 (CK18), E-cadherin, aquaporin-1 (AQP1), and paired box 2 gene (Pax2). The GDNF gene enhanced AFSC differentiation into RTECs. AFSCs possess self-renewal ability and multiple differentiation potential and thus represent a new source of stem cells.

  5. ClusterMine360: a database of microbial PKS/NRPS biosynthesis

    PubMed Central

    Conway, Kyle R.; Boddy, Christopher N.

    2013-01-01

    ClusterMine360 (http://www.clustermine360.ca/) is a database of microbial polyketide and non-ribosomal peptide gene clusters. It takes advantage of crowd-sourcing by allowing members of the community to make contributions while automation is used to help achieve high data consistency and quality. The database currently has >200 gene clusters from >185 compound families. It also features a unique sequence repository containing >10 000 polyketide synthase/non-ribosomal peptide synthetase domains. The sequences are filterable and downloadable as individual or multiple sequence FASTA files. We are confident that this database will be a useful resource for members of the polyketide synthases/non-ribosomal peptide synthetases research community, enabling them to keep up with the growing number of sequenced gene clusters and rapidly mine these clusters for functional information. PMID:23104377

  6. How exaptations facilitated photosensory evolution: Seeing the light by accident.

    PubMed

    Gavelis, Gregory S; Keeling, Patrick J; Leander, Brian S

    2017-07-01

    Exaptations are adaptations that have undergone a major change in function. By recruiting genes from sources originally unrelated to vision, exaptation has allowed for sudden and critical photosensory innovations, such as lenses, photopigments, and photoreceptors. Here we review new or neglected findings, with an emphasis on unicellular eukaryotes (protists), to illustrate how exaptation has shaped photoreception across the tree of life. Protist phylogeny attests to multiple origins of photoreception, as well as the extreme creativity of evolution. By appropriating genes and even entire organelles from foreign organisms via lateral gene transfer and endosymbiosis, protists have cobbled photoreceptors and eyespots from a diverse set of ingredients. While refinement through natural selection is paramount, exaptation helps illustrate how novelties arise in the first place, and is now shedding light on the origins of photoreception itself. © 2017 WILEY Periodicals, Inc.

  7. BubbleGUM: automatic extraction of phenotype molecular signatures and comprehensive visualization of multiple Gene Set Enrichment Analyses.

    PubMed

    Spinelli, Lionel; Carpentier, Sabrina; Montañana Sanchis, Frédéric; Dalod, Marc; Vu Manh, Thien-Phong

    2015-10-19

    Recent advances in the analysis of high-throughput expression data have led to the development of tools that have scaled up their focus from the single-gene to the gene-set level. For example, the popular Gene Set Enrichment Analysis (GSEA) algorithm can detect moderate but coordinated expression changes of groups of presumably related genes between pairs of experimental conditions. This considerably improves extraction of information from high-throughput gene expression data. However, although many gene sets covering a large panel of biological fields are available in public databases, the ability to generate home-made gene sets relevant to one's biological question is crucial but remains a substantial challenge to most biologists lacking statistical or bioinformatic expertise. This is all the more the case when attempting to define a gene set specific to one condition compared to many others. Thus, there is a crucial need for easy-to-use software for the generation of relevant home-made gene sets from complex datasets, their use in GSEA, and the correction of the results when applied to multiple comparisons of many experimental conditions. We developed BubbleGUM (GSEA Unlimited Map), a tool that automatically extracts molecular signatures from transcriptomic data and performs exhaustive GSEA with multiple testing correction. One original feature of BubbleGUM notably resides in its capacity to integrate and compare numerous GSEA results into an easy-to-grasp graphical representation. We applied our method to generate transcriptomic fingerprints for murine cell types and to assess their enrichments in human cell types. This analysis allowed us to confirm homologies between mouse and human immunocytes. BubbleGUM is open-source software that automatically generates molecular signatures from complex expression datasets and directly assesses their enrichment by GSEA on independent datasets. Enrichments are displayed in a graphical output that helps interpret the results. This innovative methodology has recently been used to answer important questions in functional genomics, such as the degree of similarities between microarray datasets from different laboratories or with different experimental models or clinical cohorts. BubbleGUM is executable through an intuitive interface so that both bioinformaticians and biologists can use it. It is available at http://www.ciml.univ-mrs.fr/applications/BubbleGUM/index.html .

  8. Multiple ice-binding proteins of probable prokaryotic origin in an Antarctic lake alga, Chlamydomonas sp. ICE-MDV (Chlorophyceae).

    PubMed

    Raymond, James A; Morgan-Kiss, Rachael

    2017-08-01

    Ice-associated algae produce ice-binding proteins (IBPs) to prevent freezing damage. The IBPs of the three chlorophytes that have been examined so far share little similarity across species, making it likely that they were acquired by horizontal gene transfer (HGT). To clarify the importance and source of IBPs in chlorophytes, we sequenced the IBP genes of another Antarctic chlorophyte, Chlamydomonas sp. ICE-MDV (Chlamy-ICE). Genomic DNA and total RNA were sequenced and screened for known ice-associated genes. Chlamy-ICE has as many as 50 IBP isoforms, indicating that they have an important role in survival. The IBPs are of the DUF3494 type and have similar exon structures. The DUF3494 sequences are much more closely related to prokaryotic sequences than they are to sequences in other chlorophytes, and the chlorophyte IBP and ribosomal 18S phylogenies are dissimilar. The multiple IBP isoforms found in Chlamy-ICE and other algae may allow the algae to adapt to a greater variety of ice conditions than prokaryotes, which typically have a single IBP gene. The predicted structure of the DUF3494 domain has an ice-binding face with an orderly array of hydrophilic side chains. The results indicate that Chlamy-ICE acquired its IBP genes by HGT in a single event. The acquisitions of IBP genes by this and other species of Antarctic algae by HGT appear to be key evolutionary events that allowed algae to extend their ranges into polar environments. © 2017 Phycological Society of America.

  9. Pichia pastoris regulates its gene-specific response to different carbon sources at the transcriptional, rather than the translational, level.

    PubMed

    Prielhofer, Roland; Cartwright, Stephanie P; Graf, Alexandra B; Valli, Minoska; Bill, Roslyn M; Mattanovich, Diethard; Gasser, Brigitte

    2015-03-11

    The methylotrophic, Crabtree-negative yeast Pichia pastoris is widely used as a heterologous protein production host. Strong inducible promoters derived from methanol utilization genes or constitutive glycolytic promoters are typically used to drive gene expression. Notably, genes involved in methanol utilization are not only repressed by the presence of glucose, but also by glycerol. This unusual regulatory behavior prompted us to study the regulation of carbon substrate utilization in different bioprocess conditions on a genome wide scale. We performed microarray analysis on the total mRNA population as well as mRNA that had been fractionated according to ribosome occupancy. Translationally quiescent mRNAs were defined as being associated with single ribosomes (monosomes) and highly-translated mRNAs with multiple ribosomes (polysomes). We found that despite their lower growth rates, global translation was most active in methanol-grown P. pastoris cells, followed by excess glycerol- or glucose-grown cells. Transcript-specific translational responses were found to be minimal, while extensive transcriptional regulation was observed for cells grown on different carbon sources. Due to their respiratory metabolism, cells grown in excess glucose or glycerol had very similar expression profiles. Genes subject to glucose repression were mainly involved in the metabolism of alternative carbon sources including the control of glycerol uptake and metabolism. Peroxisomal and methanol utilization genes were confirmed to be subject to carbon substrate repression in excess glucose or glycerol, but were found to be strongly de-repressed in limiting glucose-conditions (as are often applied in fed batch cultivations) in addition to induction by methanol. P. pastoris cells grown in excess glycerol or glucose have similar transcript profiles in contrast to S. cerevisiae cells, in which the transcriptional response to these carbon sources is very different. 
    The main response to different growth conditions in P. pastoris is transcriptional; translational regulation was not transcript-specific. The high proportion of mRNAs associated with polysomes in methanol-grown cells is a major finding of this study; it reveals that high productivity during methanol induction is directly linked to the growth condition and not only to promoter strength.

  10. Numbers of genes in the NBS and RLK families vary by more than four-fold within a plant species and are regulated by multiple factors.

    PubMed

    Zhang, Meiping; Wu, Yen-Hsuan; Lee, Mi-Kyung; Liu, Yun-Hua; Rong, Ying; Santos, Teofila S; Wu, Chengcang; Xie, Fangming; Nelson, Randall L; Zhang, Hong-Bin

    2010-10-01

    Many genes exist in the form of families; however, little is known about their size variation, evolution and biology. Here, we present the size variation and evolution of the nucleotide-binding site (NBS)-encoding gene family and receptor-like kinase (RLK) gene family in Oryza, Glycine and Gossypium. The sizes of both families vary several-fold not only among species but, surprisingly, also within a species. The size variations of the gene families are shown to correlate with each other, indicating their interactions, and to be driven by natural selection, artificial selection and genome size variation, but likely not by polyploidization. The numbers of genes in the families in a polyploid species are similar to those of one of its diploid donors, suggesting that polyploidization plays little role in the expansion of the gene families and that organisms tend not to maintain their 'surplus' genes in the course of evolution. Furthermore, it is found that the size variations of both gene families are associated with organisms' phylogeny, suggesting their roles in speciation and evolution. Since both selection and speciation act on organisms' morphological, physiological and biological variation, our results indicate that the variation of gene family size provides a source of genetic variation and evolution.

  11. High occurrence of Helicobacter pylori in raw goat, sheep and cow milk inferred by glmM gene: a risk of food-borne infection?

    PubMed

    Quaglia, N C; Dambrosio, A; Normanno, G; Parisi, A; Patrono, R; Ranieri, G; Rella, A; Celano, G V

    2008-05-10

    Helicobacter pylori is an organism widespread in humans and sometimes responsible for serious illnesses, such as gastric and duodenal ulcers, MALToma and even gastric cancer. It has been hypothesized that the infection route of H. pylori involves multiple pathways including food-borne transmission, as the microorganism has been detected in foods such as sheep and cow milk. This work reports the results of a survey conducted to investigate the presence of H. pylori in raw goat, sheep and cow milk produced in Southern Italy, employing a Nested Polymerase Chain Reaction (Nested-PCR) assay for the detection of the phosphoglucosamine mutase gene (glmM) as a screening method, followed by conventional bacteriological isolation. Out of the 400 raw milk samples examined, 139 (34.7%) tested positive for the glmM gene, but no strains were isolated. In this work, H. pylori DNA was detected for the first time in raw goat milk samples, 41 (25.6%) of which were positive. The results warrant further investigation of the contamination source(s) of the milk samples and of the major impact that this may have on consumers.

  12. The aldehyde dehydrogenase, AldA, is essential for L-1,2-propanediol utilization in laboratory-evolved Escherichia coli.

    PubMed

    Aziz, Ramy K; Monk, Jonathan M; Andrews, Kathleen A; Nhan, Jenny; Khaw, Valerie L; Wong, Hesper; Palsson, Bernhard O; Charusanti, Pep

    2017-01-01

    Most Escherichia coli strains are naturally unable to grow on 1,2-propanediol (PDO) as a sole carbon source. Recently, however, a K-12 descendant E. coli strain was evolved to grow on 1,2-PDO, and it was hypothesized that this evolved ability was dependent on the aldehyde dehydrogenase, AldA, which is highly conserved among members of the family Enterobacteriaceae. To test this hypothesis, we first performed computational model simulation, which confirmed the essentiality of the aldA gene for 1,2-PDO utilization by the evolved PDO-degrading E. coli. Next, we deleted the aldA gene from the evolved strain, and this deletion was sufficient to abolish the evolved phenotype. On re-introducing the gene on a plasmid, the evolved phenotype was restored. These findings provide experimental evidence for the computationally predicted role of AldA in 1,2-PDO utilization, and represent a good example of E. coli robustness, demonstrated by the bacterial deployment of a generalist enzyme (here AldA) in multiple pathways to survive carbon starvation and to grow on a non-native substrate when no native carbon source is available. Copyright © 2016 Elsevier GmbH. All rights reserved.

  13. Prevalence of Antibiotic-Resistant Escherichia coli in Drinking Water Sources in Hangzhou City

    PubMed Central

    Chen, Zhaojun; Yu, Daojun; He, Songzhe; Ye, Hui; Zhang, Lei; Wen, Yanping; Zhang, Wenhui; Shu, Liping; Chen, Shuchang

    2017-01-01

    This study investigated the distribution of antibiotic resistant Escherichia coli (E. coli) and examined the possible relationship between water quality parameters and antibiotic resistance from two different drinking water sources (the Qiantang River and the Dongtiao Stream) in Hangzhou city of China. E. coli isolates were tested for their susceptibility to 18 antibiotics. Most of the isolates were resistant to tetracycline (TE), followed by ampicillin (AM), piperacillin (PIP), trimethoprim/sulfamethoxazole (SXT), and chloramphenicol (C). The antibiotic resistance rates of E. coli isolates from the two water sources were similar. For E. coli isolates from the Qiantang River, antibiotic resistance rates decreased from up- to downstream. Seasonally, the dry and wet seasons had little impact on antibiotic resistance. Spearman's rank correlation revealed significant correlation between resistance to TE and phenicols or ciprofloxacin (CIP), as well as quinolones (ciprofloxacin and levofloxacin) and cephalosporins or gentamicin (GM). Pearson's chi-square tests found certain water parameters such as nutrient concentration were strongly associated with resistance to some of the antibiotics. In addition, tet genes were detected from all 82 TE-resistant E. coli isolates, and most of the isolates (81.87%) contained multiple tet genes, which displayed 14 different combinations. Collectively, this study provided baseline data on antibiotic resistance of drinking water sources in Hangzhou city, which indicates drinking water sources could be the reservoir of antibiotic resistance, potentially presenting a public health risk. PMID:28670309

  14. Prevalence of Antibiotic-Resistant Escherichia coli in Drinking Water Sources in Hangzhou City.

    PubMed

    Chen, Zhaojun; Yu, Daojun; He, Songzhe; Ye, Hui; Zhang, Lei; Wen, Yanping; Zhang, Wenhui; Shu, Liping; Chen, Shuchang

    2017-01-01

    This study investigated the distribution of antibiotic resistant Escherichia coli (E. coli) and examined the possible relationship between water quality parameters and antibiotic resistance from two different drinking water sources (the Qiantang River and the Dongtiao Stream) in Hangzhou city of China. E. coli isolates were tested for their susceptibility to 18 antibiotics. Most of the isolates were resistant to tetracycline (TE), followed by ampicillin (AM), piperacillin (PIP), trimethoprim/sulfamethoxazole (SXT), and chloramphenicol (C). The antibiotic resistance rates of E. coli isolates from the two water sources were similar. For E. coli isolates from the Qiantang River, antibiotic resistance rates decreased from up- to downstream. Seasonally, the dry and wet seasons had little impact on antibiotic resistance. Spearman's rank correlation revealed significant correlation between resistance to TE and phenicols or ciprofloxacin (CIP), as well as quinolones (ciprofloxacin and levofloxacin) and cephalosporins or gentamicin (GM). Pearson's chi-square tests found certain water parameters such as nutrient concentration were strongly associated with resistance to some of the antibiotics. In addition, tet genes were detected from all 82 TE-resistant E. coli isolates, and most of the isolates (81.87%) contained multiple tet genes, which displayed 14 different combinations. Collectively, this study provided baseline data on antibiotic resistance of drinking water sources in Hangzhou city, which indicates drinking water sources could be the reservoir of antibiotic resistance, potentially presenting a public health risk.

  15. Environmental Spread of New Delhi Metallo-β-Lactamase-1-Producing Multidrug-Resistant Bacteria in Dhaka, Bangladesh

    PubMed Central

    Islam, Moydul; Hasan, Rashedul; Hossain, M. Iqbal; Nabi, Ashikun; Rahman, Mahdia; Goessens, Wil H. F.; Endtz, Hubert P.; Faruque, Shah M.

    2017-01-01

    ABSTRACT Resistance to carbapenem antibiotics through the production of New Delhi metallo-β-lactamase-1 (NDM-1) constitutes an emerging challenge in the treatment of bacterial infections. To monitor the possible source of the spread of these organisms in Dhaka, Bangladesh, we conducted a comparative analysis of wastewater samples from hospital-adjacent areas (HAR) and from community areas (COM), as well as public tap water samples, for the occurrence and characteristics of NDM-1-producing bacteria. Of 72 HAR samples tested, 51 (71%) samples were positive for NDM-1-producing bacteria, as evidenced by phenotypic tests and the presence of the blaNDM-1 gene, compared to 5 of 41 (12.1%) COM samples (P < 0.001). All tap water samples were negative for NDM-1-producing bacteria. Klebsiella pneumoniae (44%) was the predominant bacterial species among blaNDM-1-positive isolates, followed by Escherichia coli (29%), Acinetobacter spp. (15%), and Enterobacter spp. (9%). These bacteria were also positive for one or more other antibiotic resistance genes, including blaCTX-M-1 (80%), blaCTX-M-15 (63%), blaTEM (76%), blaSHV (33%), blaCMY-2 (16%), blaOXA-48-like (2%), blaOXA-1 (53%), and blaOXA-47-like (60%) genes. Around 40% of the isolates contained a qnr gene, while 50% had 16S rRNA methylase genes. The majority of isolates hosted multiple plasmids, and plasmids of 30 to 50 MDa carrying blaNDM-1 were self-transmissible. Our results highlight a number of issues related to the characteristics and source of spread of multidrug-resistant bacteria as a potential public health threat. In view of the existing practice of discharging untreated liquid waste into the environment, hospitals in Dhaka city contribute to the potential dissemination of NDM-1-producing bacteria into the community. IMPORTANCE Infections caused by carbapenemase-producing Enterobacteriaceae are extremely difficult to manage due to their marked resistance to a wide range of antibiotics. NDM-1 is the most recently described carbapenemase, and the blaNDM-1 gene, which encodes NDM-1, is located on self-transmissible plasmids that also carry a considerable number of other antibiotic resistance genes. The present study shows a high prevalence of NDM-1-producing organisms in the wastewater samples from hospital-adjacent areas as a potential source for the spread of these organisms to community areas in Dhaka, Bangladesh. The study also examines the characteristics of the isolates and their potential to horizontally transmit the resistance determinants. The significance of our research is in identifying the mode of spread of multiple-antibiotic-resistant organisms, which will allow the development of containment measures, leading to broader impacts in reducing their spread to the community. PMID:28526792
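
    The abstract does not state which test produced P < 0.001 for the HAR versus COM comparison. As a rough plausibility check only, the sketch below assumes a Pearson chi-square test (no continuity correction) on the 2x2 table of positive and negative samples built from the counts quoted in the abstract; the function names are illustrative and not taken from the study.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom.
    For df = 1 this equals erfc(sqrt(chi2 / 2))."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Counts quoted in the abstract: 51 of 72 HAR samples and 5 of 41 COM samples positive.
har_pos, har_total = 51, 72
com_pos, com_total = 5, 41

chi2 = chi_square_2x2(har_pos, har_total - har_pos,
                      com_pos, com_total - com_pos)
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

    Under this assumed test the statistic is about 35.9 on one degree of freedom, far above the 10.83 threshold for P = 0.001, which is consistent with the significance level reported.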

  16. Dinucleotide controlled null models for comparative RNA gene prediction.

    PubMed

    Gesell, Tanja; Washietl, Stefan

    2008-05-27

    Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance based approach, a tree is estimated under this model which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm giving a new variant of a thermodynamic structure based RNA gene finding program that is not biased by the dinucleotide content. SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. 
SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: http://sourceforge.net/projects/sissiz.
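
    To illustrate why dinucleotide-preserving controls matter, the sketch below counts overlapping dinucleotides before and after a naive mononucleotide shuffle. This is not the SISSIz algorithm, which simulates whole alignments under a phylogenetic model; the toy sequence and the function names are assumptions made for illustration.

```python
import random
from collections import Counter

def dinucleotide_counts(seq):
    """Count overlapping dinucleotides (adjacent base pairs) in a sequence."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def mononucleotide_shuffle(seq, seed=0):
    """Naive control: permute the bases, which preserves base composition
    but makes no attempt to preserve dinucleotide content."""
    bases = list(seq)
    random.Random(seed).shuffle(bases)
    return "".join(bases)

seq = "GCGCGCGCGCATATATATAT"  # toy sequence, not real data
shuffled = mononucleotide_shuffle(seq)

same_bases = Counter(seq) == Counter(shuffled)
same_dinucleotides = dinucleotide_counts(seq) == dinucleotide_counts(shuffled)
print("base composition preserved:", same_bases)
print("dinucleotide content preserved:", same_dinucleotides)
```

    A naive shuffle always keeps the base composition but usually changes the dinucleotide counts, so stacking-energy-based folding scores on such controls are not directly comparable with those of the native sequence.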

  17. Action of multiple intra-QTL genes concerted around a co-localized transcription factor underpins a large effect QTL

    PubMed Central

    Dixit, Shalabh; Kumar Biswal, Akshaya; Min, Aye; Henry, Amelia; Oane, Rowena H.; Raorane, Manish L.; Longkumer, Toshisangba; Pabuayon, Isaiah M.; Mutte, Sumanth K.; Vardarajan, Adithi R.; Miro, Berta; Govindan, Ganesan; Albano-Enriquez, Blesilda; Pueffeld, Mandy; Sreenivasulu, Nese; Slamet-Loedin, Inez; Sundarvelpandian, Kalaipandian; Tsai, Yuan-Ching; Raghuvanshi, Saurabh; Hsing, Yue-Ie C.; Kumar, Arvind; Kohli, Ajay

    2015-01-01

    Sub-QTLs and multiple intra-QTL genes are hypothesized to underpin large-effect QTLs. Known QTLs over gene families, biosynthetic pathways or certain traits represent functional gene-clusters of genes of the same gene ontology (GO). Gene-clusters containing genes of different GO have not been elaborated, except in silico as coexpressed genes within QTLs. Here we demonstrate the requirement of multiple intra-QTL genes for the full impact of QTL qDTY12.1 on rice yield under drought. Multiple lines of evidence are presented for the need of the transcription factor ‘no apical meristem’ (OsNAM12.1) and its co-localized target genes of separate GO categories for qDTY12.1 function, raising a regulon-like model of genetic architecture. The molecular underpinnings of qDTY12.1 support its effectiveness in further improving a drought tolerant genotype and its validity in multiple genotypes/ecosystems/environments. Resolving the combinatorial value of OsNAM12.1 with individual intra-QTL genes notwithstanding, identification and analyses of qDTY12.1 has fast-tracked rice improvement towards food security. PMID:26507552

  18. Isolation and identification of multidrug-resistant Staphylococcus haemolyticus from a laboratory-breeding mouse.

    PubMed

    Huang, Fengying; Meng, Qiuping; Tan, Guanghong; Huang, Yonghao; Wang, Hua; Mei, Wenli; Dai, Haofu

    2011-06-01

    To analyze and identify a bacterial strain isolated from a laboratory-breeding mouse far away from a hospital. The phenotype of the isolate was investigated by conventional microbiological methods, including Gram staining, colony morphology, tests for haemolysis, catalase and coagulase, and antimicrobial susceptibility testing. The mecA and 16S rRNA genes were amplified by the polymerase chain reaction (PCR) and sequenced. The base sequence of the PCR product was compared with known 16S rRNA gene sequences in the GenBank database by phylogenetic analysis and multiple sequence alignment. The isolate in this study was a Gram-positive, coagulase-negative, and catalase-positive coccus. The isolate was resistant to oxacillin, methicillin, penicillin, ampicillin, cefazolin, ciprofloxacin, erythromycin, and other agents. PCR results indicated that the isolate was mecA gene positive and its 16S rRNA sequence was 1,465 bp. Phylogenetic analysis of the resultant 16S rRNA indicated the isolate belonged to the genus Staphylococcus, and multiple sequence alignment showed that the isolate was Staphylococcus haemolyticus, with only one base difference from the corresponding 16S rRNA deposited in GenBank. 16S rRNA gene sequencing is a suitable technique for non-specialist researchers. Laboratory animals are possible sources of lethal pathogens, and researchers must adopt protective measures when they manipulate animals. Copyright © 2011 Hainan Medical College. Published by Elsevier B.V. All rights reserved.

  19. Multiple homologous genes knockout (KO) by CRISPR/Cas9 system in rabbit.

    PubMed

    Liu, Huan; Sui, Tingting; Liu, Di; Liu, Tingjun; Chen, Mao; Deng, Jichao; Xu, Yuanyuan; Li, Zhanjun

    2018-03-20

    The CRISPR/Cas9 system is a highly efficient and convenient genome editing tool, which has been widely used for single or multiple gene mutation in a variety of organisms. Disruption of multiple homologous genes, which have similar DNA sequences and gene function, is required for the study of the desired phenotype. In this study, to test whether the CRISPR/Cas9 system works on the mutation of multiple homologous genes, a single guide RNA (sgRNA) targeting three fucosyltransferase-encoding genes (FUT1, FUT2 and SEC1) was designed. As expected, triple gene mutation of FUT1, FUT2 and SEC1 could be achieved simultaneously via an sgRNA-mediated CRISPR/Cas9 system. In addition, significantly reduced serum fucosyltransferase enzyme activity was found in the triple-gene-mutation rabbits. Thus, we provide the first evidence that multiple homologous gene knockout (KO) can be achieved efficiently by an sgRNA-mediated CRISPR/Cas9 system in mammals, which could facilitate genotype-to-phenotype studies of homologous genes in the future. Copyright © 2018 Elsevier B.V. All rights reserved.

  20. Contamination with bacterial zoonotic pathogen genes in U.S. streams influenced by varying types of animal agriculture.

    PubMed

    Haack, Sheridan K; Duris, Joseph W; Kolpin, Dana W; Focazio, Michael J; Meyer, Michael T; Johnson, Heather E; Oster, Ryan J; Foreman, William T

    2016-09-01

    Animal waste, stream water, and streambed sediment from 19 small (<32 km²) watersheds in 12 U.S. states having either no major animal agriculture (control, n=4), or predominantly beef (n=4), dairy (n=3), swine (n=5), or poultry (n=3) were tested for: 1) cholesterol, coprostanol, estrone, and fecal indicator bacteria (FIB) concentrations, and 2) shiga-toxin producing and enterotoxigenic Escherichia coli, Salmonella, Campylobacter, and pathogenic and vancomycin-resistant enterococci by polymerase chain reaction (PCR) on enrichments, and/or direct quantitative PCR. Pathogen genes were most frequently detected in dairy wastes, followed by beef, swine and poultry wastes in that order; there was only one detection of an animal-source-specific pathogen gene (stx1) in any water or sediment sample in any control watershed. Post-rainfall pathogen gene numbers in stream water were significantly correlated with FIB, cholesterol and coprostanol concentrations, and were most highly correlated in dairy watershed samples collected from 3 different states. Although collected across multiple states and ecoregions, animal-waste gene profiles were distinctive via discriminant analysis. Stream water gene profiles could also be discriminated by the watershed animal type. Although pathogen genes were not abundant in stream water or streambed samples, PCR on enrichments indicated that many genes were from viable organisms, including several (shiga-toxin producing or enterotoxigenic E. coli, Salmonella, vancomycin-resistant enterococci) that could potentially affect either human or animal health. Pathogen gene numbers and types in stream water samples were influenced most by animal type, by local factors such as whether animals had stream access, and by the amount of local rainfall, and not by studied watershed soil or physical characteristics. Our results indicated that stream water in small agricultural U.S. watersheds was susceptible to pathogen gene inputs under typical agricultural practices and environmental conditions. Pathogen gene profiles may offer the potential to address both source of, and risks associated with, fecal pollution. Published by Elsevier B.V.

  1. Bloodmeal Identification in Field-Collected Sand Flies From Casa Branca, Brazil, Using the Cytochrome b PCR Method.

    PubMed

    Carvalho, G M L; Rêgo, F D; Tanure, A; Silva, A C P; Dias, T A; Paz, G F; Andrade Filho, J D

    2017-07-01

    PCR-based identification of vertebrate host bloodmeals has been performed on several vector species with success. In the present study, we used a previously published PCR protocol followed by DNA sequencing based on primers designed from multiple alignments of the mitochondrial cytochrome b gene used to identify avian and mammalian hosts of various hematophagous vectors. The amplification of a fragment encoding a 359 bp sequence of the Cyt b gene yielded recognized amplification products in 192 female sand flies (53%), from a total of 362 females analyzed. In the study area of Casa Branca, Brazil, blood-engorged female sand flies such as Lutzomyia longipalpis (Lutz & Neiva, 1912), Migonemyia migonei (França, 1924), and Nyssomyia whitmani (Antunes & Coutinho, 1939) were analyzed for bloodmeal sources. The PCR-based method identified human, dog, chicken, and domestic rat blood sources. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  2. Development of a gene synthesis platform for the efficient large scale production of small genes encoding animal toxins.

    PubMed

    Sequeira, Ana Filipa; Brás, Joana L A; Guerreiro, Catarina I P D; Vincentelli, Renaud; Fontes, Carlos M G A

    2016-12-01

    Gene synthesis is becoming an important tool in many fields of recombinant DNA technology, including recombinant protein production. De novo gene synthesis is quickly replacing the classical cloning and mutagenesis procedures and allows generating nucleic acids for which no template is available. In addition, when coupled with efficient gene design algorithms that optimize codon usage, it leads to high levels of recombinant protein expression. Here, we describe the development of an optimized gene synthesis platform that was applied to the large scale production of small genes encoding venom peptides. This improved gene synthesis method uses a PCR-based protocol to assemble synthetic DNA from pools of overlapping oligonucleotides and was developed to synthesise multiple genes simultaneously. This technology incorporates an accurate, automated and cost effective ligation independent cloning step to directly integrate the synthetic genes into an effective Escherichia coli expression vector. The robustness of this technology to generate large libraries of dozens to thousands of synthetic nucleic acids was demonstrated through the parallel and simultaneous synthesis of 96 genes encoding animal toxins. An automated platform was developed for the large-scale synthesis of small genes encoding eukaryotic toxins. Large scale recombinant expression of synthetic genes encoding eukaryotic toxins will allow exploring the extraordinary potency and pharmacological diversity of animal venoms, an increasingly valuable but unexplored source of lead molecules for drug discovery.

  3. UNCLES: method for the identification of genes differentially consistently co-expressed in a specific subset of datasets.

    PubMed

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J; Nandi, Asoke K

    2015-06-04

    Collective analysis of the increasing number of emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. 
Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies.

  4. Harnessing the complexity of gene expression data from cancer: from single gene to structural pathway methods

    PubMed Central

    2012-01-01

    High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable to reveal systems-related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data, provide links to software implementations and tools, and also address the general problem of multiple hypotheses testing. Further, we provide recommendations for the selection of such analysis methods. Reviewers: This article was reviewed by Arcady Mushegian, Byung-Soo Kim and Joel Bader. PMID:23227854
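
    The abstract mentions multiple hypotheses testing without naming a procedure. As one widely used example, and only as an assumption about what such a correction can look like, the sketch below computes Benjamini-Hochberg adjusted p-values for a small list of per-gene p-values; the input values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical per-gene p-values from single-gene tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
for p_raw, p_adj in zip(raw, adj):
    print(f"raw p = {p_raw:.3f}, BH-adjusted p = {p_adj:.3f}")
```

    Genes whose adjusted p-value falls below the chosen false discovery rate (for example 0.05) would be reported as changed; other corrections, such as Bonferroni, trade power for stricter error control.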

  5. A deep transcriptomic resource for the copepod crustacean Labidocera madurae: A potential indicator species for assessing near shore ecosystem health

    PubMed Central

    Christie, Andrew E.; Sommer, Stephanie A.; Cieslak, Matthew C.; Hartline, Daniel K.; Lenz, Petra H.

    2017-01-01

    Coral reef ecosystems of many sub-tropical and tropical marine coastal environments have suffered significant degradation from anthropogenic sources. Research to inform management strategies that mitigate stressors and promote a healthy ecosystem has focused on the ecology and physiology of coral reefs and associated organisms. Few studies focus on the surrounding pelagic communities, which are equally important to ecosystem function. Zooplankton, often dominated by small crustaceans such as copepods, is an important food source for invertebrates and fishes, especially larval fishes. The reef-associated zooplankton includes a sub-neustonic copepod family that could serve as an indicator species for the community. Here, we describe the generation of a de novo transcriptome for one such copepod, Labidocera madurae, a pontellid from an intensively-studied coral reef ecosystem, Kāne‘ohe Bay, Oahu, Hawai‘i. The transcriptome was assembled using high-throughput sequence data obtained from whole organisms. It comprised 211,002 unique transcripts, including 72,391 with coding regions. It was assessed for quality and completeness using multiple workflows. Bench-marking-universal-single-copy-orthologs (BUSCO) analysis identified transcripts for 88% of expected eukaryotic core proteins. Targeted gene-discovery analyses included searches for transcripts coding full-length “giant” proteins (>4,000 amino acids), proteins and splice variants of voltage-gated sodium channels, and proteins involved in the circadian signaling pathway. Four different reference transcriptomes were generated and compared for the detection of differential gene expression between copepodites and adult females; 6,229 genes were consistently identified as differentially expressed between the two regardless of reference. Automated bioinformatics analyses and targeted manual gene curation suggest that the de novo assembled L. madurae transcriptome is of high quality and completeness. 
This transcriptome provides a new resource for assessing the global physiological status of a planktonic species inhabiting a coral reef ecosystem that is subjected to multiple anthropogenic stressors. The workflows provide a template for generating and assessing transcriptomes in other non-model species. PMID:29065152

  6. A deep transcriptomic resource for the copepod crustacean Labidocera madurae: A potential indicator species for assessing near shore ecosystem health.

    PubMed

    Roncalli, Vittoria; Christie, Andrew E; Sommer, Stephanie A; Cieslak, Matthew C; Hartline, Daniel K; Lenz, Petra H

    2017-01-01

    Coral reef ecosystems of many sub-tropical and tropical marine coastal environments have suffered significant degradation from anthropogenic sources. Research to inform management strategies that mitigate stressors and promote a healthy ecosystem has focused on the ecology and physiology of coral reefs and associated organisms. Few studies focus on the surrounding pelagic communities, which are equally important to ecosystem function. Zooplankton, often dominated by small crustaceans such as copepods, is an important food source for invertebrates and fishes, especially larval fishes. The reef-associated zooplankton includes a sub-neustonic copepod family that could serve as an indicator species for the community. Here, we describe the generation of a de novo transcriptome for one such copepod, Labidocera madurae, a pontellid from an intensively-studied coral reef ecosystem, Kāne'ohe Bay, Oahu, Hawai'i. The transcriptome was assembled using high-throughput sequence data obtained from whole organisms. It comprised 211,002 unique transcripts, including 72,391 with coding regions. It was assessed for quality and completeness using multiple workflows. Bench-marking-universal-single-copy-orthologs (BUSCO) analysis identified transcripts for 88% of expected eukaryotic core proteins. Targeted gene-discovery analyses included searches for transcripts coding full-length "giant" proteins (>4,000 amino acids), proteins and splice variants of voltage-gated sodium channels, and proteins involved in the circadian signaling pathway. Four different reference transcriptomes were generated and compared for the detection of differential gene expression between copepodites and adult females; 6,229 genes were consistently identified as differentially expressed between the two regardless of reference. Automated bioinformatics analyses and targeted manual gene curation suggest that the de novo assembled L. madurae transcriptome is of high quality and completeness. 
This transcriptome provides a new resource for assessing the global physiological status of a planktonic species inhabiting a coral reef ecosystem that is subjected to multiple anthropogenic stressors. The workflows provide a template for generating and assessing transcriptomes in other non-model species.

  7. Validation of reference genes for quantitative expression analysis by real-time RT-PCR in Saccharomyces cerevisiae

    PubMed Central

    Teste, Marie-Ange; Duquenne, Manon; François, Jean M; Parrou, Jean-Luc

    2009-01-01

    Background Real-time RT-PCR is the recommended method for quantitative gene expression analysis. A compulsory step is the selection of good reference genes for normalization. A few genes often referred to as HouseKeeping Genes (HSK), such as ACT1, RDN18 or PDA1 are among the most commonly used, as their expression is assumed to remain unchanged over a wide range of conditions. Since this assumption is very unlikely, a geometric averaging of multiple, carefully selected internal control genes is now strongly recommended for normalization to avoid this problem of expression variation of single reference genes. The aim of this work was to search for a set of reference genes for reliable gene expression analysis in Saccharomyces cerevisiae. Results From public microarray datasets, we selected potential reference genes whose expression remained apparently invariable during long-term growth on glucose. Using the algorithm geNorm, ALG9, TAF10, TFC1 and UBC6 turned out to be genes whose expression remained stable, independent of the growth conditions and the strain backgrounds tested in this study. We then showed that the geometric averaging of any subset of three genes among the six most stable genes resulted in very similar normalized data, which contrasted with inconsistent results among various biological samples when the normalization was performed with ACT1. Normalization with multiple selected genes was therefore applied to transcriptional analysis of genes involved in glycogen metabolism. We determined an induction ratio of 100-fold for GPH1 and 20-fold for GSY2 between the exponential phase and the diauxic shift on glucose. There was no induction of these two genes at this transition phase on galactose, although in both cases, the kinetics of glycogen accumulation was similar. In contrast, SGA1 expression was independent of the carbon source and increased by 3-fold in stationary phase. 
    Conclusion In this work, we provided a set of genes that are suitable reference genes for quantitative gene expression analysis by real-time RT-PCR in yeast biological samples covering a large panel of physiological states. In contrast, we invalidated and discourage the use of ACT1 as well as other commonly used reference genes (PDA1, TDH3, RDN18, etc) as internal controls for quantitative gene expression analysis in yeast. PMID:19874630

  8. Validation of reference genes for quantitative expression analysis by real-time RT-PCR in Saccharomyces cerevisiae.

    PubMed

    Teste, Marie-Ange; Duquenne, Manon; François, Jean M; Parrou, Jean-Luc

    2009-10-30

    Real-time RT-PCR is the recommended method for quantitative gene expression analysis. A compulsory step is the selection of good reference genes for normalization. A few genes often referred to as HouseKeeping Genes (HSK), such as ACT1, RDN18 or PDA1 are among the most commonly used, as their expression is assumed to remain unchanged over a wide range of conditions. Since this assumption is very unlikely, a geometric averaging of multiple, carefully selected internal control genes is now strongly recommended for normalization to avoid this problem of expression variation of single reference genes. The aim of this work was to search for a set of reference genes for reliable gene expression analysis in Saccharomyces cerevisiae. From public microarray datasets, we selected potential reference genes whose expression remained apparently invariable during long-term growth on glucose. Using the algorithm geNorm, ALG9, TAF10, TFC1 and UBC6 turned out to be genes whose expression remained stable, independent of the growth conditions and the strain backgrounds tested in this study. We then showed that the geometric averaging of any subset of three genes among the six most stable genes resulted in very similar normalized data, which contrasted with inconsistent results among various biological samples when the normalization was performed with ACT1. Normalization with multiple selected genes was therefore applied to transcriptional analysis of genes involved in glycogen metabolism. We determined an induction ratio of 100-fold for GPH1 and 20-fold for GSY2 between the exponential phase and the diauxic shift on glucose. There was no induction of these two genes at this transition phase on galactose, although in both cases, the kinetics of glycogen accumulation was similar. In contrast, SGA1 expression was independent of the carbon source and increased by 3-fold in stationary phase. 
    In this work, we provided a set of genes that are suitable reference genes for quantitative gene expression analysis by real-time RT-PCR in yeast biological samples covering a large panel of physiological states. In contrast, we invalidated and discourage the use of ACT1 as well as other commonly used reference genes (PDA1, TDH3, RDN18, etc) as internal controls for quantitative gene expression analysis in yeast.
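
    The geometric averaging of several reference genes described in this record can be sketched as follows. This is a minimal illustration, not the geNorm algorithm itself: it assumes perfect amplification efficiency (relative quantity = 2^(-Cq)), the function names are hypothetical, and the Cq values are invented for the example rather than taken from the study.

```python
import math


def relative_quantity(cq, efficiency=2.0):
    """Convert a quantification cycle (Cq) to a relative quantity.

    Assumption: constant amplification efficiency (2.0 = doubling per cycle).
    """
    return efficiency ** (-cq)


def normalization_factor(reference_cqs, efficiency=2.0):
    """Geometric mean of the relative quantities of the reference genes."""
    logs = [math.log(relative_quantity(cq, efficiency)) for cq in reference_cqs]
    return math.exp(sum(logs) / len(logs))


def normalized_expression(target_cq, reference_cqs, efficiency=2.0):
    """Target quantity divided by the multi-gene normalization factor."""
    target = relative_quantity(target_cq, efficiency)
    return target / normalization_factor(reference_cqs, efficiency)


# Invented Cq values for one target gene and three reference genes
# (for instance ALG9, TAF10 and TFC1) in two growth phases.
exponential = normalized_expression(26.0, [24.0, 25.0, 26.0])
diauxic = normalized_expression(21.0, [25.0, 26.0, 27.0])

induction_ratio = diauxic / exponential
print(f"induction ratio: {induction_ratio:.1f}")
```

    Using the geometric rather than the arithmetic mean keeps one highly expressed reference gene from dominating the normalization factor, which is the usual argument for averaging on the log scale.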

  9. CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.

    PubMed

    Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H

    2010-07-06

    The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/

  10. Passing messages between biological networks to refine predicted interactions.

    PubMed

    Glass, Kimberly; Huttenhower, Curtis; Quackenbush, John; Yuan, Guo-Cheng

    2013-01-01

    Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net.

  11. Interbreeding among deeply divergent mitochondrial lineages in the American cockroach (Periplaneta americana)

    NASA Astrophysics Data System (ADS)

    von Beeren, Christoph; Stoeckle, Mark Y.; Xia, Joyce; Burke, Griffin; Kronauer, Daniel J. C.

    2015-02-01

    DNA barcoding promises to be a useful tool to identify pest species assuming adequate representation of genetic variants in a reference library. Here we examined mitochondrial DNA barcodes in a global urban pest, the American cockroach (Periplaneta americana). Our sampling effort generated 284 cockroach specimens, most from New York City, plus 15 additional U.S. states and six other countries, enabling the first large-scale survey of P. americana barcode variation. Periplaneta americana barcode sequences (n = 247, including 24 GenBank records) formed a monophyletic lineage separate from other Periplaneta species. We found three distinct P. americana haplogroups with relatively small differences within (≤0.6%) and larger differences among groups (2.4%-4.7%). This could be interpreted as indicative of multiple cryptic species. However, nuclear DNA sequences (n = 77 specimens) revealed extensive gene flow among mitochondrial haplogroups, confirming a single species. This unusual genetic pattern likely reflects multiple introductions from genetically divergent source populations, followed by interbreeding in the invasive range. Our findings highlight the need for comprehensive reference databases in DNA barcoding studies, especially when dealing with invasive populations that might be derived from multiple genetically distinct source populations.
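
    The within-group and among-group differences quoted in this record are pairwise sequence distances. The sketch below shows one common way to compute them, the uncorrected p-distance on aligned sequences. It is an assumption that this matches the distance measure used in the study; the toy 20-base sequences and group labels are invented and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def summarize(groups):
    """Return (max within-group, min among-group, max among-group) distance."""
    labelled = [(name, seq) for name, seqs in groups.items() for seq in seqs]
    within, among = [], []
    for (group_1, seq_1), (group_2, seq_2) in combinations(labelled, 2):
        target = within if group_1 == group_2 else among
        target.append(p_distance(seq_1, seq_2))
    return max(within), min(among), max(among)


# Invented toy haplogroups (real barcodes are several hundred bases long).
haplogroups = {
    "A": ["ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGA"],
    "B": ["ACGTTCGTACGAACGTACTT", "ACGTTCGTACGAACGTACTA"],
}

within_max, among_min, among_max = summarize(haplogroups)
print(f"within <= {within_max:.2%}, among {among_min:.2%} to {among_max:.2%}")
```

    A clear gap between the largest within-group distance and the smallest among-group distance is what suggests distinct haplogroups; as the record notes, such a gap in mitochondrial data alone does not establish separate species.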

  12. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development.

    PubMed

    Ozerov, Ivan V; Lezhnina, Ksenia V; Izumchenko, Evgeny; Artemov, Artem V; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N; Labat, Ivan; West, Michael D; Buzdin, Anton; Cantor, Charles R; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-11-16

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy.

  13. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development

    PubMed Central

    Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-01-01

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968

  14. Finding approximate gene clusters with Gecko 3.

    PubMed

    Winter, Sascha; Jahn, Katharina; Wehner, Stefanie; Kuchenbecker, Leon; Marz, Manja; Stoye, Jens; Böcker, Sebastian

    2016-11-16

    Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Development of a Knowledgebase (MetRxn) of Metabolites, Reactions and Atom Mappings to Accelerate Discovery and Redesign

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maranas, Costas D.

    With advances in DNA sequencing and genome annotation techniques, the breadth of metabolic knowledge across all kingdoms of life is increasing. The construction of genome-scale models (GSMs) facilitates this distillation of knowledge by systematically accounting for reaction stoichiometry and directionality, gene to protein to reaction relationships, reaction localization among cellular organelles, metabolite transport costs and routes, transcriptional regulation, and biomass composition. Genome-scale reconstructions available now span across all kingdoms of life, from microbes to whole-plant models, and have become indispensable for driving informed metabolic designs and interventions. A key barrier to the pace of this development is our inability to utilize metabolite/reaction information from databases such as BRENDA [1], KEGG [2], MetaCyc [3], etc. due to incompatibilities of representation, duplications, and errors. Duplicate entries constitute a major impediment, where the same metabolite is found with multiple names across databases and models, which significantly slows down the collating of information from multiple data sources. This can also lead to serious modeling errors such as charge/mass imbalances [4,5] which can thwart model predictive abilities such as identifying synthetic lethal gene pairs and quantifying metabolic flows. Hence, we created the MetRxn database [6] that takes the next step in integrating data from multiple sources and formats to automatically create a standardized knowledgebase. We subsequently deployed this resource to bring about new paradigms in genome-scale metabolic model reconstruction, metabolic flux elucidation through MFA, modeling of microbial communities, and pathway prospecting. This research has enabled the PI’s group to continue building upon research milestones and reach new ones (see list of MetRxn-related publications below).
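
    The charge/mass imbalances mentioned in this record can be detected with a simple bookkeeping check. The sketch below is a hypothetical illustration, not the MetRxn procedure: it assumes each metabolite carries a plain elemental formula (no brackets or hydrates) and a net charge, and the formulas and charges given for the hexokinase reaction are the commonly used protonation states near neutral pH, entered here as an assumption.

```python
import re
from collections import Counter, defaultdict


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(reaction, metabolites):
    """Net element and charge totals (products minus reactants).

    reaction:    metabolite name -> stoichiometric coefficient
                 (negative for reactants, positive for products)
    metabolites: metabolite name -> (formula, charge)
    An empty result means the reaction is balanced.
    """
    totals = defaultdict(int)
    for name, coefficient in reaction.items():
        formula, charge = metabolites[name]
        for element, count in parse_formula(formula).items():
            totals[element] += coefficient * count
        totals["charge"] += coefficient * charge
    return {key: value for key, value in totals.items() if value != 0}


# Assumed formulas and charges (illustrative).
metabolites = {
    "glucose": ("C6H12O6", 0),
    "ATP": ("C10H12N5O13P3", -4),
    "G6P": ("C6H11O9P", -2),
    "ADP": ("C10H12N5O10P2", -3),
    "H+": ("H", 1),
}

hexokinase = {"glucose": -1, "ATP": -1, "G6P": 1, "ADP": 1, "H+": 1}
missing_proton = {"glucose": -1, "ATP": -1, "G6P": 1, "ADP": 1}

print(imbalance(hexokinase, metabolites))
print(imbalance(missing_proton, metabolites))
```

    The second reaction omits the proton, which is a typical result of merging entries that use different protonation states for the same metabolite; the check reports one missing hydrogen and one missing unit of charge on the product side.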

  16. Nitrogen assimilation in denitrifier Bacillus azotoformans LMG 9581T.

    PubMed

    Sun, Yihua; De Vos, Paul; Willems, Anne

    2017-12-01

    Until recently, it has not been generally known that some bacteria can contain the gene inventory for both denitrification and dissimilatory nitrate (NO₃⁻)/nitrite (NO₂⁻) reduction to ammonium (NH₄⁺) (DNRA). Detailed studies of these microorganisms could shed light on the differentiating environmental drivers of both processes without interference of organism-specific variation. Genome analysis of Bacillus azotoformans LMG 9581T shows a remarkable redundancy of dissimilatory nitrogen reduction, with multiple copies of each denitrification gene as well as DNRA genes nrfAH, but a reduced capacity for nitrogen assimilation, with no nas operon nor amtB gene. Here, we explored nitrogen assimilation in detail using growth experiments in media with different organic and inorganic nitrogen sources at different concentrations. Monitoring of growth, NO₃⁻, NO₂⁻, NH₄⁺ concentration and N₂O production revealed that B. azotoformans LMG 9581T could not grow with NH₄⁺ as sole nitrogen source and confirmed the hypothesis of reduced nitrogen assimilation pathways. However, NH₄⁺ could be assimilated and contributed up to 50% of biomass if yeast extract was also provided. NH₄⁺ also had a significant but concentration-dependent influence on growth rate. The mechanisms behind these observations remain to be resolved but hypotheses for this deficiency in nitrogen assimilation are discussed. In addition, in all growth conditions tested a denitrification phenotype was observed, with all supplied NO₃⁻ converted to nitrous oxide (N₂O).

  17. Gene regulation and noise reduction by coupling of stochastic processes

    NASA Astrophysics Data System (ADS)

    Ramos, Alexandre F.; Hornos, José Eduardo M.; Reinitz, John

    2015-02-01

    Here we characterize the low-noise regime of a stochastic model for a negative self-regulating binary gene. The model has two stochastic variables, the protein number and the state of the gene. Each state of the gene behaves as a protein source governed by a Poisson process. The coupling between the two gene states depends on protein number. This fact has a very important implication: There exist protein production regimes characterized by sub-Poissonian noise because of negative covariance between the two stochastic variables of the model. Hence the protein numbers obey a probability distribution that has a peak that is sharper than those of the two coupled Poisson processes that are combined to produce it. Biochemically, the noise reduction in protein number occurs when the switching of the genetic state is more rapid than protein synthesis or degradation. We consider the chemical reaction rates necessary for Poisson and sub-Poisson processes in prokaryotes and eucaryotes. Our results suggest that the coupling of multiple stochastic processes in a negative covariance regime might be a widespread mechanism for noise reduction.
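
    The sub-Poissonian regime described in this record can be illustrated with a small stochastic simulation. The sketch below is not the authors' analytical model or parameter set: it assumes a gene that synthesizes protein only in its ON state, is switched OFF at a rate proportional to protein number and switches back ON at a constant rate, with all rate values chosen arbitrarily so that switching is much faster than synthesis and degradation. Noise is summarized by the Fano factor (variance divided by mean), which equals 1 for a Poisson distribution.

```python
import random


def simulate(k=200.0, rho=1.0, f=1000.0, h=100.0,
             n_events=400_000, burn_in=20_000, seed=1):
    """Gillespie simulation of a self-repressing two-state gene.

    k   : synthesis rate while the gene is ON (no synthesis when OFF)
    rho : degradation rate per protein molecule
    h   : ON -> OFF switching rate per protein molecule (negative feedback)
    f   : OFF -> ON switching rate
    Returns the time-weighted mean protein number and its Fano factor.
    """
    rng = random.Random(seed)
    n = 0            # protein number
    gene_on = True   # state of the gene
    total_time = sum_n = sum_n2 = 0.0

    for event in range(n_events):
        synthesis = k if gene_on else 0.0
        decay = rho * n
        switching = h * n if gene_on else f
        total_rate = synthesis + decay + switching

        # Waiting time in the current state, used as a weight for the moments.
        dt = rng.expovariate(total_rate)
        if event >= burn_in:
            total_time += dt
            sum_n += n * dt
            sum_n2 += n * n * dt

        # Choose which reaction fires, in proportion to its rate.
        r = rng.random() * total_rate
        if r < synthesis:
            n += 1
        elif r < synthesis + decay:
            n -= 1
        else:
            gene_on = not gene_on

    mean = sum_n / total_time
    variance = sum_n2 / total_time - mean ** 2
    return mean, variance / mean


mean, fano = simulate()
print(f"mean protein number: {mean:.1f}, Fano factor: {fano:.2f}")
```

    With these assumed rates the Fano factor comes out below 1, meaning the protein distribution is narrower than a Poisson distribution with the same mean. Slowing the switching rates (smaller f and h at a fixed ratio) adds switching noise and pushes the Fano factor back up.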

  18. Gene regulation and noise reduction by coupling of stochastic processes

    PubMed Central

    Ramos, Alexandre F.; Hornos, José Eduardo M.; Reinitz, John

    2015-01-01

    Here we characterize the low noise regime of a stochastic model for a negative self-regulating binary gene. The model has two stochastic variables, the protein number and the state of the gene. Each state of the gene behaves as a protein source governed by a Poisson process. The coupling between the two gene states depends on protein number. This fact has a very important implication: there exist protein production regimes characterized by sub-Poissonian noise because of negative covariance between the two stochastic variables of the model. Hence the protein numbers obey a probability distribution that has a peak that is sharper than those of the two coupled Poisson processes that are combined to produce it. Biochemically, the noise reduction in protein number occurs when the switching of genetic state is more rapid than protein synthesis or degradation. We consider the chemical reaction rates necessary for Poisson and sub-Poisson processes in prokaryotes and eucaryotes. Our results suggest that the coupling of multiple stochastic processes in a negative covariance regime might be a widespread mechanism for noise reduction. PMID:25768447

  19. Gene regulation and noise reduction by coupling of stochastic processes.

    PubMed

    Ramos, Alexandre F; Hornos, José Eduardo M; Reinitz, John

    2015-02-01

    Here we characterize the low-noise regime of a stochastic model for a negative self-regulating binary gene. The model has two stochastic variables, the protein number and the state of the gene. Each state of the gene behaves as a protein source governed by a Poisson process. The coupling between the two gene states depends on protein number. This fact has a very important implication: There exist protein production regimes characterized by sub-Poissonian noise because of negative covariance between the two stochastic variables of the model. Hence the protein numbers obey a probability distribution that has a peak that is sharper than those of the two coupled Poisson processes that are combined to produce it. Biochemically, the noise reduction in protein number occurs when the switching of the genetic state is more rapid than protein synthesis or degradation. We consider the chemical reaction rates necessary for Poisson and sub-Poisson processes in prokaryotes and eucaryotes. Our results suggest that the coupling of multiple stochastic processes in a negative covariance regime might be a widespread mechanism for noise reduction.

  20. Synthetic Gene Network with Positive Feedback Loop Amplifies Cellulase Gene Expression in Neurospora crassa.

    PubMed

    Matsu-Ura, Toru; Dovzhenok, Andrey A; Coradetti, Samuel T; Subramanian, Krithika R; Meyer, Daniel R; Kwon, Jaesang J; Kim, Caleb; Salomonis, Nathan; Glass, N Louise; Lim, Sookkyung; Hong, Christian I

    2018-05-18

    Second-generation or lignocellulosic biofuels are a tangible source of renewable energy, which is critical to combat climate change by reducing the carbon footprint. Filamentous fungi secrete cellulose-degrading enzymes called cellulases, which are used for production of lignocellulosic biofuels. However, inefficient production of cellulases is a major obstacle for industrial-scale production of second-generation biofuels. We used computational simulations to design and implement synthetic positive feedback loops to increase gene expression of a key transcription factor, CLR-2, that activates a large number of cellulases in a filamentous fungus, Neurospora crassa. Overexpression of CLR-2 reveals previously unappreciated roles of CLR-2 in lignocellulosic gene network, which enabled simultaneous induction of approximately 50% of 78 lignocellulosic degradation-related genes in our engineered Neurospora strains. This engineering results in dramatically increased cellulase activity due to cooperative orchestration of multiple enzymes involved in the cellulose degradation pathway. Our work provides a proof of principle in utilizing mathematical modeling and synthetic biology to improve the efficiency of cellulase synthesis for second-generation biofuel production.

  1. Dose-sensitivity, conserved non-coding sequences, and duplicate gene retention through multiple tetraploidies in the grasses.

    PubMed

    Schnable, James C; Pedersen, Brent S; Subramaniam, Sabarinath; Freeling, Michael

    2011-01-01

    Whole genome duplications, or tetraploidies, are an important source of increased gene content. Following whole genome duplication, duplicate copies of many genes are lost from the genome. This loss of genes is biased both in the classes of genes deleted and the subgenome from which they are lost. Many or all classes of genes preferentially retained as duplicate copies are engaged in dose sensitive protein-protein interactions, such that deletion of any one duplicate upsets the status quo of subunit concentrations, and presumably lowers fitness as a result. Transcription factors are also preferentially retained following every whole genome duplication studied. This has been explained as a consequence of protein-protein interactions, just as for other highly retained classes of genes. We show that the quantity of conserved noncoding sequences (CNSs) associated with genes predicts the likelihood of their retention as duplicate pairs following whole genome duplication. As many CNSs likely represent binding sites for transcriptional regulators, we propose that the likelihood of gene retention following tetraploidy may also be influenced by dose-sensitive protein-DNA interactions between the regulatory regions of CNS-rich genes - nicknamed bigfoot genes - and the proteins that bind to them. Using grass genomes, we show that differential loss of CNSs from one member of a pair following the pre-grass tetraploidy reduces its chance of retention in the subsequent maize lineage tetraploidy.

  2. Dose–Sensitivity, Conserved Non-Coding Sequences, and Duplicate Gene Retention Through Multiple Tetraploidies in the Grasses

    PubMed Central

    Schnable, James C.; Pedersen, Brent S.; Subramaniam, Sabarinath; Freeling, Michael

    2011-01-01

    Whole genome duplications, or tetraploidies, are an important source of increased gene content. Following whole genome duplication, duplicate copies of many genes are lost from the genome. This loss of genes is biased both in the classes of genes deleted and the subgenome from which they are lost. Many or all classes of genes preferentially retained as duplicate copies are engaged in dose sensitive protein–protein interactions, such that deletion of any one duplicate upsets the status quo of subunit concentrations, and presumably lowers fitness as a result. Transcription factors are also preferentially retained following every whole genome duplication studied. This has been explained as a consequence of protein–protein interactions, just as for other highly retained classes of genes. We show that the quantity of conserved noncoding sequences (CNSs) associated with genes predicts the likelihood of their retention as duplicate pairs following whole genome duplication. As many CNSs likely represent binding sites for transcriptional regulators, we propose that the likelihood of gene retention following tetraploidy may also be influenced by dose–sensitive protein–DNA interactions between the regulatory regions of CNS-rich genes – nicknamed bigfoot genes – and the proteins that bind to them. Using grass genomes, we show that differential loss of CNSs from one member of a pair following the pre-grass tetraploidy reduces its chance of retention in the subsequent maize lineage tetraploidy. PMID:22645525

  3. Seminal SIV in chronically-infected cynomolgus macaques is dominated by virus originating from multiple genital organs.

    PubMed

    Houzet, Laurent; Pérez-Losada, Marcos; Matusali, Giulia; Deleage, Claire; Dereuddre-Bosquet, Nathalie; Satie, Anne-Pascale; Aubry, Florence; Becker, Emmanuelle; Jégou, Bernard; Le Grand, Roger; Keele, Brandon F; Crandall, Keith A; Dejucq-Rainsford, Nathalie

    2018-05-02

    The sexual transmission of viruses is responsible for the spread of multiple infectious diseases. Although the HIV/AIDS pandemic remains fueled by sexual contacts with infected semen, the origin of virus in semen is still unknown. In a substantial number of HIV-infected men, viral strains present in semen differ from the ones in blood, suggesting that HIV is locally produced within the genital tract. Such local production may be responsible for the persistence of HIV in semen despite effective antiretroviral therapy. Here we use single genome amplification, amplicon sequencing (env gene) and phylogenetic analyses to compare the genetic structure of SIV populations across all the male genital organs and blood in intravenously inoculated cynomolgus macaques in the chronic stage of infection. Examination of the virus populations present in the male genital tissues of the macaques revealed compartmentalized SIV populations in testis, epididymis, vas deferens, seminal vesicles and urethra. We found genetic similarities between the viral strains present in semen and those in epididymis, vas deferens and seminal vesicles. The contribution of male genital organs to virus shedding in semen varied among individuals and could not be predicted based on their infection or pro-inflammatory cytokine mRNA levels. These data indicate that, rather than a single source, multiple genital organs are involved in the release of free virus and infected cells into semen. These findings have important implications for our understanding of systemic virus shedding and persistence in semen and for the design of eradication strategies to access viral reservoirs.
    IMPORTANCE Semen is instrumental for the dissemination of viruses through sexual contacts. Worryingly, a number of systemic viruses such as HIV can persist in this body fluid in the absence of viremia. The local source(s) of virus in semen, however, remain unknown. To elucidate the anatomic origin(s) of the virus released in semen, we compared viral populations present in semen with those in the male genital organs and blood of the Asian macaque model, using single genome amplification, amplicon sequencing (env gene) and phylogenetic analysis. Our results show that multiple genital tissues harbor compartmentalized strains, some of them (i.e. epididymis, vas deferens and seminal vesicle) displaying genetic similarities with the viral populations present in semen. This study is the first to uncover local genital sources of viral populations in semen, providing a new basis for innovative targeted strategies to prevent and eradicate HIV in the male genital tract. Copyright © 2018 Houzet et al.

  4. The role of interindividual variation in human carcinogenesis.

    PubMed

    Lai, C; Shields, P G

    1999-02-01

    The process of chemical carcinogenesis is a complex multistage process initiated by DNA damage in growth control genes. Carcinogens enter the body from a variety of sources, but most require metabolic activation before they can damage DNA. There are multiple protective processes that include detoxification and conjugation, DNA repair and programmed cell death. Most of these functions exhibit wide interindividual variation in the population and thus are thought to affect cancer risk. The role of gene-environment interactions is being explored, and current data indicate that genetic susceptibilities can modify carcinogen exposures from the diet and tobacco smoking, although much more data exist for the latter. This review addresses the relationships of human carcinogenesis to these interindividual differences of phase I, phase II and DNA repair enzymes.

  5. Semantic Web Ontology and Data Integration: a Case Study in Aiding Psychiatric Drug Repurposing.

    PubMed

    Liang, Chen; Sun, Jingchun; Tao, Cui

    2015-01-01

    There remain significant difficulties selecting probable candidate drugs from existing databases. We describe an ontology-oriented approach to represent the nexus between genes, drugs, phenotypes, symptoms, and diseases from multiple information sources. We also report a case study in which we attempted to explore candidate drugs effective for bipolar disorder and epilepsy. We constructed an ontology incorporating knowledge between the two diseases and performed semantic reasoning tasks with the ontology. The results suggested 48 candidate drugs that hold promise for further breakthrough. The evaluation demonstrated the validity of our approach. Our approach prioritizes the candidate drugs that have potential associations among genes, phenotypes and symptoms, and thus facilitates the data integration and drug repurposing in psychiatric disorders.

  6. A genetics-based approach confirms immune associations with life history across multiple populations of an aquatic vertebrate (Gasterosteus aculeatus).

    PubMed

    Whiting, James R; Magalhaes, Isabel S; Singkam, Abdul R; Robertson, Shaun; D'Agostino, Daniele; Bradley, Janette E; MacColl, Andrew D C

    2018-06-20

    Understanding how wild immune variation covaries with other traits can reveal how costs and trade-offs shape immune evolution in the wild. Divergent life history strategies may increase or alleviate immune costs, helping shape immune variation in a consistent, testable way. Contrasting hypotheses suggest that shorter life histories may alleviate costs by offsetting them against increased mortality; or increase the effect of costs if immune responses are traded off against development or reproduction. We investigated the evolutionary relationship between life history and immune responses within an island radiation of three-spined stickleback, with discrete populations of varying life histories and parasitism. We sampled two short-lived, two long-lived and an anadromous population using qPCR to quantify current immune profile and RAD-seq data to study the distribution of immune variants within our assay genes and across the genome. Short-lived populations exhibited significantly increased expression of all assay genes, which was accompanied by a strong association with population-level variation in local alleles and divergence in a gene that may be involved in complement pathways. In addition, divergence around the eda gene in anadromous fish is likely associated with increased inflammation. A wider analysis of 15 populations across the island revealed that immune genes across the genome show evidence of having diverged alongside life history strategies. Parasitism and reproductive investment were also important sources of variation for expression, highlighting the caution required when assaying immune responses in the wild. These results provide strong, gene-based support for current hypotheses linking life history and immune variation across multiple populations of a vertebrate model. This article is protected by copyright. All rights reserved.

  7. Toolbox Approaches Using Molecular Markers and 16S rRNA Gene Amplicon Data Sets for Identification of Fecal Pollution in Surface Water

    PubMed Central

    Staley, C.; Sadowsky, M. J.; Gyawali, P.; Sidhu, J. P. S.; Palmer, A.; Beale, D. J.; Toze, S.

    2015-01-01

    In this study, host-associated molecular markers and bacterial 16S rRNA gene community analysis using high-throughput sequencing were used to identify the sources of fecal pollution in environmental waters in Brisbane, Australia. A total of 92 fecal and composite wastewater samples were collected from different host groups (cat, cattle, dog, horse, human, and kangaroo), and 18 water samples were collected from six sites (BR1 to BR6) along the Brisbane River in Queensland, Australia. Bacterial communities in the fecal, wastewater, and river water samples were sequenced. Water samples were also tested for the presence of bird-associated (GFD), cattle-associated (CowM3), horse-associated, and human-associated (HF183) molecular markers, to provide multiple lines of evidence regarding the possible presence of fecal pollution associated with specific hosts. Among the 18 water samples tested, 83%, 33%, 17%, and 17% were real-time PCR positive for the GFD, HF183, CowM3, and horse markers, respectively. Among the potential sources of fecal pollution in water samples from the river, DNA sequencing tended to show relatively small contributions from wastewater treatment plants (up to 13% of sequence reads). Contributions from other animal sources were rarely detected and were very small (<3% of sequence reads). Source contributions determined via sequence analysis versus detection of molecular markers showed variable agreement. A lack of relationships among fecal indicator bacteria, host-associated molecular markers, and 16S rRNA gene community analysis data was also observed. Nonetheless, we show that bacterial community and host-associated molecular marker analyses can be combined to identify potential sources of fecal pollution in an urban river. 
    This study is a proof of concept, and based on the results, we recommend using bacterial community analysis (where possible) along with PCR detection or quantification of host-associated molecular markers to provide information on the sources of fecal pollution in waterways. PMID:26231650

  8. Tumor-derived exosomes regulate expression of immune function-related genes in human T cell subsets.

    PubMed

    Muller, Laurent; Mitsuhashi, Masato; Simms, Patricia; Gooding, William E; Whiteside, Theresa L

    2016-02-04

    Tumor cell-derived exosomes (TEX) suppress functions of immune cells. Here, changes in the gene profiles of primary human T lymphocytes exposed in vitro to exosomes were evaluated. CD4(+) Tconv, CD8(+) T or CD4(+) CD39(+) Treg were isolated from normal donors' peripheral blood and co-incubated with TEX or exosomes isolated from supernatants of cultured dendritic cells (DEX). Expression levels of 24-27 immune response-related genes in these T cells were quantified by qRT-PCR. In activated T cells, TEX and DEX up-regulated mRNA expression levels of multiple genes. Multifactorial data analysis of ΔCt values identified T cell activation and the immune cell type, but not exosome source, as factors regulating gene expression by exosomes. Treg were more sensitive to TEX-mediated effects than other T cell subsets. In Treg, TEX-mediated down-regulation of genes regulating the adenosine pathway translated into high expression of CD39 and increased adenosine production. TEX also induced up-regulation of inhibitory genes in CD4(+) Tconv, which translated into a loss of CD69 on their surface and a functional decline. Exosomes are not internalized by T cells, but signals they carry and deliver to cell surface receptors modulate gene expression and functions of human T lymphocytes.

  9. Theory and methodology for utilizing genes as biomarkers to determine potential biological mixtures.

    PubMed

    Shrestha, Sadeep; Smith, Michael W; Beaty, Terri H; Strathdee, Steffanie A

    2005-01-01

    Genetically determined mixture information can be used as a surrogate for physical or behavioral characteristics in epidemiological studies examining research questions related to socially stigmatized behaviors and horizontally transmitted infections. A new measure, the probability of mixture discrimination (PMD), was developed to aid mixture analysis that estimates the ability to differentiate single from multiple genomes in biological mixtures. Four autosomal short tandem repeats (STRs) were identified, genotyped and evaluated in African American, European American, Hispanic, and Chinese individuals to estimate PMD. Theoretical PMD frameworks were also developed for autosomal and sex-linked (X and Y) STR markers in potential male/male, male/female and female/female mixtures. Autosomal STRs genetically determine the presence of multiple genomes in mixture samples of unknown genders with more power than the apparently simpler X and Y chromosome STRs. Evaluation of four autosomal STR loci enables the detection of mixtures of DNA from multiple sources with above 99% probability in all four racial/ethnic populations. The genetic-based approach has applications in epidemiology that provide viable alternatives to survey-based study designs. The analysis of genes as biomarkers can be used as a gold standard for validating measurements from self-reported behaviors that tend to be sensitive or socially stigmatizing, such as those involving sex and drugs.
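
    The abstract above does not give the PMD formula, so the following is only a simplified stand-in for the idea behind it: a single diploid genome shows at most two alleles at an autosomal STR locus, so three or more distinct alleles at any locus reveal a mixture. The sketch assumes two unrelated contributors, Hardy-Weinberg proportions, and independent loci. The function names and the allele frequencies are hypothetical and are not taken from the study.

```python
from itertools import product


def prob_three_plus_alleles(freqs):
    """P(a two-person mixture shows >= 3 distinct alleles at one locus).

    Assumption: two unrelated diploid contributors, i.e. four alleles drawn
    independently from the population allele frequencies in `freqs`.
    """
    total = 0.0
    for draw in product(range(len(freqs)), repeat=4):
        if len(set(draw)) >= 3:
            p = 1.0
            for allele in draw:
                p *= freqs[allele]
            total += p
    return total


def prob_mixture_detected(loci):
    """P(at least one locus shows >= 3 alleles), assuming independent loci."""
    p_missed = 1.0
    for freqs in loci:
        p_missed *= 1.0 - prob_three_plus_alleles(freqs)
    return 1.0 - p_missed


# Hypothetical loci with 4 and 8 equally frequent alleles (illustrative only).
example_loci = [[0.25] * 4, [0.125] * 8]
print(prob_three_plus_alleles(example_loci[0]))  # 0.65625
print(prob_mixture_detected(example_loci))
```

    Under these assumptions, adding loci or using loci with more alleles raises the detection probability, which is consistent with the abstract's point that a small panel of polymorphic autosomal STRs can be highly informative.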

  10. The Core and Accessory Genomes of Burkholderia pseudomallei: Implications for Human Melioidosis

    PubMed Central

    Lin, Chi Ho; Karuturi, R. Krishna M.; Wuthiekanun, Vanaporn; Tuanyok, Apichai; Chua, Hui Hoon; Ong, Catherine; Paramalingam, Sivalingam Suppiah; Tan, Gladys; Tang, Lynn; Lau, Gary; Ooi, Eng Eong; Woods, Donald; Feil, Edward; Peacock, Sharon J.; Tan, Patrick

    2008-01-01

    Natural isolates of Burkholderia pseudomallei (Bp), the causative agent of melioidosis, can exhibit significant ecological flexibility that is likely reflective of a dynamic genome. Using whole-genome Bp microarrays, we examined patterns of gene presence and absence across 94 South East Asian strains isolated from a variety of clinical, environmental, or animal sources. 86% of the Bp K96243 reference genome was common to all the strains, representing the Bp “core genome”, comprising genes largely involved in essential functions (e.g., amino acid metabolism, protein translation). In contrast, 14% of the K96243 genome was variably present across the isolates. This Bp accessory genome encompassed multiple genomic islands (GIs), paralogous genes, and insertions/deletions, including three distinct lipopolysaccharide (LPS)-related gene clusters. Strikingly, strains recovered from cases of human melioidosis clustered on a tree based on accessory gene content, and were significantly more likely to harbor certain GIs compared to animal and environmental isolates. Consistent with the inference that the GIs may contribute to pathogenesis, experimental mutation of BPSS2053, a GI gene, reduced microbial adherence to human epithelial cells. Our results suggest that the Bp accessory genome is likely to play an important role in microbial adaptation and virulence. PMID:18927621

  11. The Use of Amino Sugars by Bacillus subtilis: Presence of a Unique Operon for the Catabolism of Glucosamine

    PubMed Central

    Gaugué, Isabelle; Oberto, Jacques; Putzer, Harald; Plumbridge, Jacqueline

    2013-01-01

    B. subtilis grows more rapidly using the amino sugar glucosamine as carbon source, than with N-acetylglucosamine. Genes for the transport and metabolism of N-acetylglucosamine (nagP and nagAB) are found in all the sequenced Bacilli (except Anoxybacillus flavithermus). In B. subtilis there is an additional operon (gamAP) encoding second copies of genes for the transport and catabolism of glucosamine. We have developed a method to make multiple deletion mutations in B. subtilis employing an excisable spectinomycin resistance cassette. Using this method we have analysed the contribution of the different genes of the nag and gam operons for their role in utilization of glucosamine and N-acetylglucosamine. Faster growth on glucosamine is due to the presence of the gamAP operon, which is strongly induced by glucosamine. Although the gamA and nagB genes encode isozymes of GlcN6P deaminase, catabolism of N-acetylglucosamine relies mostly upon the gamA gene product. The genes for use of N-acetylglucosamine, nagAB and nagP, are repressed by YvoA (NagR), a GntR family regulator, whose gene is part of the nagAB yvoA(nagR) operon. The gamAP operon is repressed by YbgA, another GntR family repressor, whose gene is expressed divergently from gamAP. The nagAB yvoA synton is found throughout the Bacilli and most firmicutes. On the other hand the ybgA-gamAP synton, which includes the ybgB gene for a small protein of unknown provenance, is only found in B. subtilis (and a few very close relatives). The origin of ybgBA-gamAP grouping is unknown but synteny analysis suggests lateral transfer from an unidentified donor. The presence of gamAP has enabled B. subtilis to efficiently use glucosamine as carbon source. PMID:23667565

  12. Genomic Analysis of the Kiwifruit Pathogen Pseudomonas syringae pv. actinidiae Provides Insight into the Origins of an Emergent Plant Disease

    PubMed Central

    McCann, Honour C.; Rikkerink, Erik H. A.; Bertels, Frederic; Fiers, Mark; Lu, Ashley; Rees-George, Jonathan; Andersen, Mark T.; Gleave, Andrew P.; Haubold, Bernhard; Wohlers, Mark W.; Guttman, David S.; Wang, Pauline W.; Straub, Christina; Vanneste, Joel; Rainey, Paul B.; Templeton, Matthew D.

    2013-01-01

    The origins of crop diseases are linked to domestication of plants. Most crops were domesticated centuries – even millennia – ago, thus limiting opportunity to understand the concomitant emergence of disease. Kiwifruit (Actinidia spp.) is an exception: domestication began in the 1930s with outbreaks of canker disease caused by P. syringae pv. actinidiae (Psa) first recorded in the 1980s. Based on SNP analyses of two circularized and 34 draft genomes, we show that Psa is comprised of distinct clades exhibiting negligible within-clade diversity, consistent with disease arising by independent samplings from a source population. Three clades correspond to their geographical source of isolation; a fourth, encompassing the Psa-V lineage responsible for the 2008 outbreak, is now globally distributed. Psa has an overall clonal population structure, however, genomes carry a marked signature of within-pathovar recombination. SNP analysis of Psa-V reveals hundreds of polymorphisms; however, most reside within PPHGI-1-like conjugative elements whose evolution is unlinked to the core genome. Removal of SNPs due to recombination yields an uninformative (star-like) phylogeny consistent with diversification of Psa-V from a single clone within the last ten years. Growth assays provide evidence of cultivar specificity, with rapid systemic movement of Psa-V in Actinidia chinensis. Genomic comparisons show a dynamic genome with evidence of positive selection on type III effectors and other candidate virulence genes. Each clade has highly varied complements of accessory genes encoding effectors and toxins with evidence of gain and loss via multiple genetic routes. Genes with orthologs in vascular pathogens were found exclusively within Psa-V. Our analyses capture a pathogen in the early stages of emergence from a predicted source population associated with wild Actinidia species. 
    In addition to candidate genes as targets for resistance breeding programs, our findings highlight the importance of the source population as a reservoir of new disease. PMID:23935484

  13. JRmGRN: Joint reconstruction of multiple gene regulatory networks with common hub genes using data from multiple tissues or conditions.

    PubMed

    Deng, Wenping; Zhang, Kui; Liu, Sanzhen; Zhao, Patrick; Xu, Shizhong; Wei, Hairong

    2018-04-30

    Joint reconstruction of multiple gene regulatory networks (GRNs) using gene expression data from multiple tissues/conditions is very important for understanding common and tissue/condition-specific regulation. However, there are currently no computational models and methods available for directly constructing such multiple GRNs that not only share some common hub genes but also possess tissue/condition-specific regulatory edges. In this paper, we propose a new Gaussian graphical model for joint reconstruction of multiple gene regulatory networks (JRmGRN), which highlights hub genes, using gene expression data from several tissues/conditions. Under the framework of the Gaussian graphical model, the JRmGRN method constructs the GRNs by maximizing a penalized log-likelihood function. We formulated it as a convex optimization problem and solved it with an alternating direction method of multipliers (ADMM) algorithm. The performance of JRmGRN was first evaluated with synthetic data, and the results showed that JRmGRN outperformed several other methods for reconstruction of GRNs. We also applied our method to real Arabidopsis thaliana RNA-seq data from two light regime conditions in comparison with other methods, and both common hub genes and some condition-specific hub genes were identified with higher accuracy and precision. Availability: JRmGRN is available as an R program from https://github.com/wenpingd. Contact: hairong@mtu.edu. Supplementary information: proof of theorem, derivation of algorithm and supplementary data are available at Bioinformatics online.
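
    To illustrate the kind of optimization the abstract refers to, the sketch below solves the much simpler single-network graphical lasso with ADMM: minimize tr(S Θ) − log det Θ + λ‖Θ‖₁ over precision matrices Θ. This is explicitly not the JRmGRN objective, which couples several networks and adds hub-gene penalties that the abstract does not spell out. The function names, the parameters `lam`, `rho`, and `n_iter`, and the toy covariance matrix are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(a, kappa):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(a) * np.maximum(np.abs(a) - kappa, 0.0)


def graphical_lasso_admm(S, lam, rho=1.0, n_iter=500):
    """Single-network graphical lasso via ADMM (illustrative sketch only).

    S   : sample covariance matrix (p x p, symmetric)
    lam : L1 penalty weight (applied to all entries, diagonal included)
    rho : ADMM penalty parameter
    Returns the sparse estimate Z of the precision matrix.
    """
    p = S.shape[0]
    Z = np.eye(p)
    U = np.zeros((p, p))
    for _ in range(n_iter):
        # Theta-update: solve rho*Theta - inv(Theta) = rho*(Z - U) - S
        # in closed form through an eigendecomposition.
        w, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eigs = (w + np.sqrt(w ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eigs) @ Q.T
        # Z-update: soft-thresholding enforces sparsity.
        Z = soft_threshold(Theta + U, lam / rho)
        # Scaled dual update.
        U = U + Theta - Z
    return Z


# Toy covariance for two genes (hypothetical values).
S_toy = np.array([[1.0, 0.5], [0.5, 1.0]])
print(graphical_lasso_admm(S_toy, lam=0.1))
```

    Nonzero off-diagonal entries of the returned matrix correspond to edges in the estimated network. A joint method such as JRmGRN would estimate one such matrix per tissue or condition while sharing structure between them.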

  14. Development of Real-Time PCR to Monitor Groundwater Contaminated by Fecal Sources and Leachate from the Carcass

    NASA Astrophysics Data System (ADS)

    Park, S.; Kim, H.; Kim, M.; Lee, Y.; Han, J.

    2011-12-01

    The 2010 outbreak of foot and mouth disease (FMD) in South Korea led to the creation of about 4,054 burial sites for the disposal of carcasses. The potential environmental impact of carcass leachate on groundwater has been raised as a concern and still needs to be studied. We therefore sought to develop a robust and sensitive tool to rapidly determine groundwater contamination by leachate from carcass burial. To track both an agricultural fecal contamination source and the leachate in groundwater, competitive real-time PCR and PCR methods were developed using various PCR primer sets designed to detect the E. coli uidA gene and mtDNA (cytochrome B, cytB) of animal species such as ovine, porcine, caprine, and bovine. The methods were applied to track the animal species in livestock wastewater and carcass leachate under appropriate PCR or real-time PCR conditions. The mtDNA primer sets for individual (cow or pig) and multiple (cow and pig) amplification, and the E. coli uidA primers for fecal source amplification, were specific and sensitive to their target genes. To determine the contamination source, the concentrations of amplified mtDNA and uidA were competitively quantified in livestock wastewater, carcass leachate, and groundwater. The highest concentrations of mtDNA and uidA were found in carcass leachate and livestock wastewater, respectively. Groundwater samples possibly contaminated by carcass leachate were analyzed with this assay, which was able to identify the contamination source.

  15. Multiple schwannomatosis caused by the recently described INI1 gene--molecular pathology, and implications for prognosis.

    PubMed

    Brennan, Paul M; Barlow, Antonio; Geraghty, Alistair; Summers, David; Fitzpatrick, Michael M

    2011-06-01

    The most common genetic predisposition to multiple schwannoma growth is mutation of the neurofibromatosis type 2 gene. We describe a patient with multiple schwannomas and mutation in the recently described INI1 gene, which also predisposes to the disease. We explore the implications for prognosis and outcome.

  16. Bioinformatics Knowledge Map for Analysis of Beta-Catenin Function in Cancer

    PubMed Central

    Arighi, Cecilia N.; Wu, Cathy H.

    2015-01-01

    Given the wealth of bioinformatics resources and the growing complexity of biological information, it is valuable to integrate data from disparate sources to gain insight into the role of genes/proteins in health and disease. We have developed a bioinformatics framework that combines literature mining with information from biomedical ontologies and curated databases to create knowledge “maps” of genes/proteins of interest. We applied this approach to the study of beta-catenin, a cell adhesion molecule and transcriptional regulator implicated in cancer. The knowledge map includes post-translational modifications (PTMs), protein-protein interactions, disease-associated mutations, and transcription factors co-activated by beta-catenin and their targets and captures the major processes in which beta-catenin is known to participate. Using the map, we generated testable hypotheses about beta-catenin biology in normal and cancer cells. By focusing on proteins participating in multiple relation types, we identified proteins that may participate in feedback loops regulating beta-catenin transcriptional activity. By combining multiple network relations with PTM proteoform-specific functional information, we proposed a mechanism to explain the observation that the cyclin dependent kinase CDK5 positively regulates beta-catenin co-activator activity. Finally, by overlaying cancer-associated mutation data with sequence features, we observed mutation patterns in several beta-catenin PTM sites and PTM enzyme binding sites that varied by tissue type, suggesting multiple mechanisms by which beta-catenin mutations can contribute to cancer. The approach described, which captures rich information for molecular species from genes and proteins to PTM proteoforms, is extensible to other proteins and their involvement in disease. PMID:26509276

  17. Diversity of Clostridium perfringens isolates from various sources and prevalence of conjugative plasmids.

    PubMed

    Park, Miseon; Deck, Joanna; Foley, Steven L; Nayak, Rajesh; Songer, J Glenn; Seibel, Janice R; Khan, Saeed A; Rooney, Alejandro P; Hecht, David W; Rafii, Fatemeh

    2016-04-01

    Clostridium perfringens is an important pathogen, causing food poisoning and other mild to severe infections in humans and animals. Some strains of C. perfringens contain conjugative plasmids, which may carry antimicrobial resistance and toxin genes. We studied genomic and plasmid diversity of 145 C. perfringens type A strains isolated from soils, foods, chickens, clinical samples, and domestic animals (porcine, bovine and canine), from different geographic areas in the United States between 1994 and 2006, using multiple-locus variable-number tandem repeat analysis (MLVA) and/or pulsed-field gel electrophoresis (PFGE). MLVA detected the genetic diversity in a majority of the isolates. PFGE, using SmaI and KspI, confirmed the MLVA results but also detected differences among the strains that could not be differentiated by MLVA. All of the PFGE profiles of the strains were different, except for a few of the epidemiologically related strains, which were identical. The PFGE profiles of strains isolated from the same domestic animal species were clustered more closely with each other than with other strains. However, a variety of C. perfringens strains with distinct genetic backgrounds were found among the clinical isolates. Variation was also observed in the size and number of plasmids in the strains. Primers for the internal fragment of a conjugative tcpH gene of C. perfringens plasmid pCPF4969 amplified identical size fragments from a majority of strains tested; and this gene hybridized to the various-sized plasmids of these strains. The sequences of the PCR-amplified tcpH genes from 12 strains showed diversity among the tcpH genes. Regardless of the sources of the isolates, the genetic diversity of C. perfringens extended to the plasmids carrying conjugative genes. Published by Elsevier Ltd.

  18. Towards the integration, annotation and association of historical microarray experiments with RNA-seq.

    PubMed

    Chavan, Shweta S; Bauer, Michael A; Peterson, Erich A; Heuck, Christoph J; Johann, Donald J

    2013-01-01

    Transcriptome analysis by microarrays has produced important advances in biomedicine. For instance in multiple myeloma (MM), microarray approaches led to the development of an effective disease subtyping via cluster assignment, and a 70 gene risk score. Both enabled an improved molecular understanding of MM, and have provided prognostic information for the purposes of clinical management. Many researchers are now transitioning to Next Generation Sequencing (NGS) approaches and RNA-seq in particular, due to its discovery-based nature, improved sensitivity, and dynamic range. Additionally, RNA-seq allows for the analysis of gene isoforms, splice variants, and novel gene fusions. Given the voluminous amounts of historical microarray data, there is now a need to associate and integrate microarray and RNA-seq data via advanced bioinformatic approaches. Custom software was developed following a model-view-controller (MVC) approach to integrate Affymetrix probe set-IDs, and gene annotation information from a variety of sources. The tool/approach employs an assortment of strategies to integrate, cross reference, and associate microarray and RNA-seq datasets. Output from a variety of transcriptome reconstruction and quantitation tools (e.g., Cufflinks) can be directly integrated, and/or associated with Affymetrix probe set data, as well as necessary gene identifiers and/or symbols from a diversity of sources. Strategies are employed to maximize the annotation and cross referencing process. Custom gene sets (e.g., MM 70 risk score (GEP-70)) can be specified, and the tool can be directly assimilated into an RNA-seq pipeline. A novel bioinformatic approach to aid in the facilitation of both annotation and association of historic microarray data, in conjunction with richer RNA-seq data, is now assisting with the study of MM cancer biology.

  19. Molecular study on some antibiotic resistant genes in Salmonella spp. isolates

    NASA Astrophysics Data System (ADS)

    Nabi, Ari Q.

    2017-09-01

    Studying the genes related to antimicrobial resistance in Salmonella spp. is a crucial step toward correct and faster treatment of infections caused by the pathogen. In this work, the integron-mediated antibiotic resistance gene IntI1 (Class I Integrase IntI1) and some plasmid-mediated antibiotic resistance genes (Qnr) were screened by SYBR Green real-time PCR among isolated non-typhoid Salmonella strains with known resistance to some important antimicrobial drugs. The aim of the study was to correlate the multiple antibiotic and antimicrobial resistance of Salmonella spp. with the presence of the integrase (IntI1) gene and plasmid-mediated quinolone resistance genes. Results revealed the presence of the Class I Integrase gene in 76% of the isolates with confirmed multiple antibiotic resistance. Moreover, about 32% of the multiple-antibiotic-resistant serotypes showed a positive R-PCR for the plasmid-mediated qnrA gene, which encodes resistance to nalidixic acid and ciprofloxacin. No positive results were obtained from R-PCRs targeting qnrB or qnrS. In light of these results, we conclude that the presence of at least one of the qnr genes and/or the Class I Integrase gene was responsible for the multiple antibiotic resistance to nalidixic acid and ciprofloxacin in the studied Salmonella spp.; further studies are required to identify the genes related to multiple antibiotic resistance of the pathogen.

  20. Riverbed Sediments as Reservoirs of Multiple Vibrio cholerae Virulence-Associated Genes: A Potential Trigger for Cholera Outbreaks in Developing Countries.

    PubMed

    Abia, Akebe Luther King; Ubomba-Jaswa, Eunice; Momba, Maggy Ndombo Benteke

    2017-01-01

    Africa remains the most cholera-stricken continent in the world, as many people lacking access to safe drinking water rely mostly on polluted rivers as their main water sources. However, studies in these countries investigating the presence of Vibrio cholerae in aquatic environments have paid little attention to bed sediments. Also, information on the presence of virulence-associated genes (VAGs) in environmental ctx-negative V. cholerae strains in this region is lacking. Thus, we investigated the presence of V. cholerae VAGs in water and riverbed sediment of the Apies River, South Africa. Altogether, 120 samples (60 water and 60 sediment samples) collected from ten sites on the river (January and February 2014) were analysed using PCR. Of the 120 samples, 37 sediment and 31 water samples were positive for at least one of the genes investigated. The haemolysin gene (hlyA) was the most frequently detected gene. The cholera toxin (ctxAB) and non-O1 heat-stable (stn/sto) genes were not detected. Genes were frequently detected at sites influenced by human activities. Thus, identification of V. cholerae VAGs in sediments suggests the possible presence of V. cholerae and identifies sediments of the Apies River as a reservoir for potentially pathogenic V. cholerae with possible public health implications.

  1. Structure of CARB-4 and AER-1 Carbenicillin-Hydrolyzing β-Lactamases

    PubMed Central

    Sanschagrin, François; Bejaoui, Noureddine; Levesque, Roger C.

    1998-01-01

    We determined the nucleotide sequences of blaCARB-4 encoding CARB-4 and deduced a polypeptide of 288 amino acids. The gene was characterized as a variant of group 2c carbenicillin-hydrolyzing β-lactamases such as PSE-4, PSE-1, and CARB-3. The level of DNA homology between the bla genes for these β-lactamases varied from 98.7 to 99.9%, while that between these genes and blaCARB-4 encoding CARB-4 was 86.3%. The blaCARB-4 gene was acquired from some other source because it has a G+C content of 39.1%, compared to a G+C content of 67% for typical Pseudomonas aeruginosa genes. DNA sequencing revealed that blaAER-1 shared 60.8% DNA identity with blaPSE-3 encoding PSE-3. The deduced AER-1 β-lactamase peptide was compared to class A, B, C, and D enzymes and had 57.6% identity with PSE-3, including an STHK tetrad at the active site. For CARB-4 and AER-1, conserved canonical amino acid boxes typical of class A β-lactamases were identified in a multiple alignment. Analysis of the DNA sequences flanking blaCARB-4 and blaAER-1 confirmed the importance of gene cassettes acquired via integrons in bla gene distribution. PMID:9687391
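
    The argument above for acquisition of blaCARB-4 from another source rests on comparing G+C content (39.1% for the gene versus about 67% for typical Pseudomonas aeruginosa genes). As a minimal sketch of how such a percentage is computed from a nucleotide sequence, the hypothetical helper below counts G and C among the unambiguous bases. The toy sequence is made up and is not the blaCARB-4 sequence.

```python
def gc_content(seq):
    """Return the percentage of G and C among the A/C/G/T bases of `seq`.

    Ambiguity codes such as N are ignored (an assumption of this sketch).
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical toy sequence, for illustration only.
print(gc_content("ATGCATTA"))  # 25.0
```

    A gene whose G+C percentage differs sharply from the host genome average, as reported here, is a common indicator of horizontal acquisition.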

  2. WormQTLHD--a web database for linking human disease to natural variation data in C. elegans.

    PubMed

    van der Velde, K Joeri; de Haan, Mark; Zych, Konrad; Arends, Danny; Snoek, L Basten; Kammenga, Jan E; Jansen, Ritsert C; Swertz, Morris A; Li, Yang

    2014-01-01

    Interactions between proteins are highly conserved across species. As a result, the molecular basis of multiple diseases affecting humans can be studied in model organisms that offer many alternative experimental opportunities. One such organism, Caenorhabditis elegans, has been used to produce much molecular quantitative genetics and systems biology data over the past decade. We present WormQTL(HD) (Human Disease), a database that quantitatively and systematically links expression Quantitative Trait Loci (eQTL) findings in C. elegans to gene-disease associations in man. WormQTL(HD), available online at http://www.wormqtl-hd.org, is a user-friendly set of tools to reveal functionally coherent, evolutionarily conserved gene networks. These can be used to predict novel gene-to-gene associations and the functions of genes underlying the disease of interest. We created a new database that links C. elegans eQTL data sets to human diseases (34 337 gene-disease associations from OMIM, DGA, GWAS Central and NHGRI GWAS Catalogue) based on overlapping sets of orthologous genes associated with phenotypes in these two species. We utilized QTL results, high-throughput molecular phenotypes, classical phenotypes and genotype data covering different developmental stages and environments from the WormQTL database. All software is available as open source, built on MOLGENIS and xQTL workbench.

  3. Catabolite regulation analysis of Escherichia coli for acetate overflow mechanism and co-consumption of multiple sugars based on systems biology approach using computer simulation.

    PubMed

    Matsuoka, Yu; Shimizu, Kazuyuki

    2013-10-20

    It is quite important to understand the basic principle embedded in the main metabolism for the interpretation of the fermentation data. For this, it may be useful to understand the regulation mechanism based on systems biology approach. In the present study, we considered the perturbation analysis together with computer simulation based on the models which include the effects of global regulators on the pathway activation for the main metabolism of Escherichia coli. Main focus is the acetate overflow metabolism and the co-fermentation of multiple carbon sources. The perturbation analysis was first made to understand the nature of the feed-forward loop formed by the activation of Pyk by FDP (F1,6BP), and the feed-back loop formed by the inhibition of Pfk by PEP in the glycolysis. Those together with the effect of transcription factor Cra caused by FDP level affected the glycolysis activity. The PTS (phosphotransferase system) acts as the feed-back system by repressing the glucose uptake rate for the increase in the glucose uptake rate. It was also shown that the increased PTS flux (or glucose consumption rate) causes PEP/PYR ratio to be decreased, and EIIA-P, Cya, cAMP-Crp decreased, where cAMP-Crp in turn repressed TCA cycle and more acetate is formed. This was further verified by the detailed computer simulation. In the case of multiple carbon sources such as glucose and xylose, it was shown that the sequential utilization of carbon sources was observed for wild type, while the co-consumption of multiple carbon sources with slow consumption rates were observed for the ptsG mutant by computer simulation, and this was verified by experiments. Moreover, the effect of a specific gene knockout such as Δpyk on the metabolic characteristics was also investigated based on the computer simulation. Copyright © 2013 Elsevier B.V. All rights reserved.

  4. Sphingomonas wittichii Strain RW1 Genome-Wide Gene Expression Shifts in Response to Dioxins and Clay

    PubMed Central

    Tsoi, Tamara V.; Iwai, Shoko; Liu, Cun; Fish, Jordan A.; Gu, Cheng; Johnson, Timothy A.; Zylstra, Gerben; Teppen, Brian J.; Li, Hui; Hashsham, Syed A.; Boyd, Stephen A.; Cole, James R.; Tiedje, James M.

    2016-01-01

    Sphingomonas wittichii strain RW1 (RW1) is one of the few strains that can grow on dibenzo-p-dioxin (DD). We conducted a transcriptomic study of RW1 using RNA-Seq to outline transcriptional responses to DD, dibenzofuran (DF), and the smectite clay mineral saponite with succinate as carbon source. The ability to grow on DD is rare compared to growth on the chemically similar DF even though the same initial dioxygenase may be involved in oxidation of both substrates. Therefore, we hypothesized the reason for this lies beyond catabolic pathways and may concern genes involved in processes for cell-substrate interactions such as substrate recognition, transport, and detoxification. Compared to succinate (SUC) as control carbon source, DF caused over 240 protein-coding genes to be differentially expressed, whereas more than 300 were differentially expressed with DD. Stress response genes were up-regulated in response to both DD and DF. This effect was stronger with DD than DF, suggesting a higher toxicity of DD compared to DF. Both DD and DF caused changes in expression of genes involved in active cross-membrane transport such as TonB-dependent receptor proteins, but the patterns of change differed between the two substrates. Multiple transcription factor genes also displayed expression patterns distinct to DD and DF growth. DD and DF induced the catechol ortho- and the salicylate/gentisate pathways, respectively. Both DD and DF induced the shared down-stream aliphatic intermediate compound pathway. Clay caused category-wide down-regulation of genes for cell motility and chemotaxis, particularly those involved in the synthesis, assembly and functioning of flagella. This is an environmentally important finding because clay is a major component of soil microbes’ microenvironment influencing local chemistry and may serve as a geosorbent for toxic pollutants. Similar to clay, DD and DF also affected expression of genes involved in motility and chemotaxis. PMID:27309357

  5. KaBOB: ontology-based semantic integration of biomedical databases.

    PubMed

    Livingston, Kevin M; Bada, Michael; Baumgartner, William A; Hunter, Lawrence E

    2015-04-23

    The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in semantic data integration in order to establish shared identity and shared meaning across heterogeneous biomedical data sources. We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrating it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license. KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for formal reasoning over a wealth of integrated biomedical data.

  6. Individual crypt genetic heterogeneity and the origin of metaplastic glandular epithelium in human Barrett’s oesophagus

    PubMed Central

    Leedham, S J; Preston, S L; McDonald, S A C; Elia, G; Bhandari, P; Poller, D; Harrison, R; Novelli, M R; Jankowski, J A; Wright, N A

    2008-01-01

    Objectives: Current models of clonal expansion in human Barrett’s oesophagus are based upon heterogeneous, flow-purified biopsy analysis taken at multiple segment levels. Detection of identical mutation fingerprints from these biopsy samples led to the proposal that a mutated clone with a selective advantage can clonally expand to fill an entire Barrett’s segment at the expense of competing clones (selective sweep to fixation model). We aimed to assess clonality at a much higher resolution by microdissecting and genetically analysing individual crypts. The histogenesis of Barrett’s metaplasia and neo-squamous islands has never been demonstrated. We investigated the oesophageal gland squamous ducts as the source of both epithelial sub-types. Methods: Individual crypts across Barrett’s biopsy and oesophagectomy blocks were dissected. Determination of tumour suppressor gene loss of heterozygosity patterns, p16 and p53 point mutations were carried out on a crypt-by-crypt basis. Cases of contiguous neo-squamous islands and columnar metaplasia with oesophageal squamous ducts were identified. Tissues were isolated by laser capture microdissection and genetically analysed. Results: Individual crypt dissection revealed mutation patterns that were masked in whole biopsy analysis. Dissection across oesophagectomy specimens demonstrated marked clonal heterogeneity, with multiple independent clones present. We identified a p16 point mutation arising in the squamous epithelium of the oesophageal gland duct, which was also present in a contiguous metaplastic crypt, whereas neo-squamous islands arising from squamous ducts were wild-type with respect to surrounding Barrett’s dysplasia. Conclusions: By studying clonality at the crypt level we demonstrate that Barrett’s heterogeneity arises from multiple independent clones, in contrast to the selective sweep to fixation model of clonal expansion previously described. We suggest that the squamous gland ducts situated throughout the oesophagus are the source of a progenitor cell that may be susceptible to gene mutation resulting in conversion to Barrett’s metaplastic epithelium. Additionally, these data suggest that wild-type ducts may be the source of neo-squamous islands. PMID:18305067

  7. deFUME: Dynamic exploration of functional metagenomic sequencing data.

    PubMed

    van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper; Malla, Sailesh; Sommer, Morten Otto Alexander

    2015-07-31

    Functional metagenomic selections represent a powerful technique that is widely applied for identification of novel genes from complex metagenomic sources. However, whereas hundreds to thousands of clones can be easily generated and sequenced over a few days of experiments, analyzing the data is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non-bioinformaticians. The web server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web interface. The deFUME web server provides a fast track from raw sequence to a comprehensive visual data overview that facilitates effortless inspection of gene function, clustering and distribution. The web server is available at cbs.dtu.dk/services/deFUME/ and the source code is distributed at github.com/EvdH0/deFUME.

  8. Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM)

    PubMed Central

    Skinnider, Michael A.; Dejong, Chris A.; Rees, Philip N.; Johnston, Chad W.; Li, Haoxin; Webster, Andrew L. H.; Wyatt, Morgan A.; Magarvey, Nathan A.

    2015-01-01

    Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. PMID:26442528

  9. Integrated Genomic Analysis of Diverse Induced Pluripotent Stem Cells from the Progenitor Cell Biology Consortium.

    PubMed

    Salomonis, Nathan; Dexheimer, Phillip J; Omberg, Larsson; Schroll, Robin; Bush, Stacy; Huo, Jeffrey; Schriml, Lynn; Ho Sui, Shannan; Keddache, Mehdi; Mayhew, Christopher; Shanmukhappa, Shiva Kumar; Wells, James; Daily, Kenneth; Hubler, Shane; Wang, Yuliang; Zambidis, Elias; Margolin, Adam; Hide, Winston; Hatzopoulos, Antonis K; Malik, Punam; Cancelas, Jose A; Aronow, Bruce J; Lutzko, Carolyn

    2016-07-12

    The rigorous characterization of distinct induced pluripotent stem cells (iPSC) derived from multiple reprogramming technologies, somatic sources, and donors is required to understand potential sources of variability and downstream potential. To achieve this goal, the Progenitor Cell Biology Consortium performed comprehensive experimental and genomic analyses of 58 iPSC from ten laboratories generated using a variety of reprogramming genes, vectors, and cells. Associated global molecular characterization studies identified functionally informative correlations in gene expression, DNA methylation, and/or copy-number variation among key developmental and oncogenic regulators as a result of donor, sex, line stability, reprogramming technology, and cell of origin. Furthermore, X-chromosome inactivation in PSC produced highly correlated differences in teratoma-lineage staining and regulator expression upon differentiation. All experimental results, and raw, processed, and metadata from these analyses, including powerful tools, are interactively accessible from a new online portal at https://www.synapse.org to serve as a reusable resource for the stem cell community. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  10. Contamination with bacterial zoonotic pathogen genes in U.S. streams influenced by varying types of animal agriculture

    USGS Publications Warehouse

    Haack, Sheridan K.; Duris, Joseph W.; Kolpin, Dana W.; Focazio, Michael J.; Meyer, Michael T.; Johnson, Heather E.; Oster, Ryan J.; Foreman, William T.

    2016-01-01

    Animal waste, stream water, and streambed sediment from 19 small (< 32 km²) watersheds in 12 U.S. states having either no major animal agriculture (control, n = 4), or predominantly beef (n = 4), dairy (n = 3), swine (n = 5), or poultry (n = 3) were tested for: 1) cholesterol, coprostanol, estrone, and fecal indicator bacteria (FIB) concentrations, and 2) shiga-toxin producing and enterotoxigenic Escherichia coli, Salmonella, Campylobacter, and pathogenic and vancomycin-resistant enterococci by polymerase chain reaction (PCR) on enrichments, and/or direct quantitative PCR. Pathogen genes were most frequently detected in dairy wastes, followed by beef, swine and poultry wastes in that order; there was only one detection of an animal-source-specific pathogen gene (stx1) in any water or sediment sample in any control watershed. Post-rainfall pathogen gene numbers in stream water were significantly correlated with FIB, cholesterol and coprostanol concentrations, and were most highly correlated in dairy watershed samples collected from 3 different states. Although collected across multiple states and ecoregions, animal-waste gene profiles were distinctive via discriminant analysis. Stream water gene profiles could also be discriminated by the watershed animal type. Although pathogen genes were not abundant in stream water or streambed samples, PCR on enrichments indicated that many genes were from viable organisms, including several (shiga-toxin producing or enterotoxigenic E. coli, Salmonella, vancomycin-resistant enterococci) that could potentially affect either human or animal health. Pathogen gene numbers and types in stream water samples were influenced most by animal type, by local factors such as whether animals had stream access, and by the amount of local rainfall, and not by studied watershed soil or physical characteristics. Our results indicated that stream water in small agricultural U.S. watersheds was susceptible to pathogen gene inputs under typical agricultural practices and environmental conditions. Pathogen gene profiles may offer the potential to address both source of, and risks associated with, fecal pollution.

  11. Prioritization of candidate disease genes by topological similarity between disease and protein diffusion profiles.

    PubMed

    Zhu, Jie; Qin, Yufang; Liu, Taigang; Wang, Jun; Zheng, Xiaoqi

    2013-01-01

    Identification of gene-phenotype relationships is a fundamental challenge in human health research. Based on the observation that genes causing the same or similar phenotypes tend to correlate with each other in the protein-protein interaction network, many network-based approaches have been proposed based on different underlying models. A recent comparative study showed that diffusion-based methods achieve state-of-the-art predictive performance. In this paper, a new diffusion-based method is proposed to prioritize candidate disease genes. The diffusion profile of a disease is defined as the stationary distribution of candidate genes given a random walk with restart in which similarities between phenotypes are incorporated. Candidate disease genes are then prioritized by comparing their diffusion profiles with that of the disease. Finally, the effectiveness of our method was demonstrated through leave-one-out cross-validation against control genes from artificial linkage intervals and randomly chosen genes. A comparative study showed that our method achieves improved performance compared to some classical diffusion-based methods. To further illustrate our method, we used our algorithm to predict new causal genes of 16 multifactorial diseases including prostate cancer and Alzheimer's disease, and the top predictions were in good agreement with literature reports. Our study indicates that integration of multiple information sources, especially the phenotype similarity profile data, and introduction of a global similarity measure between disease and gene diffusion profiles are helpful for prioritizing candidate disease genes. Programs and data are available upon request.
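
    The random walk with restart named in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the toy network, the function name `random_walk_with_restart`, the restart probability of 0.3, and the uniform seed weighting are all assumptions made for illustration, and the phenotype-similarity weighting described in the abstract is not reproduced.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return the stationary visiting probabilities of a random walk with restart.

    adj   : dict mapping each node to a list of its neighbours (undirected,
            every node is assumed to have at least one neighbour)
    seeds : set of nodes the walker jumps back to (e.g. known disease genes)
    """
    nodes = sorted(adj)
    # Restart vector: uniform over the seed nodes (an assumption of this sketch).
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker returns to a seed ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical four-gene path network A - B - C - D with A as the only seed.
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
profile = random_walk_with_restart(toy_network, seeds={"A"})
ranking = sorted(profile, key=profile.get, reverse=True)
```

    In this toy case the genes closest to the seed receive the most probability mass, which is the property that diffusion-based prioritization relies on.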

  12. Mouse androgenetic embryonic stem cells differentiated to multiple cell lineages in three embryonic germ layers in vitro.

    PubMed

    Teramura, Takeshi; Onodera, Yuta; Murakami, Hideki; Ito, Syunsuke; Mihara, Toshihiro; Takehara, Toshiyuki; Kato, Hiromi; Mitani, Tasuku; Anzai, Masayuki; Matsumoto, Kazuya; Saeki, Kazuhiro; Fukuda, Kanji; Sagawa, Norimasa; Osoi, Yoshihiko

    2009-06-01

    The embryos of some rodents and primates can proceed through early development without the process of fertilization; however, they cease to develop after implantation because of restricted expression of imprinted genes. Asexually developed embryos are classified into parthenote/gynogenote and androgenote by their genomic origins. Embryonic stem cells (ESCs) derived from asexual origins have also been reported. To date, ESCs derived from parthenogenetic embryos (PgESCs) have been established in some species, including humans, and their possible use as alternative sources for autologous cell transplantation in regenerative medicine has been proposed. However, some developmental characteristics that might be important for therapeutic applications, such as the multiple differentiation capacity and transplantability of ESCs of androgenetic origin (AgESCs), are uncertain. Here, we induced differentiation of mouse AgESCs and observed derivation of neural cells, cardiomyocytes and hepatocytes in vitro. Following differentiated embryoid body (EB) transplantation in various mouse strains including the strain of origin, we found that the EBs could engraft in theoretically MHC-matched strains. Our results indicate that AgESCs possess at least two important characteristics, multiple differentiation properties in vitro and transplantability after differentiation, and suggest that they can also serve as a source of histocompatible tissues for transplantation.

  13. BIAS: Bioinformatics Integrated Application Software.

    PubMed

    Finak, G; Godin, N; Hallett, M; Pepin, F; Rajabi, Z; Srivastava, V; Tang, Z

    2005-04-15

    We introduce a development platform especially tailored to Bioinformatics research and software development. BIAS (Bioinformatics Integrated Application Software) provides the tools necessary for carrying out integrative Bioinformatics research requiring multiple datasets and analysis tools. It follows an object-relational strategy for providing persistent objects, allows third-party tools to be easily incorporated within the system and supports standards and data-exchange protocols common to Bioinformatics. BIAS is an OpenSource project and is freely available to all interested users at http://www.mcb.mcgill.ca/~bias/. This website also contains a paper containing a more detailed description of BIAS and a sample implementation of a Bayesian network approach for the simultaneous prediction of gene regulation events and of mRNA expression from combinations of gene regulation events. hallett@mcb.mcgill.ca.

  14. A computational genomics pipeline for prokaryotic sequencing projects.

    PubMed

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.

  15. Multiple Site-Directed and Saturation Mutagenesis by the Patch Cloning Method.

    PubMed

    Taniguchi, Naohiro; Murakami, Hiroshi

    2017-01-01

    Constructing protein-coding genes with desired mutations is a basic step for protein engineering. Herein, we describe a multiple site-directed and saturation mutagenesis method, termed MUPAC. This method has been used to introduce multiple site-directed mutations in the green fluorescent protein gene and in the moloney murine leukemia virus reverse transcriptase gene. Moreover, this method was also successfully used to introduce randomized codons at five desired positions in the green fluorescent protein gene, and for simple DNA assembly for cloning.

  16. Functional analysis of multiple carotenogenic genes from Lycium barbarum and Gentiana lutea L. for their effects on beta-carotene production in transgenic tobacco.

    PubMed

    Ji, Jing; Wang, Gang; Wang, Jiehua; Wang, Ping

    2009-02-01

    Carotenoids are red, yellow and orange pigments, which are widely distributed in nature and are especially abundant in yellow-orange fruits and vegetables and dark green leafy vegetables. Carotenoids are essential for photosynthesis and photoprotection in plant life and also have different beneficial effects in humans and animals (van den Berg et al. 2000). For example, beta-carotene plays an essential role as the main dietary source of vitamin A. To obtain further insight into beta-carotene biosynthesis in two important economic plant species, Lycium barbarum and Gentiana lutea L., and to investigate and prioritize potential genetic engineering targets in the pathway, the effects of five carotenogenic genes from these two species, encoding geranylgeranyl diphosphate synthase, phytoene synthase, delta-carotene desaturase, lycopene beta-cyclase, and lycopene epsilon-cyclase, were functionally analyzed in transgenic tobacco (Nicotiana tabacum) plants. All transgenic tobacco plants constitutively expressing these genes showed enhanced beta-carotene contents in their leaves and flowers to different extents. The additive effects of co-ordinate expression of double transgenes have also been investigated.

  17. Integrating Genetic and Functional Genomic Data to Elucidate Common Disease Tra

    NASA Astrophysics Data System (ADS)

    Schadt, Eric

    2005-03-01

    The reconstruction of genetic networks in mammalian systems is one of the primary goals in biological research, especially as such reconstructions relate to elucidating not only common, polygenic human diseases, but living systems more generally. Here I present a statistical procedure for inferring causal relationships between gene expression traits and more classic clinical traits, including complex disease traits. This procedure has been generalized to the gene network reconstruction problem, where naturally occurring genetic variations in segregating mouse populations are used as a source of perturbations to elucidate tissue-specific gene networks. Differences in the extent of genetic control between genders and among four different tissues are highlighted. I also demonstrate that the networks derived from expression data in segregating mouse populations using the novel network reconstruction algorithm are able to capture causal associations between genes that result in increased predictive power, compared to more classically reconstructed networks derived from the same data. This approach to causal inference in large segregating mouse populations over multiple tissues not only elucidates fundamental aspects of transcriptional control, it also allows for the objective identification of key drivers of common human diseases.

  18. Genetic Structure and Gene Flows within Horses: A Genealogical Study at the French Population Scale

    PubMed Central

    Pirault, Pauline; Danvy, Sophy; Verrier, Etienne; Leroy, Grégoire

    2013-01-01

    Since horse breeds constitute populations submitted to variable and multiple outcrossing events, we analyzed the genetic structure and gene flows considering horses raised in France. We used genealogical data, with a reference population of 547,620 horses born in France between 2002 and 2011, grouped according to 55 breed origins. On average, individuals had 6.3 equivalent generations known. Considering different population levels, fixation index decreased from an overall species FIT of 1.37%, to an average of −0.07% when considering the 55 origins, showing that most horse breeds constitute populations without genetic structure. We illustrate the complexity of gene flows existing among horse breeds, a few populations being closed to foreign influence, most, however, being submitted to various levels of introgression. In particular, Thoroughbred and Arab breeds are largely used as introgression sources, since those two populations explain together 26% of founder origins within the overall horse population. When compared with molecular data, breeds with a small level of coancestry also showed low genetic distance; the gene pool of the breeds was probably impacted by their reproducer exchanges. PMID:23630596

  19. Simultaneous gene finding in multiple genomes.

    PubMed

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/. Contact: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  20. Passing Messages between Biological Networks to Refine Predicted Interactions

    PubMed Central

    Glass, Kimberly; Huttenhower, Curtis; Quackenbush, John; Yuan, Guo-Cheng

    2013-01-01

    Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net. PMID:23741402

  1. Genexpi: a toolset for identifying regulons and validating gene regulatory networks using time-course expression data.

    PubMed

    Modrák, Martin; Vohradský, Jiří

    2018-04-13

    Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks. We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. Genexpi is a useful part of the gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.

  2. Comparison of genome-wide selection strategies to identify furfural tolerance genes in Escherichia coli.

    PubMed

    Glebes, Tirzah Y; Sandoval, Nicholas R; Gillis, Jacob H; Gill, Ryan T

    2015-01-01

    Engineering both feedstock and product tolerance is important for transitioning towards next-generation biofuels derived from renewable sources. Tolerance to chemical inhibitors typically results in complex phenotypes, for which multiple genetic changes must often be made to confer tolerance. Here, we performed a genome-wide search for furfural-tolerant alleles using the TRackable Multiplex Recombineering (TRMR) method (Warner et al. (2010), Nature Biotechnology), which uses chromosomally integrated mutations directed towards increased or decreased expression of virtually every gene in Escherichia coli. We employed various growth selection strategies to assess the role of selection design towards growth enrichments. We also compared genes with increased fitness from our TRMR selection to those from a previously reported genome-wide identification study of furfural tolerance genes using a plasmid-based genomic library approach (Glebes et al. (2014) PLOS ONE). In several cases, growth improvements were observed for the chromosomally integrated promoter/RBS mutations but not for the plasmid-based overexpression constructs. Through this assessment, four novel tolerance genes, ahpC, yhjH, rna, and dicA, were identified and confirmed for their effect on improving growth in the presence of furfural. © 2014 Wiley Periodicals, Inc.

  3. Evaluation of atpB nucleotide sequences for phylogenetic studies of ferns and other pteridophytes.

    PubMed

    Wolf, P

    1997-10-01

    Inferring basal relationships among vascular plants poses a major challenge to plant systematists. The divergence events that describe these relationships occurred long ago and considerable homoplasy has since accrued for both molecular and morphological characters. A potential solution is to examine phylogenetic analyses from multiple data sets. Here I present a new source of phylogenetic data for ferns and other pteridophytes. I sequenced the chloroplast gene atpB from 23 pteridophyte taxa and used maximum parsimony to infer relationships. A 588-bp region of the gene appeared to contain a statistically significant amount of phylogenetic signal and the resulting trees were largely congruent with similar analyses of nucleotide sequences from rbcL. However, a combined analysis of atpB plus rbcL produced a better resolved tree than did either data set alone. In the shortest trees, leptosporangiate ferns formed a monophyletic group. Also, I detected a well-supported clade of Psilotaceae (Psilotum and Tmesipteris) plus Ophioglossaceae (Ophioglossum and Botrychium). The demonstrated utility of atpB suggests that sequences from this gene should play a role in phylogenetic analyses that incorporate data from chloroplast genes, nuclear genes, morphology, and fossil data.

  4. webMGR: an online tool for the multiple genome rearrangement problem.

    PubMed

    Lin, Chi Ho; Zhao, Hao; Lowcay, Sean Harry; Shahab, Atif; Bourque, Guillaume

    2010-02-01

    The algorithm MGR enables the reconstruction of rearrangement phylogenies based on gene or synteny block order in multiple genomes. Although MGR has been successfully applied to study the evolution of different sets of species, its utilization has been hampered by the prohibitive running time for some applications. In the current work, we have designed new heuristics that significantly speed up the tool without compromising its accuracy. Moreover, we have developed a web server (webMGR) that includes elaborate web output to facilitate navigation through the results. webMGR can be accessed via http://www.gis.a-star.edu.sg/~bourque. The source code of the improved standalone version of MGR is also freely available from the web site. Supplementary data are available at Bioinformatics online.

  5. Three gene expression vector sets for concurrently expressing multiple genes in Saccharomyces cerevisiae.

    PubMed

    Ishii, Jun; Kondo, Takashi; Makino, Harumi; Ogura, Akira; Matsuda, Fumio; Kondo, Akihiko

    2014-05-01

    Yeast has the potential to be used in bulk-scale fermentative production of fuels and chemicals due to its tolerance for low pH and robustness for autolysis. However, expression of multiple external genes in one host yeast strain is considerably labor-intensive due to the lack of polycistronic transcription. To promote the metabolic engineering of yeast, we generated systematic and convenient genetic engineering tools to express multiple genes in Saccharomyces cerevisiae. We constructed a series of multi-copy and integration vector sets for concurrently expressing two or three genes in S. cerevisiae by embedding three classical promoters. The comparative expression capabilities of the constructed vectors were monitored with green fluorescent protein, and the concurrent expression of genes was monitored with three different fluorescent proteins. Our multiple gene expression tool will be helpful to the advanced construction of genetically engineered yeast strains in a variety of research fields other than metabolic engineering. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  6. Positive Selection of Plasmodium falciparum Parasites With Multiple var2csa-Type PfEMP1 Genes During the Course of Infection in Pregnant Women

    PubMed Central

    Salanti, Ali; Lavstsen, Thomas; Nielsen, Morten A.; Theander, Thor G.; Leke, Rose G. F.; Lo, Yeung Y.; Bobbili, Naveen; Arnot, David E.; Taylor, Diane W.

    2011-01-01

    Placental malaria infections are caused by Plasmodium falciparum–infected red blood cells sequestering in the placenta by binding to chondroitin sulfate A, mediated by VAR2CSA, a variant of the PfEMP1 family of adhesion antigens. Recent studies have shown that many P. falciparum genomes have multiple genes coding for different VAR2CSA proteins, and parasites with >1 var2csa gene appear to be more common in pregnant women with placental malaria than in nonpregnant individuals. We present evidence that, in pregnant women, parasites containing multiple var2csa-type genes possess a selective advantage over parasites with a single var2csa gene. Accumulation of parasites with multiple copies of the var2csa gene during the course of pregnancy was also correlated with the development of antibodies involved in blocking VAR2CSA adhesion. The data suggest that multiplicity of var2csa-type genes enables P. falciparum parasites to persist for a longer period of time during placental infections, probably because of their greater capacity for antigenic variation and evasion of variant-specific immune responses. PMID:21592998

  7. Y-chromosome lineages in Cabo Verde Islands witness the diverse geographic origin of its first male settlers.

    PubMed

    Gonçalves, Rita; Rosa, Alexandra; Freitas, Ana; Fernandes, Ana; Kivisild, Toomas; Villems, Richard; Brehm, António

    2003-11-01

    The Y-chromosome haplogroup composition of the population of the Cabo Verde Archipelago was profiled by using 32 single-nucleotide polymorphism markers and compared with potential source populations from Iberia, west Africa, and the Middle East. According to the traditional view, the major proportion of the founding population of Cabo Verde was of west African ancestry with the addition of a minor fraction of male colonizers from Europe. Unexpectedly, more than half of the paternal lineages (53.5%) of Cabo Verdeans clustered in haplogroups I, J, K, and R1, which are characteristic of populations of Europe and the Middle East, while being absent in the probable west African source population of Guiné-Bissau. Moreover, a high frequency of J* lineages in Cabo Verdeans relates them more closely to populations of the Middle East and probably provides the first genetic evidence of the legacy of the Jews. In addition, the considerable proportion (20.5%) of E3b(xM81) lineages indicates a possible gene flow from the Middle East or northeast Africa, which, at least partly, could be ascribed to the Sephardic Jews. In contrast to the predominance of west African mitochondrial DNA haplotypes in their maternal gene pool, the major west African Y-chromosome lineage E3a was observed only at a frequency of 15.9%. Overall, these results indicate that gene flow from multiple sources and various sex-specific patterns have been important in the formation of the genomic diversity in the Cabo Verde islands.

  8. Mitochondria, oligodendrocytes and inflammation in bipolar disorder: evidence from transcriptome studies points to intriguing parallels with multiple sclerosis

    PubMed Central

    Konradi, Christine; Sillivan, Stephanie E.; Clay, Hayley B.

    2011-01-01

    Gene expression studies of bipolar disorder (BPD) have shown changes in transcriptome profiles in multiple brain regions. Here we summarize the most consistent findings in the scientific literature, and compare them to data from schizophrenia (SZ) and major depressive disorder (MDD). The transcriptome profiles of all three disorders overlap, making the existence of a BPD-specific profile unlikely. Three groups of functionally related genes are consistently expressed at altered levels in BPD, SZ and MDD. Genes involved in energy metabolism and mitochondrial function are downregulated, genes involved in immune response and inflammation are upregulated, and genes expressed in oligodendrocytes are downregulated. Experimental paradigms for multiple sclerosis demonstrate a tight link between energy metabolism, inflammation and demyelination. These studies also show variabilities in the extent of oligodendrocyte stress, which can vary from a downregulation of oligodendrocyte genes, such as observed in psychiatric disorders, to cell death and brain lesions seen in multiple sclerosis. We conclude that experimental models of multiple sclerosis could be of interest for the research of BPD, SZ and MDD. PMID:21310238

  9. FISH Oracle: a web server for flexible visualization of DNA copy number data in a genomic context.

    PubMed

    Mader, Malte; Simon, Ronald; Steinbiss, Sascha; Kurtz, Stefan

    2011-07-28

    The rapidly growing amount of array CGH data requires improved visualization software supporting the process of identifying candidate cancer genes. Optimally, such software should work across multiple microarray platforms, should be able to cope with data from different sources and should be easy to operate. We have developed a web-based software FISH Oracle to visualize data from multiple array CGH experiments in a genomic context. Its fast visualization engine and advanced web and database technology supports highly interactive use. FISH Oracle comes with a convenient data import mechanism, powerful search options for genomic elements (e.g. gene names or karyobands), quick navigation and zooming into interesting regions, and mechanisms to export the visualization into different high quality formats. These features make the software especially suitable for the needs of life scientists. FISH Oracle offers a fast and easy to use visualization tool for array CGH and SNP array data. It allows for the identification of genomic regions representing minimal common changes based on data from one or more experiments. FISH Oracle will be instrumental to identify candidate onco and tumor suppressor genes based on the frequency and genomic position of DNA copy number changes. The FISH Oracle application and an installed demo web server are available at http://www.zbh.uni-hamburg.de/fishoracle.

  10. FISH Oracle: a web server for flexible visualization of DNA copy number data in a genomic context

    PubMed Central

    2011-01-01

    Background: The rapidly growing amount of array CGH data requires improved visualization software supporting the process of identifying candidate cancer genes. Optimally, such software should work across multiple microarray platforms, should be able to cope with data from different sources and should be easy to operate. Results: We have developed a web-based software FISH Oracle to visualize data from multiple array CGH experiments in a genomic context. Its fast visualization engine and advanced web and database technology supports highly interactive use. FISH Oracle comes with a convenient data import mechanism, powerful search options for genomic elements (e.g. gene names or karyobands), quick navigation and zooming into interesting regions, and mechanisms to export the visualization into different high quality formats. These features make the software especially suitable for the needs of life scientists. Conclusions: FISH Oracle offers a fast and easy to use visualization tool for array CGH and SNP array data. It allows for the identification of genomic regions representing minimal common changes based on data from one or more experiments. FISH Oracle will be instrumental to identify candidate onco and tumor suppressor genes based on the frequency and genomic position of DNA copy number changes. The FISH Oracle application and an installed demo web server are available at http://www.zbh.uni-hamburg.de/fishoracle. PMID:21884636

  11. Harnessing Diversity in Wheat to Enhance Grain Yield, Climate Resilience, Disease and Insect Pest Resistance and Nutrition Through Conventional and Modern Breeding Approaches

    PubMed Central

    Mondal, Suchismita; Rutkoski, Jessica E.; Velu, Govindan; Singh, Pawan K.; Crespo-Herrera, Leonardo A.; Guzmán, Carlos; Bhavani, Sridhar; Lan, Caixia; He, Xinyao; Singh, Ravi P.

    2016-01-01

    Current trends in population growth and consumption patterns continue to increase the demand for wheat, a key cereal for global food security. Further, multiple abiotic challenges due to climate change and evolving pathogens and pests pose a major concern for increasing wheat production globally. Triticeae species comprising the primary, secondary, and tertiary gene pools represent a rich source of genetic diversity in wheat. The conventional breeding strategies of direct hybridization, backcrossing and selection have successfully introgressed a number of desirable traits associated with grain yield, adaptation to abiotic stresses, disease resistance, and bio-fortification of wheat varieties. However, it is time consuming to incorporate genes conferring tolerance/resistance to multiple stresses in a single wheat variety by conventional approaches due to limitations in screening methods and the lower probabilities of combining desirable alleles. Efforts on developing innovative breeding strategies, novel tools and utilizing genetic diversity for new genes/alleles are essential to improve productivity, reduce vulnerability to diseases and pests and enhance nutritional quality. New technologies of high-throughput phenotyping, genome sequencing and genomic selection are promising approaches to maximize progeny screening and selection to accelerate the genetic gains in breeding more productive varieties. Use of cisgenic techniques to transfer beneficial alleles and their combinations within related species also offers great promise, especially to achieve durable rust resistance. PMID:27458472

  12. Patterns of evolution at the gametophytic self-incompatibility Sorbus aucuparia (Pyrinae) S pollen genes support the non-self recognition by multiple factors model.

    PubMed

    Aguiar, Bruno; Vieira, Jorge; Cunha, Ana E; Fonseca, Nuno A; Reboiro-Jato, David; Reboiro-Jato, Miguel; Fdez-Riverola, Florentino; Raspé, Olivier; Vieira, Cristina P

    2013-05-01

    S-RNase-based gametophytic self-incompatibility evolved once before the split of the Asteridae and Rosidae. In Prunus (tribe Amygdaloideae of Rosaceae), the self-incompatibility S-pollen is a single F-box gene that presents the expected evolutionary signatures. In Malus and Pyrus (subtribe Pyrinae of Rosaceae), however, clusters of F-box genes (called SFBBs) have been described that are expressed in pollen only and are linked to the S-RNase gene. Although polymorphic, SFBB genes present levels of diversity lower than those of the S-RNase gene. They have been suggested as putative S-pollen genes, in a system of non-self recognition by multiple factors. Subsets of allelic products of the different SFBB genes interact with non-self S-RNases, marking them for degradation, and allowing compatible pollinations. This study performed a detailed characterization of SFBB genes in Sorbus aucuparia (Pyrinae) to address three predictions of the non-self recognition by multiple factors model. As predicted, the number of SFBB genes was large to account for the many S-RNase specificities. Secondly, like the S-RNase gene, the SFBB genes were old. Thirdly, amino acids under positive selection (those that could be involved in specificity determination) were identified when intra-haplotype SFBB genes were analysed using codon models. Overall, the findings reported here support the non-self recognition by multiple factors model.

  13. Multiconstrained gene clustering based on generalized projections

    PubMed Central

    2010-01-01

    Background: Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints for an optimal clustering solution still remains an unsolved problem. Results: We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different natures without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence in more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions: The POCS-based MGC method can successfully combine multiple constraints of different natures for gene clustering. Also, the proposed GLL is an effective performance measure for the soft clustering solutions. PMID:20356386
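
    The projection idea described in this abstract can be illustrated with a minimal sketch of classic alternating projections onto convex sets. This is not the authors' MGC method: the two constraint sets used here (soft cluster memberships that sum to 1 and are nonnegative) and all function names are assumptions chosen only for illustration.

```python
# Illustrative POCS sketch (hypothetical names; not the published MGC algorithm).
# A soft membership vector for one gene is repeatedly projected onto two convex
# sets until it lies (approximately) in their intersection.

def project_sum_to_one(x):
    """Euclidean projection onto the hyperplane {x : sum(x) == 1}."""
    shift = (sum(x) - 1.0) / len(x)
    return [v - shift for v in x]


def project_nonnegative(x):
    """Euclidean projection onto the nonnegative orthant {x : x_i >= 0}."""
    return [max(v, 0.0) for v in x]


def pocs(x, projections, n_iter=200, tol=1e-10):
    """Cycle through the projections until the iterate stops changing."""
    for _ in range(n_iter):
        previous = x
        for project in projections:
            x = project(x)
        if max(abs(a - b) for a, b in zip(x, previous)) < tol:
            break
    return x


# Assumed starting point: raw, inconsistent membership scores for three clusters.
membership = pocs([0.9, 0.6, -0.4], [project_sum_to_one, project_nonnegative])
print(membership)
```

    Each additional constraint would enter as one more projection function in the list, which is the sense in which constraints are combined without altering one another.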

  14. Unity in defence: honeybee workers exhibit conserved molecular responses to diverse pathogens.

    PubMed

    Doublet, Vincent; Poeschl, Yvonne; Gogol-Döring, Andreas; Alaux, Cédric; Annoscia, Desiderato; Aurori, Christian; Barribeau, Seth M; Bedoya-Reina, Oscar C; Brown, Mark J F; Bull, James C; Flenniken, Michelle L; Galbraith, David A; Genersch, Elke; Gisder, Sebastian; Grosse, Ivo; Holt, Holly L; Hultmark, Dan; Lattorff, H Michael G; Le Conte, Yves; Manfredini, Fabio; McMahon, Dino P; Moritz, Robin F A; Nazzi, Francesco; Niño, Elina L; Nowick, Katja; van Rij, Ronald P; Paxton, Robert J; Grozinger, Christina M

    2017-03-02

    Organisms typically face infection by diverse pathogens, and hosts are thought to have developed specific responses to each type of pathogen they encounter. The advent of transcriptomics now makes it possible to test this hypothesis and compare host gene expression responses to multiple pathogens at a genome-wide scale. Here, we performed a meta-analysis of multiple published and new transcriptomes using a newly developed bioinformatics approach that filters genes based on their expression profile across datasets. Thereby, we identified common and unique molecular responses of a model host species, the honey bee (Apis mellifera), to its major pathogens and parasites: the Microsporidia Nosema apis and Nosema ceranae, RNA viruses, and the ectoparasitic mite Varroa destructor, which transmits viruses. We identified a common suite of genes and conserved molecular pathways that respond to all investigated pathogens, a result that suggests a commonality in response mechanisms to diverse pathogens. We found that genes differentially expressed after infection exhibit a higher evolutionary rate than non-differentially expressed genes. Using our new bioinformatics approach, we unveiled additional pathogen-specific responses of honey bees; we found that apoptosis appeared to be an important response following microsporidian infection, while genes from the immune signalling pathways, Toll and Imd, were differentially expressed after Varroa/virus infection. Finally, we applied our bioinformatics approach and generated a gene co-expression network to identify highly connected (hub) genes that may represent important mediators and regulators of anti-pathogen responses. Our meta-analysis generated a comprehensive overview of the host metabolic and other biological processes that mediate interactions between insects and their pathogens. 
    We identified key host genes and pathways that respond to phylogenetically diverse pathogens, representing an important source for future functional studies as well as offering new routes to identify or generate pathogen resilient honey bee stocks. The statistical and bioinformatics approaches that were developed for this study are broadly applicable to synthesize information across transcriptomic datasets. These approaches will likely have utility in addressing a variety of biological questions.

  15. Transcriptional response of Pasteurella multocida to defined iron sources.

    PubMed

    Paustian, Michael L; May, Barbara J; Cao, Dongwei; Boley, Daniel; Kapur, Vivek

    2002-12-01

    Pasteurella multocida was grown in iron-free chemically defined medium supplemented with hemoglobin, transferrin, ferritin, and ferric citrate as iron sources. Whole-genome DNA microarrays were used to monitor global gene expression over seven time points after the addition of the defined iron source to the medium. This resulted in a set of data containing over 338,000 gene expression observations. On average, 12% of P. multocida genes were differentially expressed under any single condition. A majority of these genes encoded P. multocida proteins that were involved in either transport and binding or were annotated as hypothetical proteins. Several trends are evident when the data from different iron sources are compared. In general, only two genes (ptsN and sapD) were expressed at elevated levels under all of the conditions tested. The results also show that genes with increased expression in the presence of hemoglobin did not respond to transferrin or ferritin as an iron source. Correspondingly, genes with increased expression in the transferrin and ferritin experiments were expressed at reduced levels when hemoglobin was supplied as the sole iron source. Finally, the data show that genes that were most responsive to the presence of ferric citrate did not follow a trend similar to that of the other iron sources, suggesting that different pathways respond to inorganic or organic sources of iron in P. multocida. Taken together, our results demonstrate that unique subsets of P. multocida genes are expressed in response to different iron sources and that many of these genes have yet to be functionally characterized.

  16. Amplification of a Gene Related to Mammalian mdr Genes in Drug-Resistant Plasmodium falciparum

    NASA Astrophysics Data System (ADS)

    Wilson, Craig M.; Serrano, Adelfa E.; Wasley, Annemarie; Bogenschutz, Michael P.; Shankar, Anuraj H.; Wirth, Dyann F.

    1989-06-01

    The malaria parasite Plasmodium falciparum contains at least two genes related to the mammalian multiple drug resistance genes, and at least one of the P. falciparum genes is expressed at a higher level and is present in higher copy number in a strain that is resistant to multiple drugs than in a strain that is sensitive to the drugs.

  17. A Convenient Cas9-based Conditional Knockout Strategy for Simultaneously Targeting Multiple Genes in Mouse.

    PubMed

    Chen, Jiang; Du, Yinan; He, Xueyan; Huang, Xingxu; Shi, Yun S

    2017-03-31

    The most powerful way to probe protein function is to characterize the consequence of its deletion. Compared to conventional gene knockout (KO), conditional knockout (cKO) provides an advanced gene targeting strategy with which gene deletion can be performed in a spatially and temporally restricted manner. However, for most species that are amphiploid, the widely used Cre-flox cKO system would need targeting loci in both alleles to be loxP flanked, which, in practice, requires time- and labor-consuming breeding. This burden is especially significant when one is dealing with multiple genes. The CRISPR/Cas9 genome modulation system has the advantage of being able to target multiple sites simultaneously. Here we propose a strategy that could achieve conditional KO of multiple genes in mouse with Cre recombinase dependent Cas9 expression. By transgenic construction of loxP-stop-loxP (LSL) controlled Cas9 (LSL-Cas9) together with sgRNAs targeting EGFP, we showed that the fluorescence molecule could be eliminated in a Cre-dependent manner. We further verified the efficacy of this novel strategy to target multiple sites by deleting c-Maf and MafB simultaneously and specifically in macrophages. Compared to the traditional Cre-flox cKO strategy, this sgRNAs-LSL-Cas9 cKO system is simpler and faster, and would make conditional manipulation of multiple genes feasible.

  18. Comparative genomic and plasmid analysis of beer-spoiling and non-beer-spoiling Lactobacillus brevis isolates.

    PubMed

    Bergsveinson, Jordyn; Ziola, Barry

    2017-12-01

    Beer-spoilage-related lactic acid bacteria (BSR LAB) belong to multiple genera and species; however, beer-spoilage capacity is isolate-specific and partially acquired via horizontal gene transfer within the brewing environment. Thus, the extent to which genus-, species-, or environment- (i.e., brewery-) level genetic variability influences beer-spoilage phenotype is unknown. Publicly available Lactobacillus brevis genomes were analyzed via BlAst Diagnostic Gene findEr (BADGE) for BSR genes and assessed for pangenomic relationships. Also analyzed were functional coding capacities of plasmids of LAB inhabiting extreme niche environments. Considerable genetic variation was observed in L. brevis isolated from clinical samples, whereas 16 candidate genes distinguish BSR and non-BSR L. brevis genomes. These genes are related to nutrient scavenging of gluconate or pentoses, mannose, and metabolism of pectin. BSR L. brevis isolates also have higher average nucleotide identity and stronger pangenome association with one another, though isolation source (i.e., specific brewery) also appears to influence the plasmid coding capacity of BSR LAB. Finally, it is shown that niche-specific adaptation and phenotype are plasmid-encoded for both BSR and non-BSR LAB. The ultimate combination of plasmid-encoded genes dictates the ability of L. brevis to survive in the most extreme beer environment, namely, gassed (i.e., pressurized) beer.

  19. Stochastic models for inferring genetic regulation from microarray gene expression data.

    PubMed

    Tian, Tianhai

    2010-03-01

    Microarray expression profiles are inherently noisy and many different sources of variation exist in microarray experiments. It is still a significant challenge to develop stochastic models to realize noise in microarray expression profiles, which has profound influence on the reverse engineering of genetic regulation. Using the target genes of the tumour suppressor gene p53 as the test problem, we developed stochastic differential equation models and established the relationship between the noise strength of stochastic models and parameters of an error model for describing the distribution of the microarray measurements. Numerical results indicate that the simulated variance from stochastic models with a stochastic degradation process can be represented by a monomial in terms of the hybridization intensity and the order of the monomial depends on the type of stochastic process. The developed stochastic models with multiple stochastic processes generated simulations whose variance is consistent with the prediction of the error model. This work also established a general method to develop stochastic models from experimental information. © 2009 Elsevier Ireland Ltd. All rights reserved.
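
    As a rough illustration of what a stochastic differential equation model with a noisy degradation term can look like, the sketch below integrates dX = (k_syn - k_deg X) dt - sigma X dW with the Euler-Maruyama scheme. The model form, the parameter names, and the clamping of expression at zero are all assumptions made for this example; the abstract does not specify the authors' equations.

```python
import math
import random

# Hypothetical one-gene model (not taken from the paper):
#   dX = (k_syn - k_deg * X) dt - sigma * X dW
# i.e. constant synthesis, first-order degradation, and multiplicative noise
# acting through the degradation term.

def simulate_expression(k_syn, k_deg, sigma, x0, t_end, n_steps, seed=0):
    """Return one Euler-Maruyama sample path as a list of n_steps + 1 values."""
    rng = random.Random(seed)          # seeded for reproducible paths
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment ~ N(0, dt)
        x = x + (k_syn - k_deg * x) * dt - sigma * x * dw
        x = max(x, 0.0)                # assumption: expression cannot go negative
        path.append(x)
    return path


# With sigma = 0 the path relaxes to the deterministic steady state k_syn / k_deg.
noisy_path = simulate_expression(k_syn=2.0, k_deg=0.5, sigma=0.2,
                                 x0=0.0, t_end=40.0, n_steps=4000)
print(noisy_path[-1])
```

    Repeating the simulation over many seeds and comparing the sample variance with the mean level is one way to examine how variance scales with intensity under such a model.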

  20. Metatranscriptomic analysis of a high-sulfide aquatic spring reveals insights into sulfur cycling and unexpected aerobic metabolism

    PubMed Central

    Elshahed, Mostafa S.; Najar, Fares Z.; Krumholz, Lee R.

    2015-01-01

    Zodletone spring is a sulfide-rich spring in southwestern Oklahoma characterized by shallow, microoxic, light-exposed spring water overlaying anoxic sediments. Previously, culture-independent 16S rRNA gene based diversity surveys have revealed that Zodletone spring source sediments harbor a highly diverse microbial community, with multiple lineages putatively involved in various sulfur-cycling processes. Here, we conducted a metatranscriptomic survey of microbial populations in Zodletone spring source sediments to characterize the relative prevalence and importance of putative phototrophic, chemolithotrophic, and heterotrophic microorganisms in the sulfur cycle, the identity of lineages actively involved in various sulfur cycling processes, and the interaction between sulfur cycling and other geochemical processes at the spring source. Sediment samples at the spring’s source were taken at three different times within a 24-h period for geochemical analyses and RNA sequencing. In depth mining of datasets for sulfur cycling transcripts revealed major sulfur cycling pathways and taxa involved, including an unexpected potential role of Actinobacteria in sulfide oxidation and thiosulfate transformation. Surprisingly, transcripts coding for the cyanobacterial Photosystem II D1 protein, methane monooxygenase, and terminal cytochrome oxidases were encountered, indicating that genes for oxygen production and aerobic modes of metabolism are actively being transcribed, despite below-detectable levels (<1 µM) of oxygen in source sediment. Results highlight transcripts involved in sulfur, methane, and oxygen cycles, propose that oxygenic photosynthesis could support aerobic methane and sulfide oxidation in anoxic sediments exposed to sunlight, and provide a viewpoint of microbial metabolic lifestyles under conditions similar to those seen during late Archaean and Proterozoic eons. PMID:26417542

  1. Metatranscriptomic analysis of a high-sulfide aquatic spring reveals insights into sulfur cycling and unexpected aerobic metabolism.

    PubMed

    Spain, Anne M; Elshahed, Mostafa S; Najar, Fares Z; Krumholz, Lee R

    2015-01-01

    Zodletone spring is a sulfide-rich spring in southwestern Oklahoma characterized by shallow, microoxic, light-exposed spring water overlaying anoxic sediments. Previously, culture-independent 16S rRNA gene based diversity surveys have revealed that Zodletone spring source sediments harbor a highly diverse microbial community, with multiple lineages putatively involved in various sulfur-cycling processes. Here, we conducted a metatranscriptomic survey of microbial populations in Zodletone spring source sediments to characterize the relative prevalence and importance of putative phototrophic, chemolithotrophic, and heterotrophic microorganisms in the sulfur cycle, the identity of lineages actively involved in various sulfur cycling processes, and the interaction between sulfur cycling and other geochemical processes at the spring source. Sediment samples at the spring's source were taken at three different times within a 24-h period for geochemical analyses and RNA sequencing. In depth mining of datasets for sulfur cycling transcripts revealed major sulfur cycling pathways and taxa involved, including an unexpected potential role of Actinobacteria in sulfide oxidation and thiosulfate transformation. Surprisingly, transcripts coding for the cyanobacterial Photosystem II D1 protein, methane monooxygenase, and terminal cytochrome oxidases were encountered, indicating that genes for oxygen production and aerobic modes of metabolism are actively being transcribed, despite below-detectable levels (<1 µM) of oxygen in source sediment. Results highlight transcripts involved in sulfur, methane, and oxygen cycles, propose that oxygenic photosynthesis could support aerobic methane and sulfide oxidation in anoxic sediments exposed to sunlight, and provide a viewpoint of microbial metabolic lifestyles under conditions similar to those seen during late Archaean and Proterozoic eons.

  2. Gene Presence-Absence Polymorphism in Castrating Anther-Smut Fungi: Recent Gene Gains and Phylogeographic Structure.

    PubMed

    Hartmann, Fanny E; Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana

    2018-04-01

    Gene presence-absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence-absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence-absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence-absence polymorphism in the two species. Genes displaying presence-absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence-absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence-absence polymorphism. Some of the observed gene loss and gain events may however be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles in plant and anther-smut fungi interactions of candidate genes however need to be experimentally tested in future studies.

  3. Gene Presence–Absence Polymorphism in Castrating Anther-Smut Fungi: Recent Gene Gains and Phylogeographic Structure

    PubMed Central

    Rodríguez de la Vega, Ricardo C; Brandenburg, Jean-Tristan; Carpentier, Fantin; Giraud, Tatiana

    2018-01-01

    Gene presence–absence polymorphisms segregating within species are a significant source of genetic variation but have been little investigated to date in natural populations. In plant pathogens, the gain or loss of genes encoding proteins interacting directly with the host, such as secreted proteins, probably plays an important role in coevolution and local adaptation. We investigated gene presence–absence polymorphism in populations of two closely related species of castrating anther-smut fungi, Microbotryum lychnidis-dioicae (MvSl) and M. silenes-dioicae (MvSd), from across Europe, on the basis of Illumina genome sequencing data and high-quality genome references. We observed presence–absence polymorphism for 186 autosomal genes (2% of all genes) in MvSl, and only 51 autosomal genes in MvSd. Distinct genes displayed presence–absence polymorphism in the two species. Genes displaying presence–absence polymorphism were frequently located in subtelomeric and centromeric regions and close to repetitive elements, and comparison with outgroups indicated that most were present in a single species, being recently acquired through duplications in multiple-gene families. Gene presence–absence polymorphism in MvSl showed a phylogeographic structure corresponding to clusters detected based on SNPs. In addition, gene absence alleles were rare within species and skewed toward low-frequency variants. These findings are consistent with a deleterious or neutral effect for most gene presence–absence polymorphism. Some of the observed gene loss and gain events may however be adaptive, as suggested by the putative functions of the corresponding encoded proteins (e.g., secreted proteins) or their localization within previously identified selective sweeps. The adaptive roles in plant and anther-smut fungi interactions of candidate genes however need to be experimentally tested in future studies. PMID:29722826

  4. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools

    PubMed Central

    2012-01-01

    Background: MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets. Results: BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 µM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human orthologs. Functional analysis of differentially regulated (p<0.05) gene targets in BRM indicates that nicotine exposure disrupts genes involved in neurogenesis, possibly through misregulation of nicotine-sensitive miRNAs. Conclusions: BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis generation tool for systems biology. The miRNA workflow in BRM allows for efficient processing of multiple miRNA and mRNA datasets in a single software environment with the added capability to interact with public data sources and visual analytic tools for HTP data analysis at a systems level. BRM is developed using Java™ and other open-source technologies for free distribution (http://www.sysbio.org/dataresources/brm.stm). PMID:23174015

  5. MetaSeq: privacy preserving meta-analysis of sequencing-based association studies.

    PubMed

    Singh, Angad Pal; Zafer, Samreen; Pe'er, Itsik

    2013-01-01

    Human genetics recently transitioned from GWAS to studies based on NGS data. For GWAS, small effects dictated large sample sizes, typically made possible through meta-analysis by exchanging summary statistics across consortia. NGS studies groupwise-test for association of multiple potentially-causal alleles along each gene. They are subject to similar power constraints and therefore likely to resort to meta-analysis as well. The problem arises when considering privacy of the genetic information during the data-exchange process. Many scoring schemes for NGS association rely on the frequency of each variant, thus requiring exchange of the identity of each sequenced variant. Because such variants are often rare, this exchange can reveal the identity of their carriers and jeopardize privacy. We have thus developed MetaSeq, a protocol for meta-analysis of genome-wide sequencing data by multiple collaborating parties, scoring association for rare variants pooled per gene across all parties. We tackle the challenge of tallying frequency counts of rare, sequenced alleles for meta-analysis of sequencing data without disclosing the allele identity and counts, thereby protecting sample identity. This apparently paradoxical exchange of information is achieved through cryptographic means. The key idea is that parties encrypt the identity of genes and variants. When they transfer information about frequency counts in cases and controls, the exchanged data does not convey the identity of a mutation and therefore does not expose carrier identity. The exchange relies on a third party, trusted to follow the protocol although not trusted to learn about the raw data. We show applicability of this method to publicly available exome-sequencing data from multiple studies, simulating phenotypic information for powerful meta-analysis. The MetaSeq software is publicly available as open source.

  6. Human genetics of infectious diseases: a unified theory

    PubMed Central

    Casanova, Jean-Laurent; Abel, Laurent

    2007-01-01

    Since the early 1950s, the dominant paradigm in the human genetics of infectious diseases postulates that rare monogenic immunodeficiencies confer vulnerability to multiple infectious diseases (one gene, multiple infections), whereas common infections are associated with the polygenic inheritance of multiple susceptibility genes (one infection, multiple genes). Recent studies, since 1996 in particular, have challenged this view. A newly recognised group of primary immunodeficiencies predisposing the individual to a principal or single type of infection is emerging. In parallel, several common infections have been shown to reflect the inheritance of one major susceptibility gene, at least in some populations. This novel causal relationship (one gene, one infection) blurs the distinction between patient-based Mendelian genetics and population-based complex genetics, and provides a unified conceptual frame for exploring the molecular genetic basis of infectious diseases in humans. PMID:17255931

  7. Anti-inflammatory genes associated with multiple sclerosis: a gene expression study.

    PubMed

    Perga, S; Montarolo, F; Martire, S; Berchialla, P; Malucchi, S; Bertolotto, A

    2015-02-15

    Multiple sclerosis (MS) is an autoimmune inflammatory disease of the central nervous system caused by a complex interaction between multiple genes and environmental factors. The HLA region is the strongest susceptibility locus, but recent large genome-wide association studies have identified new susceptibility genes. Among these, BACH2, PTGER4, RGS1 and ZFP36L1 were highlighted. Here, a gene expression analysis revealed that three of them, namely BACH2, PTGER4 and ZFP36L1, are down-regulated in MS patients' blood cells compared to healthy subjects. Interestingly, all these genes are involved in immune system regulation, with a predominantly anti-inflammatory role, and their reduced expression could predispose to MS development. Copyright © 2015 Elsevier B.V. All rights reserved.

  8. L-glutamine Induces Expression of Listeria monocytogenes Virulence Genes

    PubMed Central

    Lobel, Lior; Burg-Golani, Tamar; Sigal, Nadejda; Rose, Jessica; Livnat-Levanon, Nurit; Lewinson, Oded; Herskovits, Anat A.

    2017-01-01

    The high environmental adaptability of bacteria is contingent upon their ability to sense changes in their surroundings. Bacterial pathogen entry into host poses an abrupt and dramatic environmental change, during which successful pathogens gauge multiple parameters that signal host localization. The facultative human pathogen Listeria monocytogenes flourishes in soil, water and food, and in ~50 different animals, and serves as a model for intracellular infection. L. monocytogenes identifies host entry by sensing both physical (e.g., temperature) and chemical (e.g., metabolite concentrations) factors. We report here that L-glutamine, an abundant nitrogen source in host serum and cells, serves as an environmental indicator and inducer of virulence gene expression. In contrast, ammonia, which is the most abundant nitrogen source in soil and water, fully supports growth, but fails to activate virulence gene transcription. We demonstrate that induction of virulence genes only occurs when the Listerial intracellular concentration of L-glutamine crosses a certain threshold, acting as an on/off switch: off when L-glutamine concentrations are below the threshold, and fully on when the threshold is crossed. To turn on the switch, L-glutamine must be present, and the L-glutamine high affinity ABC transporter, GlnPQ, must be active. Inactivation of GlnPQ led to complete arrest of L-glutamine uptake, reduced type I interferon response in infected macrophages, dramatic reduction in expression of virulence genes, and attenuated virulence in a mouse infection model. These results may explain observations made with other pathogens correlating nitrogen metabolism and virulence, and suggest that gauging of L-glutamine as a means of ascertaining host localization may be a general mechanism. PMID:28114430

  9. One-step generation of complete gene knockout mice and monkeys by CRISPR/Cas9-mediated gene editing with multiple sgRNAs.

    PubMed

    Zuo, Erwei; Cai, Yi-Jun; Li, Kui; Wei, Yu; Wang, Bang-An; Sun, Yidi; Liu, Zhen; Liu, Jiwei; Hu, Xinde; Wei, Wei; Huo, Xiaona; Shi, Linyu; Tang, Cheng; Liang, Dan; Wang, Yan; Nie, Yan-Hong; Zhang, Chen-Chen; Yao, Xuan; Wang, Xing; Zhou, Changyang; Ying, Wenqin; Wang, Qifang; Chen, Ren-Chao; Shen, Qi; Xu, Guo-Liang; Li, Jinsong; Sun, Qiang; Xiong, Zhi-Qi; Yang, Hui

    2017-07-01

    The CRISPR/Cas9 system is an efficient gene-editing method, but the majority of gene-edited animals showed mosaicism, with editing occurring only in a portion of cells. Here we show that single gene or multiple genes can be completely knocked out in mouse and monkey embryos by zygotic injection of Cas9 mRNA and multiple adjacent single-guide RNAs (spaced 10-200 bp apart) that target only a single key exon of each gene. Phenotypic analysis of F0 mice following targeted deletion of eight genes on the Y chromosome individually demonstrated the robustness of this approach in generating knockout mice. Importantly, this approach delivers complete gene knockout at high efficiencies (100% on Arntl and 91% on Prrt2) in monkey embryos. Finally, we could generate a complete Prrt2 knockout monkey in a single step, demonstrating the usefulness of this approach in rapidly establishing gene-edited monkey models.

  10. Detection of multiple perturbations in multi-omics biological networks.

    PubMed

    Griffin, Paula J; Zhang, Yuqing; Johnson, William Evan; Kolaczyk, Eric D

    2018-05-17

    Cellular mechanism-of-action is of fundamental concern in many biological studies. It is of particular interest for identifying the cause of disease and learning the way in which treatments act against disease. However, pinpointing such mechanisms is difficult, due to the fact that small perturbations to the cell can have wide-ranging downstream effects. Given a snapshot of cellular activity, it can be challenging to tell where a disturbance originated. The presence of an ever-greater variety of high-throughput biological data offers an opportunity to examine cellular behavior from multiple angles, but also presents the statistical challenge of how to effectively analyze data from multiple sources. In this setting, we propose a method for mechanism-of-action inference by extending network filtering to multi-attribute data. We first estimate a joint Gaussian graphical model across multiple data types using penalized regression and filter for network effects. We then apply a set of likelihood ratio tests to identify the most likely site of the original perturbation. In addition, we propose a conditional testing procedure to allow for detection of multiple perturbations. We demonstrate this methodology on paired gene expression and methylation data from The Cancer Genome Atlas (TCGA). © 2018, The International Biometric Society.

  11. Automated Comparative Auditing of NCIT Genomic Roles Using NCBI

    PubMed Central

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-01-01

    Biomedical research has identified many human genes and accumulated a variety of knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT’s Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information’s (NCBI’s) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes play a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance. PMID:18486558

  12. Toolbox Approaches Using Molecular Markers and 16S rRNA Gene Amplicon Data Sets for Identification of Fecal Pollution in Surface Water.

    PubMed

    Ahmed, W; Staley, C; Sadowsky, M J; Gyawali, P; Sidhu, J P S; Palmer, A; Beale, D J; Toze, S

    2015-10-01

    In this study, host-associated molecular markers and bacterial 16S rRNA gene community analysis using high-throughput sequencing were used to identify the sources of fecal pollution in environmental waters in Brisbane, Australia. A total of 92 fecal and composite wastewater samples were collected from different host groups (cat, cattle, dog, horse, human, and kangaroo), and 18 water samples were collected from six sites (BR1 to BR6) along the Brisbane River in Queensland, Australia. Bacterial communities in the fecal, wastewater, and river water samples were sequenced. Water samples were also tested for the presence of bird-associated (GFD), cattle-associated (CowM3), horse-associated, and human-associated (HF183) molecular markers, to provide multiple lines of evidence regarding the possible presence of fecal pollution associated with specific hosts. Among the 18 water samples tested, 83%, 33%, 17%, and 17% were real-time PCR positive for the GFD, HF183, CowM3, and horse markers, respectively. Among the potential sources of fecal pollution in water samples from the river, DNA sequencing tended to show relatively small contributions from wastewater treatment plants (up to 13% of sequence reads). Contributions from other animal sources were rarely detected and were very small (<3% of sequence reads). Source contributions determined via sequence analysis versus detection of molecular markers showed variable agreement. A lack of relationships among fecal indicator bacteria, host-associated molecular markers, and 16S rRNA gene community analysis data was also observed. Nonetheless, we show that bacterial community and host-associated molecular marker analyses can be combined to identify potential sources of fecal pollution in an urban river. This study is a proof of concept, and based on the results, we recommend using bacterial community analysis (where possible) along with PCR detection or quantification of host-associated molecular markers to provide information on the sources of fecal pollution in waterways. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  13. paraGSEA: a scalable approach for large-scale gene expression profiling

    PubMed Central

    Peng, Shaoliang; Yang, Shunyun

    2017-01-01

    More studies have been conducted using gene expression similarity to identify functional connections among genes, diseases and drugs. Gene Set Enrichment Analysis (GSEA) is a powerful analytical method for interpreting gene expression data. However, due to its enormous computational overhead in the significance-level estimation and multiple hypothesis testing steps, the computation scalability and efficiency are poor on large-scale datasets. We proposed paraGSEA for efficient large-scale transcriptome data analysis. By optimization, the overall time complexity of paraGSEA is reduced from O(mn) to O(m+n), where m is the length of the gene sets and n is the length of the gene expression profiles, which contributes more than 100-fold increase in performance compared with other popular GSEA implementations such as GSEA-P, SAM-GS and GSEA2. By further parallelization, a near-linear speed-up is gained on both workstations and clusters in an efficient manner with high scalability and performance on large-scale datasets. The analysis time of the whole LINCS phase I dataset (GSE92742) was reduced to nearly half an hour on a 1000-node cluster on Tianhe-2, or within 120 hours on a 96-core workstation. The source code of paraGSEA is licensed under the GPLv3 and available at http://github.com/ysycloud/paraGSEA. PMID:28973463

  14. SimPhy: Phylogenomic Simulation of Gene, Locus, and Species Trees

    PubMed Central

    Mallo, Diego; De Oliveira Martins, Leonardo; Posada, David

    2016-01-01

    We present a fast and flexible software package—SimPhy—for the simulation of multiple gene families evolving under incomplete lineage sorting, gene duplication and loss, horizontal gene transfer—all three potentially leading to species tree/gene tree discordance—and gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus, and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific), that can be fixed or be sampled from a priori statistical distributions. SimPhy also incorporates comprehensive models of substitution rate variation among lineages (uncorrelated relaxed clocks) and the capability of simulating partitioned nucleotide, codon, and protein multilocus sequence alignments under a plethora of substitution models using the program INDELible. We validate SimPhy's output using theoretical expectations and other programs, and show that it scales extremely well with complex models and/or large trees, being an order of magnitude faster than the most similar program (DLCoal-Sim). In addition, we demonstrate how SimPhy can be useful to understand interactions among different evolutionary processes, conducting a simulation study to characterize the systematic overestimation of the duplication time when using standard reconciliation methods. SimPhy is available at https://github.com/adamallo/SimPhy, where users can find the source code, precompiled executables, a detailed manual and example cases. PMID:26526427

  15. Screening of differentially expressed genes between multiple trauma patients with and without sepsis.

    PubMed

    Ji, S C; Pan, Y T; Lu, Q Y; Sun, Z Y; Liu, Y Z

    2014-03-17

    The purpose of this study was to identify critical genes associated with septic multiple trauma by comparing peripheral whole blood samples from multiple trauma patients with and without sepsis. A microarray data set was downloaded from the Gene Expression Omnibus (GEO) database. This data set included 70 samples, 36 from multiple trauma patients with sepsis and 34 from multiple trauma patients without sepsis (as a control set). The data were preprocessed, and differentially expressed genes (DEGs) were then screened using R packages. Functional analysis of DEGs was performed with DAVID. Interaction networks were then established for the most up- and down-regulated genes using HitPredict. Pathway-enrichment analysis was conducted for genes in the networks using WebGestalt. Fifty-eight DEGs were identified. The expression levels of PLAU (down-regulated) and MMP8 (up-regulated) presented the largest fold-changes, and interaction networks were established for these genes. Further analysis revealed that PLAT (plasminogen activator, tissue) and SERPINF2 (serpin peptidase inhibitor, clade F, member 2), which interact with PLAU, play important roles in the complement and coagulation cascades pathway. We hypothesize that PLAU is a major regulator of the complement and coagulation cascades, and down-regulation of PLAU results in dysfunction of the pathway, causing sepsis.

  16. Patterns of evolution at the gametophytic self-incompatibility Sorbus aucuparia (Pyrinae) S pollen genes support the non-self recognition by multiple factors model

    PubMed Central

    Aguiar, Bruno; Vieira, Jorge; Cunha, Ana E.; Fonseca, Nuno A.; Reboiro-Jato, David; Reboiro-Jato, Miguel; Fdez-Riverola, Florentino; Raspé, Olivier; Vieira, Cristina P.

    2013-01-01

    S-RNase-based gametophytic self-incompatibility evolved once before the split of the Asteridae and Rosidae. In Prunus (tribe Amygdaloideae of Rosaceae), the self-incompatibility S-pollen is a single F-box gene that presents the expected evolutionary signatures. In Malus and Pyrus (subtribe Pyrinae of Rosaceae), however, clusters of F-box genes (called SFBBs) have been described that are expressed in pollen only and are linked to the S-RNase gene. Although polymorphic, SFBB genes present levels of diversity lower than those of the S-RNase gene. They have been suggested as putative S-pollen genes, in a system of non-self recognition by multiple factors. Subsets of allelic products of the different SFBB genes interact with non-self S-RNases, marking them for degradation, and allowing compatible pollinations. This study performed a detailed characterization of SFBB genes in Sorbus aucuparia (Pyrinae) to address three predictions of the non-self recognition by multiple factors model. As predicted, the number of SFBB genes was large to account for the many S-RNase specificities. Secondly, like the S-RNase gene, the SFBB genes were old. Thirdly, amino acids under positive selection—those that could be involved in specificity determination—were identified when intra-haplotype SFBB genes were analysed using codon models. Overall, the findings reported here support the non-self recognition by multiple factors model. PMID:23606363

  17. A multicolor panel of TALE-KRAB based transcriptional repressor vectors enabling knockdown of multiple gene targets

    PubMed Central

    Zhang, Zhonghui; Wu, Elise; Qian, Zhijian; Wu, Wen-Shu

    2014-01-01

    Stable and efficient knockdown of multiple gene targets is highly desirable for dissection of molecular pathways. Because it allows sequence-specific DNA binding, transcription activator-like effector (TALE) offers a new genetic perturbation technique that allows for gene-specific repression. Here, we constructed a multicolor lentiviral TALE-Kruppel-associated box (KRAB) expression vector platform that enables knockdown of multiple gene targets. This platform is fully compatible with the Golden Gate TALEN and TAL Effector Kit 2.0, a widely used and efficient method for TALE assembly. We showed that this multicolor TALE-KRAB vector system when combined together with bone marrow transplantation could quickly knock down c-kit and PU.1 genes in hematopoietic stem and progenitor cells of recipient mice. Furthermore, our data demonstrated that this platform simultaneously knocked down both c-Kit and PU.1 genes in the same primary cell populations. Together, our results suggest that this multicolor TALE-KRAB vector platform is a promising and versatile tool for knockdown of multiple gene targets and could greatly facilitate dissection of molecular pathways. PMID:25475013

  18. A multicolor panel of TALE-KRAB based transcriptional repressor vectors enabling knockdown of multiple gene targets.

    PubMed

    Zhang, Zhonghui; Wu, Elise; Qian, Zhijian; Wu, Wen-Shu

    2014-12-05

    Stable and efficient knockdown of multiple gene targets is highly desirable for dissection of molecular pathways. Because it allows sequence-specific DNA binding, transcription activator-like effector (TALE) offers a new genetic perturbation technique that allows for gene-specific repression. Here, we constructed a multicolor lentiviral TALE-Kruppel-associated box (KRAB) expression vector platform that enables knockdown of multiple gene targets. This platform is fully compatible with the Golden Gate TALEN and TAL Effector Kit 2.0, a widely used and efficient method for TALE assembly. We showed that this multicolor TALE-KRAB vector system when combined together with bone marrow transplantation could quickly knock down c-kit and PU.1 genes in hematopoietic stem and progenitor cells of recipient mice. Furthermore, our data demonstrated that this platform simultaneously knocked down both c-Kit and PU.1 genes in the same primary cell populations. Together, our results suggest that this multicolor TALE-KRAB vector platform is a promising and versatile tool for knockdown of multiple gene targets and could greatly facilitate dissection of molecular pathways.

  19. Effects of multiple founder populations on spatial genetic structure of reintroduced American martens.

    PubMed

    Williams, Bronwyn W; Scribner, Kim T

    2010-01-01

    Reintroductions and translocations are increasingly used to repatriate or increase probabilities of persistence for animal and plant species. Genetic and demographic characteristics of founding individuals and suitability of habitat at release sites are commonly believed to affect the success of these conservation programs. Genetic divergence among multiple source populations of American martens (Martes americana) and well documented introduction histories permitted analyses of post-introduction dispersion from release sites and development of genetic clusters in the Upper Peninsula (UP) of Michigan <50 years following release. Location and size of spatial genetic clusters and measures of individual-based autocorrelation were inferred using 11 microsatellite loci. We identified three genetic clusters in geographic proximity to original release locations. Estimated distances of effective gene flow based on spatial autocorrelation varied greatly among genetic clusters (30-90 km). Spatial contiguity of genetic clusters has been largely maintained with evidence for admixture primarily in localized regions, suggesting recent contact or locally retarded rates of gene flow. Data provide guidance for future studies of the effects of permeabilities of different land-cover and land-use features to dispersal and of other biotic and environmental factors that may contribute to the colonization process and development of spatial genetic associations.

  20. A reproducible approach to high-throughput biological data acquisition and integration

    PubMed Central

    Rahnavard, Gholamali; Waldron, Levi; McIver, Lauren; Shafquat, Afrah; Franzosa, Eric A.; Miropolsky, Larissa; Sweeney, Christopher

    2015-01-01

    Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker discovery and for computational inference of biomolecular mechanisms, identifying, acquiring, and integrating relevant experimental results from multiple sources for a given study can be time-consuming and error-prone. To enable efficient and reproducible integration of diverse experimental results, we developed a novel approach for standardized acquisition and analysis of high-throughput and heterogeneous biological data. This allowed, first, novel biomolecular network reconstruction in human prostate cancer, which correctly recovered and extended the NFκB signaling pathway. Next, we investigated host-microbiome interactions. In less than an hour of analysis time, the system retrieved data and integrated six germ-free murine intestinal gene expression datasets to identify the genes most influenced by the gut microbiota, which comprised a set of immune-response and carbohydrate metabolism processes. Finally, we constructed integrated functional interaction networks to compare connectivity of peptide secretion pathways in the model organisms Escherichia coli, Bacillus subtilis, and Pseudomonas aeruginosa. PMID:26157642

  1. EBIC: an evolutionary-based parallel biclustering algorithm for pattern discovery.

    PubMed

    Orzechowski, Patryk; Sipper, Moshe; Huang, Xiuzhen; Moore, Jason H

    2018-05-22

    Biclustering algorithms are commonly used for gene expression data analysis. However, accurate identification of meaningful structures is very challenging and state-of-the-art methods are incapable of discovering with high accuracy different patterns of high biological relevance. In this paper a novel biclustering algorithm based on evolutionary computation, a subfield of artificial intelligence (AI), is introduced. The method called EBIC aims to detect order-preserving patterns in complex data. EBIC is capable of discovering multiple complex patterns with unprecedented accuracy in real gene expression datasets. It is also one of the very few biclustering methods designed for parallel environments with multiple graphics processing units (GPUs). We demonstrate that EBIC greatly outperforms state-of-the-art biclustering methods, in terms of recovery and relevance, on both synthetic and genetic datasets. EBIC also yields results over 12 times faster than the most accurate reference algorithms. EBIC source code is available on GitHub at https://github.com/EpistasisLab/ebic.

  2. The maize (Zea mays ssp. mays var. B73) genome encodes 33 members of the purple acid phosphatase family

    PubMed Central

    González-Muñoz, Eliécer; Avendaño-Vázquez, Aida-Odette; Montes, Ricardo A. Chávez; de Folter, Stefan; Andrés-Hernández, Liliana; Abreu-Goodger, Cei; Sawers, Ruairidh J. H.

    2015-01-01

    Purple acid phosphatases (PAPs) play an important role in plant phosphorus nutrition, both by liberating phosphorus from organic sources in the soil and by modulating distribution within the plant throughout growth and development. Furthermore, members of the PAP protein family have been implicated in a broader role in plant mineral homeostasis, stress responses and development. We have identified 33 candidate PAP encoding gene models in the maize (Zea mays ssp. mays var. B73) reference genome. The maize Pap family includes a clear single-copy ortholog of the Arabidopsis gene AtPAP26, shown previously to encode both major intracellular and secreted acid phosphatase activities. Certain groups of PAPs present in Arabidopsis, however, are absent in maize, while the maize family contains a number of expansions, including a distinct radiation not present in Arabidopsis. Analysis of RNA-sequencing based transcriptome data revealed accumulation of maize Pap transcripts in multiple plant tissues at multiple stages of development, and increased accumulation of specific transcripts under low phosphorus availability. These data suggest the maize PAP family as a whole to have broad significance throughout the plant life cycle, while highlighting potential functional specialization of individual family members. PMID:26042133

  3. BIG: a large-scale data integration tool for renal physiology.

    PubMed

    Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A

    2016-10-01

    Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.

  4. Comprehensive Evaluation of the Contribution of X Chromosome Genes to Platinum Sensitivity

    PubMed Central

    Gamazon, Eric R.; Im, Hae Kyung; O’Donnell, Peter H.; Ziliak, Dana; Stark, Amy L.; Cox, Nancy J.; Dolan, M. Eileen; Huang, Rong Stephanie

    2011-01-01

    Utilizing a genome-wide gene expression dataset generated from Affymetrix GeneChip® Human Exon 1.0ST array, we comprehensively surveyed the role of 322 X chromosome gene expression traits on cellular sensitivity to cisplatin and carboplatin. We identified 31 and 17 X chromosome genes whose expression levels are significantly correlated (after multiple testing correction) with sensitivity to carboplatin and cisplatin, respectively, in the combined HapMap CEU and YRI populations (false discovery rate, FDR<0.05). Of those, 14 overlap for both cisplatin and carboplatin. Employing an independent gene expression quantification method, the Illumina Sentrix Human-6 Expression BeadChip, measured on the same HapMap cell lines, we found that 4 and 2 of these genes are significantly associated with carboplatin and cisplatin sensitivity respectively in both analyses. Two genes, CTPS2 and DLG3, were identified by both genome-wide gene expression analyses as correlated with cellular sensitivity to both platinating agents. The expression of DLG3 gene was also found to correlate with cellular sensitivity to platinating agents in NCI60 cancer cell lines. In addition, we evaluated the role of X chromosome gene expression to the observed differences in sensitivity to the platinums between CEU and YRI derived cell lines. Of the 34 distinct genes significantly correlated with either carboplatin or cisplatin sensitivity, 14 are differentially expressed (defined as p<0.05) between CEU and YRI. Thus, sex chromosome genes play a role in cellular sensitivity to platinating agents and differences in the expression level of these genes are an important source of variation that should be included in comprehensive pharmacogenomic studies. PMID:21252287
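
    The abstract reports genes as significant at FDR<0.05 after multiple testing correction but does not name the procedure. As an illustration only, the sketch below implements the widely used Benjamini-Hochberg step-up rule; that choice, the function name, and the p-values are assumptions, not details from the study.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Every hypothesis ranked at or below the cutoff is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for eight gene expression traits.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p))
```

    With these eight values only the first two pass, even though five are below 0.05 individually, which shows why the corrected counts in a study like this are much smaller than the uncorrected ones.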

  5. In vivo simultaneous transcriptional activation of multiple genes in the brain using CRISPR-dCas9-activator transgenic mice.

    PubMed

    Zhou, Haibo; Liu, Junlai; Zhou, Changyang; Gao, Ni; Rao, Zhiping; Li, He; Hu, Xinde; Li, Changlin; Yao, Xuan; Shen, Xiaowen; Sun, Yidi; Wei, Yu; Liu, Fei; Ying, Wenqin; Zhang, Junming; Tang, Cheng; Zhang, Xu; Xu, Huatai; Shi, Linyu; Cheng, Leping; Huang, Pengyu; Yang, Hui

    2018-03-01

    Despite rapid progress in the genome-editing field, in vivo simultaneous overexpression of multiple genes remains challenging. We generated a transgenic mouse using an improved dCas9 system that enables simultaneous and precise in vivo transcriptional activation of multiple genes and long noncoding RNAs in the nervous system. As proof of concept, we were able to use targeted activation of endogenous neurogenic genes in these transgenic mice to directly and efficiently convert astrocytes into functional neurons in vivo. This system provides a flexible and rapid screening platform for studying complex gene networks and gain-of-function phenotypes in the mammalian brain.

  6. A group LASSO-based method for robustly inferring gene regulatory networks from multiple time-course datasets.

    PubMed

    Liu, Li-Zhi; Wu, Fang-Xiang; Zhang, Wen-Jun

    2014-01-01

    As an abstract mapping of gene regulation in the cell, a gene regulatory network is important to both biological research and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets while remaining robust to large errors and outliers. To solve the optimization problem involved in the proposed method, an efficient algorithm that combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to find a network topology consisting of edges with scores. The proposed method is applied to both simulated and real experimental datasets. The results show that Huber group LASSO outperforms the group LASSO in terms of both the area under the receiver operating characteristic curve and the area under the precision-recall curve. The convergence analysis shows theoretically that the sequence generated by the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving resistance to large errors or outliers.
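
    The two ingredients named in the method can be sketched separately: a Huber loss, which grows linearly rather than quadratically for large residuals, and a group penalty, which keeps or drops a candidate edge jointly across all datasets. The code below is a minimal illustration of those ingredients, not the authors' auxiliary-function and block-descent algorithm; the function names, the threshold delta, and the penalty weight lam are assumptions.

```python
import math
from typing import List, Sequence


def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Quadratic for small residuals, linear beyond delta (limits outlier influence)."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def group_soft_threshold(group: Sequence[float], lam: float) -> List[float]:
    """Proximal step of the group LASSO penalty lam * ||group||_2.

    Here a group is assumed to hold the weights of one candidate edge,
    one weight per dataset, so the edge is shrunk or removed in all
    datasets at once.
    """
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * w for w in group]


# A small residual is penalized quadratically, a large one only linearly.
print(huber_loss(0.5), huber_loss(10.0))

# A weak edge is removed from every dataset; a strong edge is only shrunk.
print(group_soft_threshold([0.3, 0.4], lam=1.0))
print(group_soft_threshold([3.0, 4.0], lam=1.0))
```

    Zeroing a whole group at once is what lets several datasets share a single network topology while each dataset keeps its own edge weights.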

  7. Genes with a spike expression are clustered in chromosome (sub)bands and spike (sub)bands have a powerful prognostic value in patients with multiple myeloma

    PubMed Central

    Kassambara, Alboukadel; Hose, Dirk; Moreaux, Jérôme; Walker, Brian A.; Protopopov, Alexei; Reme, Thierry; Pellestor, Franck; Pantesco, Véronique; Jauch, Anna; Morgan, Gareth; Goldschmidt, Hartmut; Klein, Bernard

    2012-01-01

    Background Genetic abnormalities are common in patients with multiple myeloma, and may deregulate gene products involved in tumor survival, proliferation, metabolism and drug resistance. In particular, translocations may result in a high expression of targeted genes (termed spike expression) in tumor cells. We identified spike genes in multiple myeloma cells of patients with newly-diagnosed myeloma and investigated their prognostic value. Design and Methods Genes with a spike expression in multiple myeloma cells were picked up using box plot probe set signal distribution and two selection filters. Results In a cohort of 206 newly diagnosed patients with multiple myeloma, 2587 genes/expressed sequence tags with a spike expression were identified. Some spike genes were associated with some transcription factors such as MAF or MMSET and with known recurrent translocations as expected. Spike genes were not associated with increased DNA copy number and for a majority of them, involved unknown mechanisms. Of spiked genes, 36.7% clustered significantly in 149 out of 862 documented chromosome (sub)bands, of which 53 had prognostic value (35 bad, 18 good). Their prognostic value was summarized with a spike band score that delineated 23.8% of patients with a poor median overall survival (27.4 months versus not reached, P<0.001) using the training cohort of 206 patients. The spike band score was independent of other gene expression profiling-based risk scores, t(4;14), or del17p in an independent validation cohort of 345 patients. Conclusions We present a new approach to identify spike genes and their relationship to patients’ survival. PMID:22102711
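
    The abstract says spike genes were selected from the box plot distribution of probe set signals plus two filters, without giving the details. The sketch below shows one plausible box-plot rule, flagging samples above the upper fence Q3 + k * IQR. The fence multiplier k, the function name, and the signal values are assumptions, and the study's two selection filters are not reproduced.

```python
import statistics
from typing import List, Sequence


def spike_samples(signal: Sequence[float], k: float = 3.0) -> List[int]:
    """Return indices of samples whose signal lies above Q3 + k * IQR."""
    # Quartiles of the signal distribution across patients.
    q1, _median, q3 = statistics.quantiles(signal, n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    return [i for i, value in enumerate(signal) if value > upper_fence]


# Hypothetical probe set signal across ten patients; one patient shows a spike.
signal = [10, 12, 11, 13, 12, 11, 10, 250, 12, 11]
print(spike_samples(signal))
```

    A fence based on quartiles is insensitive to the spike itself, so a single very high value does not inflate the threshold used to detect it.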

  8. Functional overexpression and characterization of lipogenesis-related genes in the oleaginous yeast Yarrowia lipolytica.

    PubMed

    Silverman, Andrew M; Qiao, Kangjian; Xu, Peng; Stephanopoulos, Gregory

    2016-04-01

    Single cell oil (SCO) is an attractive energy source due to scalability, utilization of low-cost renewable feedstocks, and type of product(s) made. Engineering strains capable of producing high lipid titers and yields is crucial to the economic viability of these processes. However, lipid synthesis in cells is a complex phenomenon subject to multiple layers of regulation, making gene target identification a challenging task. In this study, we aimed to identify genes in the oleaginous yeast Yarrowia lipolytica whose overexpression enhances lipid production by this organism. To this end, we examined the effect of the overexpression of a set of 44 native genes on lipid production in Y. lipolytica, including those involved in glycerolipid synthesis, fatty acid synthesis, central carbon metabolism, NADPH generation, regulation, and metabolite transport and characterized each resulting strain's ability to produce lipids growing on both glucose and acetate as a sole carbon source. Our results suggest that a diverse subset of genes was effective at individually influencing lipid production in Y. lipolytica, sometimes in a substrate-dependent manner. The most productive strain on glucose overexpressed the diacylglycerol acyltransferase DGA2 gene, increasing lipid titer, cellular content, and yield by 236, 165, and 246%, respectively, over our control strain. On acetate, our most productive strain overexpressed the acylglycerol-phosphate acyltransferase SLC1 gene, with a lipid titer, cellular content, and yield increase of 99, 91, and 151%, respectively, over the control strain. Aside from genes encoding enzymes that directly catalyze the reactions of lipid synthesis, other ways by which lipogenesis was increased in these cells include overexpressing the glycerol-3-phosphate dehydrogenase (GPD1) gene to increase production of glycerol head groups and overexpressing the 6-phosphogluconolactonase (SOL3) gene from the oxidative pentose phosphate pathway to increase NADPH availability for fatty acid synthesis. Taken together, our study demonstrates that the overall kinetics of microbial lipid synthesis is sensitive to a wide variety of factors. Fully optimizing a strain for single cell oil processes could involve manipulating and balancing many of these factors, and, due to mechanistic differences by which each gene product investigated here impacts lipid synthesis, there is a high likelihood that many of these genes will work synergistically to further increase lipid production when simultaneously overexpressed.

  9. Effect of multiple-source entry on price competition after patent expiration in the pharmaceutical industry.

    PubMed Central

    Suh, D C; Manning, W G; Schondelmeyer, S; Hadsall, R S

    2000-01-01

    OBJECTIVE: To analyze the effect of multiple-source drug entry on price competition after patent expiration in the pharmaceutical industry. DATA SOURCES: Originators and their multiple-source drugs selected from the 35 chemical entities whose patents expired from 1984 through 1987. Data were obtained from various primary and secondary sources for the patents' expiration dates, sales volume and units sold, and characteristics of drugs in the sample markets. STUDY DESIGN: The study was designed to determine significant factors using the study model developed under the assumption that the off-patented market is an imperfectly segmented market. PRINCIPAL FINDINGS: After patent expiration, the originators' prices continued to increase, while the price of multiple-source drugs decreased significantly over time. By the fourth year after patent expiration, originators' sales had decreased 12 percent in dollars and 30 percent in quantity. Multiple-source drugs increased their sales twofold in dollars and threefold in quantity, and possessed about one-fourth (in dollars) and half (in quantity) of the total market three years after entry. CONCLUSION: After patent expiration, multiple-source drugs compete largely with other multiple-source drugs in the price-sensitive sector, but indirectly with the originator in the price-insensitive sector. Originators have first-mover advantages, and therefore have a market that is less price sensitive after multiple-source drugs enter. On the other hand, multiple-source drugs target the price-sensitive sector, using their lower-priced drugs. This trend may indicate that the off-patented market is imperfectly segmented between the price-sensitive and insensitive sector. Consumers as a whole can gain from the entry of multiple-source drugs because the average price of the market continually declines after patent expiration. PMID:10857475

  10. A computational genomics pipeline for prokaryotic sequencing projects

    PubMed Central

    Kislyuk, Andrey O.; Katz, Lee S.; Agrawal, Sonia; Hagen, Matthew S.; Conley, Andrew B.; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C.; Sammons, Scott A.; Govil, Dhwani; Mair, Raydel D.; Tatti, Kathleen M.; Tondella, Maria L.; Harcourt, Brian H.; Mayer, Leonard W.; Jordan, I. King

    2010-01-01

    Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. Results: We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. Availability and implementation: The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems. Contact: king.jordan@biology.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20519285

  11. SigmaS controls multiple pathways associated with intracellular multiplication of Legionella pneumophila.

    PubMed

    Hovel-Miner, Galadriel; Pampou, Sergey; Faucher, Sebastien P; Clarke, Margaret; Morozova, Irina; Morozov, Pavel; Russo, James J; Shuman, Howard A; Kalachikov, Sergey

    2009-04-01

    Legionella pneumophila is the causative agent of the severe and potentially fatal pneumonia Legionnaires' disease. L. pneumophila is able to replicate within macrophages and protozoa by establishing a replicative compartment in a process that requires the Icm/Dot type IVB secretion system. The signals and regulatory pathways required for Legionella infection and intracellular replication are poorly understood. Mutation of the rpoS gene, which encodes sigma(S), does not affect growth in rich medium but severely decreases L. pneumophila intracellular multiplication within protozoan hosts. To gain insight into the intracellular multiplication defect of an rpoS mutant, we examined its pattern of gene expression during exponential and postexponential growth. We found that sigma(S) affects distinct groups of genes that contribute to Legionella intracellular multiplication. We demonstrate that rpoS mutants have a functional Icm/Dot system yet are defective for the expression of many genes encoding Icm/Dot-translocated substrates. We also show that sigma(S) affects the transcription of the cpxR and pmrA genes, which encode two-component response regulators that directly affect the transcription of Icm/Dot substrates. Our characterization of the L. pneumophila small RNA csrB homologs, rsmY and rsmZ, introduces a link between sigma(S) and the posttranscriptional regulator CsrA. We analyzed the network of sigma(S)-controlled genes by mutational analysis of transcriptional regulators affected by sigma(S). One of these, encoding the L. pneumophila arginine repressor homolog gene, argR, is required for maximal intracellular growth in amoebae. These data show that sigma(S) is a key regulator of multiple pathways required for L. pneumophila intracellular multiplication.

  12. Human adipocytes are highly sensitive to intermittent hypoxia induced NF-kappaB activity and subsequent inflammatory gene expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Taylor, Cormac T.; Kent, Brian D.; Crinion, Sophie J.

    Highlights: • Intermittent hypoxia (IH) leads to NF-κB activation in human primary adipocytes. • Adipocytes bear higher pro-inflammatory potential than other human primary cells. • IH leads to upregulation of multiple pro-inflammatory genes in human adipocytes. - Abstract: Introduction: Intermittent hypoxia (IH)-induced activation of pro-inflammatory pathways is a major contributing factor to the cardiovascular pathophysiology associated with obstructive sleep apnea (OSA). Obesity is commonly associated with OSA although it remains unknown whether adipose tissue is a major source of inflammatory mediators in response to IH. The aim of this study was to test the hypothesis that IH leads to augmented inflammatory responses in human adipocytes when compared to cells of non-adipocyte lineages. Methods and results: Human primary subcutaneous and visceral adipocytes, human primary microvascular pulmonary endothelial cells (HUMEC-L) and human primary small airway epithelial cells (SAEC) were exposed to 0, 6 or 12 cycles of IH or stimulated with tumor necrosis factor (TNF)-α. IH led to a robust increase in NF-κB DNA-binding activity in adipocytes compared with normoxic controls regardless of whether the source of adipocytes was visceral or subcutaneous. Notably, the NF-κB response of adipocytes to both IH and TNF-α was significantly greater than that in HUMEC-L and SAEC. Western blotting confirmed enhanced nuclear translocation of p65 in adipocytes in response to IH, accompanied by phosphorylation of I-κB. Parallel to p65 activation, we observed a significant increase in secretion of the adipokines interleukin (IL)-8, IL-6 and TNF-α with IH in adipocytes accompanied by significant upregulation of mRNA expression. PCR-array suggested profound influence of IH on pro-inflammatory gene expression in adipocytes. Conclusion: Human adipocytes demonstrate strong sensitivity to inflammatory gene expression in response to acute IH and hence, adipose tissue may be a key source of inflammatory mediators in OSA.

  13. Targeted and efficient transfer of multiple value-added genes into wheat varieties

    USDA-ARS's Scientific Manuscript database

    With an objective to optimize an approach to transfer multiple value added genes to a wheat variety while maintaining and improving agronomic performance, two alleles with mutations in the acetolactate synthase (ALS) gene located on wheat chromosomes 6B and 6D providing tolerance to imidazolinone (I...

  14. The BioGRID Interaction Database: 2011 update

    PubMed Central

    Stark, Chris; Breitkreutz, Bobby-Joe; Chatr-aryamontri, Andrew; Boucher, Lorrie; Oughtred, Rose; Livstone, Michael S.; Nixon, Julie; Van Auken, Kimberly; Wang, Xiaodong; Shi, Xiaoqi; Reguly, Teresa; Rust, Jennifer M.; Winter, Andrew; Dolinski, Kara; Tyers, Mike

    2011-01-01

    The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions. PMID:21071413

  15. A Novel Tightly Regulated Gene Expression System for the Human Intestinal Symbiont Bacteroides thetaiotaomicron.

    PubMed

    Horn, Nikki; Carvalho, Ana L; Overweg, Karin; Wegmann, Udo; Carding, Simon R; Stentz, Régis

    2016-01-01

    There is considerable interest in studying the function of Bacteroides species resident in the human gastrointestinal (GI)-tract and the contribution they make to host health. Reverse genetics and protein expression techniques, such as those developed for well-characterized Escherichia coli, cannot be applied to Bacteroides species as they and other members of the Bacteroidetes phylum have unique promoter structures. The availability of useful Bacteroides-specific genetic tools is therefore limited. Here we describe the development of an effective mannan-controlled gene expression system for Bacteroides thetaiotaomicron containing the mannan-inducible promoter-region of an α-1,2-mannosidase gene (BT_3784), a ribosomal binding site designed to modulate expression, a multiple cloning site to facilitate the cloning of genes of interest, and a transcriptional terminator. Using the Lactobacillus pepI as a reporter gene, mannan induction resulted in an increase of reporter activity in a time- and concentration-dependent manner with a wide range of activity. The endogenous BtcepA cephalosporinase gene was used to demonstrate the suitability of this novel expression system, enabling the isolation of a His-tagged version of BtCepA. We have also shown with experiments performed in mice that the system can be induced in vivo in the presence of an exogenous source of mannan. By enabling the controlled expression of endogenous and exogenous genes in B. thetaiotaomicron this novel inducer-dependent expression system will aid in defining the physiological role of individual genes and the functional analyses of their products.

  16. Cross-species multiple environmental stress responses: An integrated approach to identify candidate genes for multiple stress tolerance in sorghum (Sorghum bicolor (L.) Moench) and related model species

    PubMed Central

    Modise, David M.; Gemeildien, Junaid; Ndimba, Bongani K.; Christoffels, Alan

    2018-01-01

    Background Crop response to the changing climate and unpredictable effects of global warming with adverse conditions such as drought stress has brought concerns about food security to the fore; crop yield loss is a major cause of concern in this regard. Identification of genes with multiple responses across environmental stresses is the genetic foundation that leads to crop adaptation to environmental perturbations. Methods In this paper, we introduce an integrated approach to assess candidate genes for multiple stress responses across species. The approach combines ontology based semantic data integration with expression profiling, comparative genomics, phylogenomics, functional gene enrichment and gene enrichment network analysis to identify genes associated with plant stress phenotypes. Five different ontologies, viz., Gene Ontology (GO), Trait Ontology (TO), Plant Ontology (PO), Growth Ontology (GRO) and Environment Ontology (EO) were used to semantically integrate drought related information. Results Target genes linked to Quantitative Trait Loci (QTLs) controlling yield and stress tolerance in sorghum (Sorghum bicolor (L.) Moench) and closely related species were identified. Based on the enriched GO terms of the biological processes, 1116 sorghum genes with potential responses to 5 different stresses, such as drought (18%), salt (32%), cold (20%), heat (8%) and oxidative stress (25%) were identified to be over-expressed. Out of 169 sorghum drought responsive QTL-associated genes that were identified based on expression datasets, 56% were shown to have multiple stress responses. On the other hand, out of 168 additional genes that have been evaluated for orthologous pairs, 90% were conserved across species for drought tolerance. Over 50% of identified maize and rice genes were responsive to drought and salt stresses and were co-located within multifunctional QTLs. Among the total identified multi-stress responsive genes, 272 targets were shown to be co-localized within QTLs associated with different traits that are responsive to multiple stresses. Ontology mapping was used to validate the identified genes, while reconstruction of the phylogenetic tree was instrumental to infer the evolutionary relationship of the sorghum orthologs. The results also show specific genes responsible for various interrelated components of drought response mechanism such as drought tolerance, drought avoidance and drought escape. Conclusions We submit that this approach is novel and, to our knowledge, has not been used previously in any other research; it enables us to perform cross-species queries for genes that are likely to be associated with multiple stress tolerance, as a means to identify novel targets for engineering stress resistance in sorghum and possibly, in other crop species. PMID:29590108

  17. Long genes and genes with multiple splice variants are enriched in pathways linked to cancer and other multigenic diseases.

    PubMed

    Sahakyan, Aleksandr B; Balasubramanian, Shankar

    2016-03-12

    The role of random mutations and genetic errors in defining the etiology of cancer and other multigenic diseases has recently received much attention. With the view that complex genes should be particularly vulnerable to such events, here we explore the link between the simple properties of the human genes, such as transcript length, number of splice variants, exon/intron composition, and their involvement in the pathways linked to cancer and other multigenic diseases. We reveal a substantial enrichment of cancer pathways with long genes and genes that have multiple splice variants. Although the latter two factors are interdependent, we show that the overall gene length and splicing complexity increase in cancer pathways in a partially decoupled manner. Our systematic survey for the pathways enriched with the longest genes and with genes that have multiple splice variants reveals, along with cancer pathways, the pathways involved in various neuronal processes, cardiomyopathies and type II diabetes. We outline a correlation between the gene length and the number of somatic mutations. Our work is a step forward in the assessment of the role of simple gene characteristics in cancer and a wider range of multigenic diseases. We demonstrate a significant accumulation of long genes and genes with multiple splice variants in pathways of multigenic diseases that have already been associated with de novo mutations. Unlike the cancer pathways, we note that the pathways of neuronal processes, cardiomyopathies and type II diabetes contain genes long enough for topoisomerase-dependent gene expression to also be a potential contributing factor in the emergence of pathologies, should topoisomerases become impaired.

  18. Assessment of the Toxicity of CuO Nanoparticles by Using Saccharomyces cerevisiae Mutants with Multiple Genes Deleted

    PubMed Central

    Bao, Shaopan; Lu, Qicong; Dai, Heping; Zhang, Chao

    2015-01-01

    To develop applicable and susceptible models to evaluate the toxicity of nanoparticles, the antimicrobial effects of CuO nanoparticles (CuO-NPs) on various Saccharomyces cerevisiae (S. cerevisiae) strains (wild type, single-gene-deleted mutants, and multiple-gene-deleted mutants) were determined and compared. Further experiments were also conducted to analyze the mechanisms associated with toxicity using copper salt, bulk CuO (bCuO), carbon-shelled copper nanoparticles (C/Cu-NPs), and carbon nanoparticles (C-NPs) for comparisons. The results indicated that the growth inhibition rates of CuO-NPs for the wild-type and the single-gene-deleted strains were comparable, while for the multiple-gene deletion mutant, significantly higher toxicity was observed (P < 0.05). When the toxicity of the CuO-NPs to yeast cells was compared with the toxicities of copper salt and bCuO, we concluded that the toxicity of CuO-NPs should be attributed to soluble copper rather than to the nanoparticles. The striking difference in adverse effects of C-NPs and C/Cu-NPs with equivalent surface areas also proved this. A toxicity assay revealed that the multiple-gene-deleted mutant was significantly more sensitive to CuO-NPs than the wild type. Specifically, compared with the wild-type strain, copper was readily taken up by mutant strains when cell permeability genes were knocked out, and the mutants with deletions of genes regulated under oxidative stress (OS) were likely producing more reactive oxygen species (ROS). Hence, as mechanism-based gene inactivation could increase the susceptibility of yeast, the multiple-gene-deleted mutants should be improved model organisms to investigate the toxicity of nanoparticles. PMID:26386067

  19. Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM).

    PubMed

    Skinnider, Michael A; Dejong, Chris A; Rees, Philip N; Johnston, Chad W; Li, Haoxin; Webster, Andrew L H; Wyatt, Morgan A; Magarvey, Nathan A

    2015-11-16

    Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Multiple Household Water Sources and Their Use in Remote Communities With Evidence From Pacific Island Countries

    NASA Astrophysics Data System (ADS)

    Elliott, Mark; MacDonald, Morgan C.; Chan, Terence; Kearton, Annika; Shields, Katherine F.; Bartram, Jamie K.; Hadwen, Wade L.

    2017-11-01

    Global water research and monitoring typically focus on the household's "main source of drinking-water." Use of multiple water sources to meet daily household needs has been noted in many developing countries but rarely quantified or reported in detail. We gathered self-reported data using a cross-sectional survey of 405 households in eight communities of the Republic of the Marshall Islands (RMI) and five Solomon Islands (SI) communities. Over 90% of households used multiple sources, with differences in sources and uses between wet and dry seasons. Most RMI households had large rainwater tanks and rationed stored rainwater for drinking throughout the dry season, whereas most SI households collected rainwater in small pots, precluding storage across seasons. Use of a source for cooking was strongly positively correlated with use for drinking, whereas use for cooking was negatively correlated or uncorrelated with nonconsumptive uses (e.g., bathing). Dry season water uses implied greater risk of water-borne disease, with fewer (frequently zero) handwashing sources reported and more unimproved sources consumed. Use of multiple sources is fundamental to household water management and feasible to monitor using electronic survey tools. We contend that recognizing multiple water sources can greatly improve understanding of household-level and community-level climate change resilience, that use of multiple sources confounds health impact studies of water interventions, and that incorporating multiple sources into water supply interventions can yield heretofore-unrealized benefits. We propose that failure to consider multiple sources undermines the design and effectiveness of global water monitoring, data interpretation, implementation, policy, and research.

  1. Tensor decomposition-based and principal-component-analysis-based unsupervised feature extraction applied to the gene expression and methylation profiles in the brains of social insects with multiple castes.

    PubMed

    Taguchi, Y-H

    2018-05-08

    Even though coexistence of multiple phenotypes sharing the same genomic background is interesting, it remains incompletely understood. Epigenomic profiles may represent key factors, with unknown contributions to the development of multiple phenotypes, and social-insect castes are a good model for elucidation of the underlying mechanisms. Nonetheless, previous studies have failed to identify genes associated with aberrant gene expression and methylation profiles because of the lack of suitable methodology that can address this problem properly. A recently proposed principal component analysis (PCA)-based and tensor decomposition (TD)-based unsupervised feature extraction (FE) can solve this problem because these two approaches can deal with gene expression and methylation profiles even when a small number of samples is available. PCA-based and TD-based unsupervised FE methods were applied to the analysis of gene expression and methylation profiles in the brains of two social insects, Polistes canadensis and Dinoponera quadriceps. Genes associated with differential expression and methylation between castes were identified, and analysis of enrichment of Gene Ontology terms confirmed reliability of the obtained sets of genes from the biological standpoint. Biologically relevant genes, shown to be associated with significant differential gene expression and methylation between castes, were identified here for the first time. The identification of these genes may help understand the mechanisms underlying epigenetic control of development of multiple phenotypes under the same genomic conditions.

  2. Joint principal trend analysis for longitudinal high-dimensional data.

    PubMed

    Zhang, Yuping; Ouyang, Zhengqing

    2018-06-01

    We consider a research scenario motivated by integrating multiple sources of information for better knowledge discovery in diverse dynamic biological processes. Given two longitudinal high-dimensional datasets for a group of subjects, we want to extract shared latent trends and identify relevant features. To solve this problem, we present a new statistical method named joint principal trend analysis (JPTA). We demonstrate the utility of JPTA through simulations and applications to gene expression data of the mammalian cell cycle and longitudinal transcriptional profiling data in response to influenza viral infections. © 2017, The International Biometric Society.

  3. Free-living amoebae: Health concerns in the indoor environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tyndall, R.L.; Ironside, K.S.

    1990-01-01

    Free-living amoebae are the most likely protozoa implicated in health concerns of the indoor environment. These amoebae can be the source of allergic reactions, eye infections or, on rare occasions, encephalitis. While too large to be effectively aerosolized, free-living amoebae can support the multiplication of pathogens such as Legionella which are easily aerosolized and infectious via the pulmonary route. Traditional detection methods for free-living amoebae are laborious and time consuming. Newer techniques for rapidly detecting and quantitating free-living amoebae such as monoclonal antibodies, flow cytometry, gene probes, and laser optics have been or could be employed. 25 refs.

  4. Rcount: simple and flexible RNA-Seq read counting.

    PubMed

    Schmid, Marc W; Grossniklaus, Ueli

    2015-02-01

    Analysis of differential gene expression by RNA sequencing (RNA-Seq) is frequently done using feature counts, i.e. the number of reads mapping to a gene. However, commonly used count algorithms (e.g. HTSeq) do not address the problem of reads aligning with multiple locations in the genome (multireads) or reads aligning with positions where two or more genes overlap (ambiguous reads). Rcount specifically addresses these issues. Furthermore, Rcount allows the user to assign priorities to certain feature types (e.g. higher priority for protein-coding genes compared to rRNA-coding genes) or to add flanking regions. Rcount provides a fast and easy-to-use graphical user interface requiring no command line or programming skills. It is implemented in C++ using the SeqAn (www.seqan.de) and the Qt libraries (qt-project.org). Source code and 64 bit binaries for (Ubuntu) Linux, Windows (7) and MacOSX are released under the GPLv3 license and are freely available on github.com/MWSchmid/Rcount. Contact: marcschmid@gmx.ch. Test data, genome annotation files, useful Python and R scripts and a step-by-step user guide (including run-time and memory usage tests) are available on github.com/MWSchmid/Rcount. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
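
    The idea of resolving ambiguous reads by feature-type priority can be illustrated with a minimal Python sketch. This is not Rcount's implementation (which is written in C++ and also handles multireads); the priority table, the `assign_read` and `count_reads` helpers, and the toy read data are all hypothetical and only show the general principle that a read overlapping several features goes to the one with the highest priority.

```python
from collections import Counter

# Hypothetical priority table: a lower number means a higher priority.
PRIORITY = {"protein_coding": 0, "rRNA": 1}


def assign_read(overlaps, priority=PRIORITY):
    """Pick the gene for one read.

    overlaps: list of (gene_id, feature_type) pairs the read overlaps.
    Returns the gene_id whose feature type has the best priority, or None
    if the read overlaps nothing or remains ambiguous after prioritization.
    """
    if not overlaps:
        return None
    lowest = len(priority)  # unknown feature types rank last
    best = min(priority.get(ftype, lowest) for _, ftype in overlaps)
    winners = {gene for gene, ftype in overlaps
               if priority.get(ftype, lowest) == best}
    return winners.pop() if len(winners) == 1 else None


def count_reads(reads):
    """Count reads per gene, skipping reads that stay ambiguous."""
    counts = Counter()
    for overlaps in reads:
        gene = assign_read(overlaps)
        if gene is not None:
            counts[gene] += 1
    return counts


# Toy data (invented for illustration): three reads.
reads = [
    [("geneA", "protein_coding")],                               # unique
    [("geneA", "protein_coding"), ("rrn1", "rRNA")],             # resolved by priority
    [("geneA", "protein_coding"), ("geneB", "protein_coding")],  # still ambiguous
]
counts = count_reads(reads)
```

    In this sketch a read that is still ambiguous after prioritization is simply dropped; a real counter might instead split it between the candidate genes.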

  5. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    PubMed Central

    Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko

    2015-01-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. PMID:25948579

  6. A survey of crop-derived transgenes in activated and digester sludges in wastewater treatment plants in the United States.

    PubMed

    Gardner, Courtney M; Gwin, Carley A; Gunsch, Claudia K

    2018-04-01

    The use of transgenic crops has become increasingly common in the United States over the last several decades. Increasing evidence suggests that DNA may be protected from enzymatic digestion and acid hydrolysis in the digestive tract, suggesting that crop-derived transgenes may enter into wastewater treatment plants (WWTPs) intact. Given the historical use of antibiotic resistance genes as selection markers in transgenic crop development, it is important to consider the fate of these transgenes. Herein we detected and quantified crop-derived transgenes in WWTPs. All viable US WWTP samples were found to contain multiple gene targets (p35, nos, bla and nptII) at significantly higher levels than control samples. Control wastewater samples obtained from France, where transgenic crops are not cultivated, contained significantly fewer copies of the nptII gene than US activated and digester sludges. No significant differences were measured for the bla antibiotic resistance gene (ARG). In addition, a nested PCR (polymerase chain reaction) assay was developed that targeted the bla ARG located in regions flanked by the p35 promoter and nos terminator. Overall this work suggests that transgenic crops may have provided an environmental source of nptII; however, follow-up studies are needed to ascertain the viability of these genes as they exit WWTPs.

  7. Integrative Genomic Analyses Yields Cell Cycle Regulatory Programs with Prognostic Value

    PubMed Central

    Cheng, Chao; Lou, Shaoke; Andrews, Erik H.; Ung, Matthew H.; Varn, Frederick S.

    2016-01-01

    Liposarcoma is the second most common form of sarcoma and has been categorized into four molecular subtypes, which are associated with differential prognosis of patients. However, the transcriptional regulatory programs associated with distinct histological and molecular subtypes of liposarcoma have not been investigated. This study uses integrative analyses to systematically define the transcriptional regulatory programs associated with liposarcoma. Likewise, computational methods are used to identify regulatory programs associated with different liposarcoma subtypes as well as programs that are predictive of prognosis. Further analysis of curated gene sets was used to identify prognostic gene signatures. The integration of data from a variety of sources including gene expression profiles, transcription factor (TF) binding data from ChIP-seq experiments, curated gene sets, and clinical information of patients indicated discrete regulatory programs (e.g., controlled by E2F1 and E2F4) with significantly different regulatory activity in one or multiple subtypes of liposarcoma with respect to normal adipose tissue. These programs were also shown to be prognostic, wherein liposarcoma patients with higher E2F4 or E2F1 activity were associated with unfavorable prognosis. A total of 259 gene sets were significantly associated with patient survival in liposarcoma, among which >50% are involved in cell cycle and proliferation. PMID:26856934

  8. WormQTLHD—a web database for linking human disease to natural variation data in C. elegans

    PubMed Central

    van der Velde, K. Joeri; de Haan, Mark; Zych, Konrad; Arends, Danny; Snoek, L. Basten; Kammenga, Jan E.; Jansen, Ritsert C.; Swertz, Morris A.; Li, Yang

    2014-01-01

    Interactions between proteins are highly conserved across species. As a result, the molecular basis of multiple diseases affecting humans can be studied in model organisms that offer many alternative experimental opportunities. One such organism—Caenorhabditis elegans—has been used to produce much molecular quantitative genetics and systems biology data over the past decade. We present WormQTLHD (Human Disease), a database that quantitatively and systematically links expression Quantitative Trait Loci (eQTL) findings in C. elegans to gene–disease associations in man. WormQTLHD, available online at http://www.wormqtl-hd.org, is a user-friendly set of tools to reveal functionally coherent, evolutionarily conserved gene networks. These can be used to predict novel gene-to-gene associations and the functions of genes underlying the disease of interest. We created a new database that links C. elegans eQTL data sets to human diseases (34 337 gene–disease associations from OMIM, DGA, GWAS Central and NHGRI GWAS Catalogue) based on overlapping sets of orthologous genes associated with phenotypes in these two species. We utilized QTL results, high-throughput molecular phenotypes, classical phenotypes and genotype data covering different developmental stages and environments from the WormQTL database. All software is available as open source, built on MOLGENIS and xQTL workbench. PMID:24217915

  9. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    PubMed

    Sharma, Parichit; Mantri, Shrikant S

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. 
Here, we describe the WImpiBLAST web interface features and architecture, explain design decisions, describe workflows and provide a detailed analysis.

  10. WImpiBLAST: Web Interface for mpiBLAST to Help Biologists Perform Large-Scale Annotation Using High Performance Computing

    PubMed Central

    Sharma, Parichit; Mantri, Shrikant S.

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. 
Here, we describe the WImpiBLAST web interface features and architecture, explain design decisions, describe workflows and provide a detailed analysis. PMID:24979410

  11. Next-generation analysis of cataracts: determining knowledge driven gene-gene interactions using Biofilter, and gene-environment interactions using the PhenX Toolkit.

    PubMed

    Pendergrass, Sarah A; Verma, Shefali S; Holzinger, Emily R; Moore, Carrie B; Wallace, John; Dudek, Scott M; Huggins, Wayne; Kitchner, Terrie; Waudby, Carol; Berg, Richard; McCarty, Catherine A; Ritchie, Marylyn D

    2013-01-01

    Investigating the association between biobank derived genomic data and the information of linked electronic health records (EHRs) is an emerging area of research for dissecting the architecture of complex human traits, where cases and controls for study are defined through the use of electronic phenotyping algorithms deployed in large EHR systems. For our study, 2580 cataract cases and 1367 controls were identified within the Marshfield Personalized Medicine Research Project (PMRP) Biobank and linked EHR, which is a member of the NHGRI-funded electronic Medical Records and Genomics (eMERGE) Network. Our goal was to explore potential gene-gene and gene-environment interactions within these data for 529,431 single nucleotide polymorphisms (SNPs) with minor allele frequency > 1%, in order to explore higher level associations with cataract risk beyond investigations of single SNP-phenotype associations. To build our SNP-SNP interaction models we utilized a prior-knowledge driven filtering method called Biofilter to minimize the multiple testing burden of exploring the vast array of interaction models possible from our extensive number of SNPs. Using the Biofilter, we developed 57,376 prior-knowledge directed SNP-SNP models to test for association with cataract status. We selected models that required 6 sources of external domain knowledge. We identified 5 statistically significant models with an interaction term with p-value < 0.05, as well as an overall model with p-value < 0.05 associated with cataract status. We also conducted gene-environment interaction analyses for all GWAS SNPs and a set of environmental factors from the PhenX Toolkit: smoking, UV exposure, and alcohol use; these environmental factors have been previously associated with the formation of cataracts. We found a total of 288 models that exhibit an interaction term with a p-value ≤ 1×10⁻⁴ associated with cataract status. Our results show these approaches enable advanced searches for epistasis and gene-environment interactions beyond GWAS, and that the EHR based approach provides an additional source of data for seeking these advanced explanatory models of the etiology of complex disease/outcome such as cataracts.
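
    The multiple testing burden mentioned in this abstract can be made concrete with a short Python calculation. The two input figures (529,431 SNPs and 57,376 knowledge-directed models) come from the abstract; the Bonferroni-style thresholds are an assumption added only as a familiar yardstick, since the abstract does not say which correction, if any, was applied.

```python
from math import comb

n_snps = 529_431           # SNPs analysed (from the abstract)
biofilter_models = 57_376  # knowledge-directed SNP-SNP models (from the abstract)

# Every unordered SNP pair would be one candidate interaction model.
all_pairs = comb(n_snps, 2)

# How many times smaller the knowledge-filtered model space is.
reduction = all_pairs / biofilter_models

# Assumption: Bonferroni thresholds at alpha = 0.05, for illustration only.
alpha = 0.05
threshold_exhaustive = alpha / all_pairs
threshold_filtered = alpha / biofilter_models

print(f"exhaustive pairwise models: {all_pairs:.3e}")
print(f"reduction factor:           {reduction:.3e}")
print(f"Bonferroni (exhaustive):    {threshold_exhaustive:.3e}")
print(f"Bonferroni (filtered):      {threshold_filtered:.3e}")
```

    An exhaustive pairwise search would involve roughly 1.4×10¹¹ models, so restricting the search to 57,376 models shrinks the space by a factor of more than a million.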

  12. Anatomy of a nonhost disease resistance response of pea to Fusarium solani: PR gene elicitation via DNase, chitosan and chromatin alterations

    PubMed Central

    Hadwiger, Lee A.

    2015-01-01

    Of the multiplicity of plant pathogens in nature, only a few are virulent on a given plant species. Conversely, plants develop a rapid “nonhost” resistance response to the majority of the pathogens. The anatomy of the nonhost resistance of pea endocarp tissue against a pathogen of bean, Fusarium solani f. sp. phaseoli (Fsph) and the susceptibility of pea to F. solani f. sp. pisi (Fspi) has been described cytologically, biochemically and molecular-biologically. Cytological changes have been followed by electron microscope and stain differentiation under white and UV light. The induction of changes in transcription, protein synthesis, expression of pathogenesis-related (PR) genes, and increases in metabolic pathways culminating in low molecular weight, antifungal compounds are described biochemically. Molecular changes initiated by fungal signals to host organelles, primarily to chromatin within host nuclei, are identified according to source of the signal and the mechanisms utilized in activating defense genes. The functions of some PR genes are defined. A hypothesis based on these data is developed to explain both why fungal growth is suppressed in nonhost resistance and why growth can continue in a susceptible reaction. PMID:26124762

  13. The Phylogeny of Rickettsia Using Different Evolutionary Signatures: How Tree-Like is Bacterial Evolution?

    PubMed Central

    Murray, Gemma G. R.; Weinert, Lucy A.; Rhule, Emma L.; Welch, John J.

    2016-01-01

    Rickettsia is a genus of intracellular bacteria whose hosts and transmission strategies are both impressively diverse, and this is reflected in a highly dynamic genome. Some previous studies have described the evolutionary history of Rickettsia as non-tree-like, due to incongruity between phylogenetic reconstructions using different portions of the genome. Here, we reconstruct the Rickettsia phylogeny using whole-genome data, including two new genomes from previously unsampled host groups. We find that a single topology, which is supported by multiple sources of phylogenetic signal, well describes the evolutionary history of the core genome. We do observe extensive incongruence between individual gene trees, but analyses of simulations over a single topology and interspersed partitions of sites show that this is more plausibly attributed to systematic error than to horizontal gene transfer. Some conflicting placements also result from phylogenetic analyses of accessory genome content (i.e., gene presence/absence), but we argue that these are also due to systematic error, stemming from convergent genome reduction, which cannot be accommodated by existing phylogenetic methods. Our results show that, even within a single genus, tests for gene exchange based on phylogenetic incongruence may be susceptible to false positives. PMID:26559010

  14. Anatomy of a nonhost disease resistance response of pea to Fusarium solani: PR gene elicitation via DNase, chitosan and chromatin alterations.

    PubMed

    Hadwiger, Lee A

    2015-01-01

    Of the multiplicity of plant pathogens in nature, only a few are virulent on a given plant species. Conversely, plants develop a rapid "nonhost" resistance response to the majority of the pathogens. The anatomy of the nonhost resistance of pea endocarp tissue against a pathogen of bean, Fusarium solani f. sp. phaseoli (Fsph) and the susceptibility of pea to F. solani f. sp. pisi (Fspi) has been described cytologically, biochemically and molecular-biologically. Cytological changes have been followed by electron microscope and stain differentiation under white and UV light. The induction of changes in transcription, protein synthesis, expression of pathogenesis-related (PR) genes, and increases in metabolic pathways culminating in low molecular weight, antifungal compounds are described biochemically. Molecular changes initiated by fungal signals to host organelles, primarily to chromatin within host nuclei, are identified according to source of the signal and the mechanisms utilized in activating defense genes. The functions of some PR genes are defined. A hypothesis based on these data is developed to explain both why fungal growth is suppressed in nonhost resistance and why growth can continue in a susceptible reaction.

  15. MulRF: a software package for phylogenetic analysis using multi-copy gene trees.

    PubMed

    Chaudhary, Ruchi; Fernández-Baca, David; Burleigh, John Gordon

    2015-02-01

    MulRF is a platform-independent software package for phylogenetic analysis using multi-copy gene trees. It seeks the species tree that minimizes the Robinson-Foulds (RF) distance to the input trees using a generalization of the RF distance to multi-labeled trees. The underlying generic tree distance measure and fast running time make MulRF useful for inferring phylogenies from large collections of gene trees, in which multiple evolutionary processes as well as phylogenetic error may contribute to gene tree discord. MulRF implements several features for customizing the species tree search and assessing the results, and it provides a user-friendly graphical user interface (GUI) with tree visualization. The species tree search is implemented in C++ and the GUI in Java Swing. MulRF's executable as well as sample datasets and manual are available at http://genome.cs.iastate.edu/CBL/MulRF/, and the source code is available at https://github.com/ruchiherself/MulRFRepo. Contact: ruchic@ufl.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
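
    For readers unfamiliar with the Robinson-Foulds distance, the following Python sketch computes it for the ordinary case of two rooted, singly-labeled trees. It does not implement MulRF's generalization to multi-labeled trees or its species tree search; the nested-tuple tree encoding and the `clades` and `rf_distance` helpers are assumptions made for illustration.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree.

    tree: a leaf label (str) or a tuple of subtrees.
    Each clade is a frozenset of the leaf labels below an internal node;
    the clade containing every leaf (the root) is excluded as trivial.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)
    return found


def rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Toy trees on the same four taxa (invented for illustration).
species_tree = ((("A", "B"), "C"), "D")
gene_tree = ((("A", "C"), "B"), "D")

distance = rf_distance(species_tree, gene_tree)
```

    Here the two trees share the clade {A, B, C} but disagree on {A, B} versus {A, C}, so the distance is 2. A species tree search in this spirit would look for the tree minimizing the summed distance to all input gene trees.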

  16. Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages.

    PubMed

    Taminau, Jonatan; Meganck, Stijn; Lazar, Cosmin; Steenhoff, David; Coletta, Alain; Molter, Colin; Duque, Robin; de Schaetzen, Virginie; Weiss Solís, David Y; Bersini, Hugues; Nowé, Ann

    2012-12-24

    With an abundant amount of microarray gene expression data sets available through public repositories, new possibilities lie in combining multiple existing data sets. In this new context, analysis itself is no longer the problem, but retrieving and consistently integrating all this data before delivering it to the wide variety of existing analysis tools becomes the new bottleneck. We present the newly released inSilicoMerging R/Bioconductor package which, together with the earlier released inSilicoDb R/Bioconductor package, allows consistent retrieval, integration and analysis of publicly available microarray gene expression data sets. Inside the inSilicoMerging package a set of five visual and six quantitative validation measures are available as well. By providing (i) access to uniformly curated and preprocessed data, (ii) a collection of techniques to remove the batch effects between data sets from different sources, and (iii) several validation tools enabling the inspection of the integration process, these packages enable researchers to fully explore the potential of combining gene expression data for downstream analysis. The power of using both packages is demonstrated by programmatically retrieving and integrating gene expression studies from the InSilico DB repository [https://insilicodb.org/app/].
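
    Removing batch effects between data sets from different sources can be illustrated with a very simple, generic technique: subtracting each gene's mean within each study before merging. The Python sketch below is not the inSilicoMerging API (the packages are R/Bioconductor); the `center_batches` helper and the toy expression values are hypothetical.

```python
from statistics import mean


def center_batches(batches):
    """Subtract per-gene means within each batch.

    batches: dict mapping a study name to a list of samples, where each
    sample is a list of expression values in the same gene order.
    Returns a dict of the same shape with centered values.
    """
    centered = {}
    for name, samples in batches.items():
        # zip(*samples) iterates over genes (columns) across samples.
        gene_means = [mean(column) for column in zip(*samples)]
        centered[name] = [
            [value - m for value, m in zip(sample, gene_means)]
            for sample in samples
        ]
    return centered


# Toy data (invented): two studies, two samples each, two genes.
batches = {
    "study1": [[5.0, 2.0], [7.0, 4.0]],
    "study2": [[10.0, 1.0], [12.0, 3.0]],
}
merged = center_batches(batches)
```

    After centering, both toy studies have identical values even though their raw measurements were offset from each other, which is the kind of study-specific shift such methods aim to remove. Mean-centering is only one option, and it also removes genuine differences in average expression between studies.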

  17. ReNE: A Cytoscape Plugin for Regulatory Network Enhancement

    PubMed Central

    Politano, Gianfranco; Benso, Alfredo; Savino, Alessandro; Di Carlo, Stefano

    2014-01-01

    One of the biggest challenges in the study of biological regulatory mechanisms is the integration, modeling, and analysis of the complex interactions which take place in biological networks. Although post transcriptional regulatory elements (i.e., miRNAs) are widely investigated in current research, their usage and visualization in biological networks is very limited. Regulatory networks are commonly limited to gene entities. To integrate networks with post transcriptional regulatory data, researchers are therefore forced to manually resort to specific third party databases. In this context, we introduce ReNE, a Cytoscape 3.x plugin designed to automatically enrich a standard gene-based regulatory network with more detailed transcriptional, post transcriptional, and translational data, resulting in an enhanced network that more precisely models the actual biological regulatory mechanisms. ReNE can automatically import a network layout from the Reactome or KEGG repositories, or work with custom pathways described using a standard OWL/XML data format that the Cytoscape import procedure accepts. Moreover, ReNE allows researchers to merge multiple pathways coming from different sources. The merged network structure is normalized to guarantee a consistent and uniform description of the network nodes and edges and to enrich all integrated data with additional annotations retrieved from genome-wide databases like NCBI, thus producing a pathway fully manageable through the Cytoscape environment. The normalized network is then analyzed to include missing transcription factors, miRNAs, and proteins. The resulting enhanced network is still a fully functional Cytoscape network where each regulatory element (transcription factor, miRNA, gene, protein) and regulatory mechanism (up-regulation/down-regulation) is clearly visually identifiable, thus enabling a better visual understanding of its role and the effect in the network behavior. The enhanced network produced by ReNE is exportable in multiple formats for further analysis via third party applications. ReNE can be freely installed from the Cytoscape App Store (http://apps.cytoscape.org/apps/rene) and the full source code is freely available for download through a SVN repository accessible at http://www.sysbio.polito.it/tools_svn/BioInformatics/Rene/releases/. ReNE enhances a network by only integrating data from public repositories, without any inference or prediction. The reliability of the introduced interactions only depends on the reliability of the source data, which is outside the control of the ReNE developers. PMID:25541727

  18. Regression Models for the Analysis of Longitudinal Gaussian Data from Multiple Sources

    PubMed Central

    O’Brien, Liam M.; Fitzmaurice, Garrett M.

    2006-01-01

    We present a regression model for the joint analysis of longitudinal multiple source Gaussian data. Longitudinal multiple source data arise when repeated measurements are taken from two or more sources, and each source provides a measure of the same underlying variable and on the same scale. This type of data generally produces a relatively large number of observations per subject; thus estimation of an unstructured covariance matrix often may not be possible. We consider two methods by which parsimonious models for the covariance can be obtained for longitudinal multiple source data. The methods are illustrated with an example of multiple informant data arising from a longitudinal interventional trial in psychiatry. PMID:15726666

  19. Developing Pedagogical Tools to Improve Teaching Multiple Models of the Gene in High School

    ERIC Educational Resources Information Center

    Auckaraaree, Nantaya

    2013-01-01

    Multiple models of the gene are used to explore genetic phenomena in scientific practices and in the classroom. In genetics curricula, the classical and molecular models are presented in disconnected domains. Research demonstrates that, without explicit connections, students have difficulty developing an understanding of the gene that spans…

  20. Array data extractor (ADE): a LabVIEW program to extract and merge gene array data.

    PubMed

    Kurtenbach, Stefan; Kurtenbach, Sarah; Zoidl, Georg

    2013-12-01

    Large data sets from gene expression array studies are publicly available offering information highly valuable for research across many disciplines ranging from fundamental to clinical research. Highly advanced bioinformatics tools have been made available to researchers, but a demand for user-friendly software allowing researchers to quickly extract expression information for multiple genes from multiple studies persists. Here, we present a user-friendly LabVIEW program to automatically extract gene expression data for a list of genes from multiple normalized microarray datasets. Functionality was tested for 288 class A G protein-coupled receptors (GPCRs) and expression data from 12 studies comparing normal and diseased human hearts. Results confirmed known regulation of a beta 1 adrenergic receptor and further indicate novel research targets. Although existing software allows for complex data analyses, the LabVIEW based program presented here, "Array Data Extractor (ADE)", provides users with a tool to retrieve meaningful information from multiple normalized gene expression datasets in a fast and easy way. Further, the graphical programming language used in LabVIEW allows applying changes to the program without the need of advanced programming knowledge.

  1. Detection of Extended Spectrum Beta-Lactamases Resistance Genes among Bacteria Isolated from Selected Drinking Water Distribution Channels in Southwestern Nigeria.

    PubMed

    Adesoji, Ayodele T; Ogunjobi, Adeniyi A

    2016-01-01

    Extended Spectrum Beta-Lactamases (ESBL) provide high level resistance to beta-lactam antibiotics among bacteria. In this study, previously described multidrug resistant bacteria from raw, treated, and municipal taps of DWDS from selected dams in southwestern Nigeria were assessed for the presence of ESBL resistance genes which include bla TEM, bla SHV, and bla CTX by PCR amplification. A total of 164 bacteria spread across treated (33), raw (66), and municipal taps (68), belonging to α-Proteobacteria, β-Proteobacteria, γ-Proteobacteria, Flavobacteriia, Bacilli, and Actinobacteria group, were selected for this study. Among these bacteria, the most commonly observed resistance was for ampicillin and amoxicillin/clavulanic acid (61 isolates). Sixty-one isolates carried at least one of the targeted ESBL genes with bla TEM being the most abundant (50/61) and bla CTX being detected least (3/61). Klebsiella was the most frequently identified genus (18.03%) to harbour ESBL gene followed by Proteus (14.75%). Moreover, combinations of two ESBL genes, bla SHV + bla TEM or bla CTX + bla TEM, were observed in 11 and 1 isolate, respectively. In conclusion, classic bla TEM ESBL gene was present in multiple bacterial strains that were isolated from DWDS sources in Nigeria. These environments may serve as foci for the exchange of genetic traits in a diversity of Gram-negative bacteria.

  2. Interactions between Bt crops and aquatic ecosystems: A review.

    PubMed

    Venter, Hermoine J; Bøhn, Thomas

    2016-12-01

    The term Bt crops collectively refers to crops that have been genetically modified to include a gene (or genes) sourced from Bacillus thuringiensis (Bt) bacteria. These genes confer the ability to produce proteins toxic to certain insect pests. The interaction between Bt crops and adjacent aquatic ecosystems has received limited attention in research and risk assessment, despite the fact that some Bt crops have been in commercial use for 20 yr. Reports of effects on aquatic organisms such as Daphnia magna, Elliptio complanata, and Chironomus dilutus suggest that some aquatic species may be negatively affected, whereas other reports suggest that the decreased use of insecticides precipitated by Bt crops may benefit aquatic communities. The present study reviews the literature regarding entry routes and exposure pathways by which aquatic organisms may be exposed to Bt crop material, as well as feeding trials and field surveys that have investigated the effects of Bt-expressing plant material on such organisms. The present review also discusses how Bt crop development has moved past single-gene events, toward multigene stacked varieties that often contain herbicide resistance genes in addition to multiple Bt genes, and how their use (in conjunction with co-technology such as glyphosate/Roundup) may impact and interact with aquatic ecosystems. Lastly, suggestions for further research in this field are provided. Environ Toxicol Chem 2016;35:2891-2902. © 2016 SETAC.

  3. Freshwater Suspended Sediments and Sewage Are Reservoirs for Enterotoxin-Positive Clostridium perfringens

    PubMed Central

    Mueller-Spitz, Sabrina R.; Stewart, Lisa B.; Klump, J. Val; McLellan, Sandra L.

    2010-01-01

    The release of fecal pollution into surface waters may create environmental reservoirs of feces-derived microorganisms, including pathogens. Clostridium perfringens is a commonly used fecal indicator that represents a human pathogen. The pathogenicity of this bacterium is associated with its expression of multiple toxins; however, the prevalence of C. perfringens with various toxin genes in aquatic environments is not well characterized. In this study, C. perfringens spores were used to measure the distribution of fecal pollution associated with suspended sediments in the nearshore waters of Lake Michigan. Particle-associated C. perfringens levels were greatest adjacent to the Milwaukee harbor and diminished in the nearshore waters. Species-specific PCR and toxin gene profiles identified 174 isolates collected from the suspended sediments, surface water, and sewage influent as C. perfringens type A. Regardless of the isolation source, the beta2 and enterotoxin genes were common among isolates. The suspended sediments yielded the highest frequency of cpe-carrying C. perfringens (61%) compared to sewage (38%). Gene arrangement of enterotoxin was investigated using PCR to target known insertion sequences associated with this gene. Amplification products were detected in only 9 of 90 strains, which suggests there is greater variability in cpe gene arrangement than previously described. This work presents evidence that freshwater suspended sediments and sewage influent are reservoirs for potentially pathogenic cpe-carrying C. perfringens spores. PMID:20581181

  4. Multiple Testing in the Context of Gene Discovery in Sickle Cell Disease Using Genome-Wide Association Studies.

    PubMed

    Kuo, Kevin H M

    2017-01-01

    The issue of multiple testing, also termed multiplicity, is ubiquitous in studies where multiple hypotheses are tested simultaneously. Genome-wide association study (GWAS), a type of genetic association study that has gained popularity in the past decade, is most susceptible to the issue of multiple testing. Different methodologies have been employed to address the issue of multiple testing in GWAS. The purpose of the review is to examine the methodologies employed in dealing with multiple testing in the context of gene discovery using GWAS in sickle cell disease complications.
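    As a minimal illustration of the multiplicity problem described above, the sketch below contrasts two widely used corrections: Bonferroni, which controls the family-wise error rate, and Benjamini-Hochberg, which controls the false discovery rate. The abstract does not state which methods the review covers, so this is only an assumed example; the function names and the p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Find the largest rank k such that p_(k) <= (k / m) * alpha and reject
    the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from ten association tests (illustration only).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejections:", sum(bonferroni(p)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(p)))
```

    With these toy values Bonferroni keeps one test and Benjamini-Hochberg keeps two, which shows why the choice of correction matters when very many markers are tested at once.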

  5. CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets.

    PubMed

    Schofield, E C; Carver, T; Achuthan, P; Freire-Pritchett, P; Spivakov, M; Todd, J A; Burren, O S

    2016-08-15

    Promoter capture Hi-C (PCHi-C) allows the genome-wide interrogation of physical interactions between distal DNA regulatory elements and gene promoters in multiple tissue contexts. Visual integration of the resultant chromosome interaction maps with other sources of genomic annotations can provide insight into underlying regulatory mechanisms. We have developed Capture HiC Plotter (CHiCP), a web-based tool that allows interactive exploration of PCHi-C interaction maps and integration with both public and user-defined genomic datasets. CHiCP is freely accessible from www.chicp.org and supports most major HTML5 compliant web browsers. Full source code and installation instructions are available from http://github.com/D-I-L/django-chicp. Contact: ob219@cam.ac.uk. © The Author 2016. Published by Oxford University Press. All rights reserved.

  6. Patterns of genomic variation in Coho salmon following reintroduction to the interior Columbia River.

    PubMed

    Campbell, Nathan R; Kamphaus, Cory; Murdoch, Keely; Narum, Shawn R

    2017-12-01

    Coho salmon were extirpated in the mid-20th century from the interior reaches of the Columbia River but were reintroduced with relatively abundant source stocks from the lower Columbia River near the Pacific coast. Reintroduction of Coho salmon to the interior Columbia River (Wenatchee River) using lower river stocks placed selective pressures on the new colonizers due to substantial differences with their original habitat such as migration distance and navigation of six additional hydropower dams. We used restriction site-associated DNA sequencing (RAD-seq) to genotype 5,392 SNPs in reintroduced Coho salmon in the Wenatchee River over four generations to test for signals of temporal structure and adaptive variation. Temporal genetic structure among the three broodlines of reintroduced fish was evident among the initial return years (2000, 2001, and 2002) and their descendants, which indicated levels of reproductive isolation among broodlines. Signals of adaptive variation were detected from multiple outlier tests and identified candidate genes for further study. This study illustrated that genetic variation and structure of reintroduced populations are likely to reflect source stocks for multiple generations but may shift over time once established in nature.

  7. Global Control of GacA in Secondary Metabolism, Primary Metabolism, Secretion Systems, and Motility in the Rhizobacterium Pseudomonas aeruginosa M18

    PubMed Central

    Wei, Xue; Tang, Lulu; Wu, Daqiang

    2013-01-01

    The rhizobacterium Pseudomonas aeruginosa M18 can produce a broad spectrum of secondary metabolites, including the antibiotics pyoluteorin (Plt) and phenazine-1-carboxylic acid (PCA), hydrogen cyanide, and the siderophores pyoverdine and pyochelin. The antibiotic biosynthesis of M18 is coordinately controlled by multiple distinct regulatory pathways, of which the GacS/GacA system activates Plt biosynthesis but strongly downregulates PCA biosynthesis. Here, we investigated the global influence of a gacA mutation on the M18 transcriptome and related metabolic and physiological processes. Transcriptome profiling revealed that the transcript levels of 839 genes, which account for approximately 15% of the annotated genes in the M18 genome, were significantly influenced by the gacA mutation during the early stationary growth phase of M18. Most secondary metabolic gene clusters, such as pvd, pch, plt, amb, and hcn, were activated by GacA. The GacA regulon also included genes encoding extracellular enzymes and cytochrome oxidases. Interestingly, the primary metabolism involved in the assimilation and metabolism of phosphorus, sulfur, and nitrogen sources was also notably regulated by GacA. Another important category of the GacA regulon was secretion systems, including H1, H2, and H3 (type VI secretion systems [T6SSs]), Hxc (T2SS), and Has and Apr (T1SSs), and CupE and Tad pili. More remarkably, GacA inhibited swimming, swarming, and twitching motilities. Taken together, the Gac-initiated global regulation, which was mostly mediated through multiple regulatory systems or factors, was mainly involved in secondary and primary metabolism, secretion systems, motility, etc., contributing to ecological or nutritional competence, ion homeostasis, and biocontrol in M18. PMID:23708134

  8. Spontaneous mutations in CYC8 and MIG1 suppress the short chronological lifespan of budding yeast lacking SNF1/AMPK

    PubMed Central

    Maqani, Nazif; Fine, Ryan D.; Shahid, Mehreen; Li, Mingguang; Enriquez-Hesles, Elisa; Smith, Jeffrey S.

    2018-01-01

    Chronologically aging yeast cells are prone to adaptive regrowth, whereby mutants with a survival advantage spontaneously appear and re-enter the cell cycle in stationary phase cultures. Adaptive regrowth is especially noticeable with short-lived strains, including those defective for SNF1, the homolog of mammalian AMP-activated protein kinase (AMPK). SNF1 becomes active in response to multiple environmental stresses that occur in chronologically aging cells, including glucose depletion and oxidative stress. SNF1 is also required for the extension of chronological lifespan (CLS) by caloric restriction (CR), defined as limiting glucose at the time of culture inoculation. To identify specific downstream SNF1 targets responsible for CLS extension during CR, we screened for adaptive regrowth mutants that restore chronological longevity to a short-lived snf1∆ parental strain. Whole genome sequencing of the adapted mutants revealed missense mutations in TPR motifs 9 and 10 of the transcriptional co-repressor Cyc8 that specifically mediate repression through the transcriptional repressor Mig1. Another mutation occurred in MIG1 itself, thus implicating the activation of Mig1-repressed genes as a key function of SNF1 in maintaining CLS. Consistent with this conclusion, the cyc8 TPR mutations partially restored growth on alternative carbon sources and significantly extended CLS compared to the snf1∆ parent. Furthermore, cyc8 TPR mutations reactivated multiple Mig1-repressed genes, including the transcription factor gene CAT8, which is responsible for activating genes of the glyoxylate and gluconeogenesis pathways. Deleting CAT8 completely blocked CLS extension by the cyc8 TPR mutations, identifying these pathways as key Snf1-regulated CLS determinants.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Peek, Gregory W.; Tollefsbol, Trygve O., E-mail: trygve@uab.edu; Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL

    Human telomerase reverse transcriptase (hTERT) is the catalytic and limiting component of telomerase and also a transcription factor. It is critical to the integrity of the ends of linear chromosomes and to the regulation, extent and rate of cell cycle progression in multicellular eukaryotes. The level of hTERT expression is essential to a wide range of bodily functions and to avoidance of disease conditions, such as cancer, that are mediated in part by aberrant level and regulation of cell cycle proliferation. Value of a gene in regulation depends on its ability to both receive input from multiple sources and transmit signals to multiple effectors. The expression of hTERT and the progression of the cell cycle have been shown to be regulated by an extensive network of gene products and signaling pathways, including the PI3K/Akt and TGF-β pathways. The PI3K inhibitor PX-866 and the competitive estrogen receptor ligand raloxifene have been shown to modify progression of those pathways and, in combination, to decrease proliferation of estrogen receptor positive (ER+) MCF-7 breast cancer cells. We found that combinations of modulators of those pathways decreased not only hTERT transcription but also transcription of additional essential cell cycle regulators such as Cyclin D1. By evaluating known expression profile signatures for TGF-β pathway diversions, we confirmed additional genes such as heparin-binding epidermal growth factor-like growth factor (HB EGF) by which those pathways and their perturbations may also modify cell cycle progression. Highlights: • PX-866 and raloxifene affect the PI3K/Akt and TGF-β pathways. • PX-866 and raloxifene down-regulate genes up-regulated in cancer. • PX-866 and raloxifene decrease transcription of hTERT and Cyclin D1. • Pathological transcription signatures can identify new defense mechanisms.

  10. Protecting DNA from errors and damage: an overview of DNA repair mechanisms in plants compared to mammals.

    PubMed

    Spampinato, Claudia P

    2017-05-01

    The genome integrity of all organisms is constantly threatened by replication errors and DNA damage arising from endogenous and exogenous sources. Such base pair anomalies must be accurately repaired to prevent mutagenesis and/or lethality. Thus, it is not surprising that cells have evolved multiple and partially overlapping DNA repair pathways to correct specific types of DNA errors and lesions. Great progress in unraveling these repair mechanisms at the molecular level has been made by several talented researchers, among them Tomas Lindahl, Aziz Sancar, and Paul Modrich, all three Nobel laureates in Chemistry for 2015. Much of this knowledge comes from studies performed in bacteria, yeast, and mammals and has impacted research in plant systems. Two plant features should be mentioned. Plants differ from higher eukaryotes in that they lack a reserve germline and cannot avoid environmental stresses. Therefore, plants have evolved different strategies to sustain genome fidelity through generations and continuous exposure to genotoxic stresses. These strategies include the presence of unique or multiple paralogous genes with partially overlapping DNA repair activities. Yet, in spite (or because) of these differences, plants, especially Arabidopsis thaliana, can be used as a model organism for functional studies. Some advantages of this model system are worth mentioning: short life cycle, availability of both homozygous and heterozygous lines for many genes, plant transformation techniques, tissue culture methods and reporter systems for gene expression and function studies. Here, I provide a current understanding of DNA repair genes in plants, with a special focus on A. thaliana. It is expected that this review will be a valuable resource for future functional studies in the DNA repair field, both in plants and animals.

  11. Boosting probabilistic graphical model inference by incorporating prior knowledge from multiple sources.

    PubMed

    Praveen, Paurush; Fröhlich, Holger

    2013-01-01

    Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available.
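    The Noisy-OR idea mentioned above can be sketched with the textbook noisy-OR combination rule, in which an interaction is considered supported unless every source independently fails to support it. This is an assumed, simplified form and not necessarily the exact parameterization used by the authors; the function name and the confidence values are hypothetical.

```python
def noisy_or(supports):
    """Combine per-source support values in [0, 1] for one candidate edge.

    Textbook noisy-OR: P(edge) = 1 - prod_i (1 - s_i).
    The result is never smaller than the strongest single source.
    """
    prob_no_support = 1.0
    for s in supports:
        if not 0.0 <= s <= 1.0:
            raise ValueError("support values must lie in [0, 1]")
        prob_no_support *= 1.0 - s
    return 1.0 - prob_no_support


# Hypothetical confidences for one gene-gene interaction from three sources,
# e.g. a pathway database, GO term similarity, and protein domain data.
edge_supports = [0.5, 0.5, 0.0]
print(noisy_or(edge_supports))  # 0.75
```

    A prior built this way is dominated by the most confident source, which matches the description of picking up the strongest support among information sources.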

  12. Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules.

    PubMed

    Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P

    2013-03-21

    Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. 
Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, but in the case of the Ribi group we also provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso. Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.

  13. Selection and evaluation of reference genes for RT-qPCR expression studies on Burkholderia tropica strain Ppe8, a sugarcane-associated diazotrophic bacterium grown with different carbon sources or sugarcane juice.

    PubMed

    da Silva, Paula Renata Alves; Vidal, Marcia Soares; de Paula Soares, Cleiton; Polese, Valéria; Simões-Araújo, Jean Luís; Baldani, José Ivo

    2016-11-01

    Among the members of the genus Burkholderia, Burkholderia tropica has the ability to fix nitrogen and promote sugarcane plant growth as well as act as a biological control agent. There is little information about how this bacterium metabolizes carbohydrates as well as those carbon sources found in the sugarcane juice that accumulates in stems during plant growth. Reverse transcription quantitative PCR (RT-qPCR) can be used to evaluate changes in gene expression during bacterial growth on different carbon sources. Here we tested the expression of six reference genes, lpxC, gyrB, recA, rpoA, rpoB, and rpoD, when cells were grown with glucose, fructose, sucrose, mannitol, aconitic acid, and sugarcane juice as carbon sources. The lpxC, gyrB, and recA were selected as the most stable reference genes based on geNorm and NormFinder software analyses. Validation of these three reference genes during strain Ppe8 growth on the same carbon sources showed that genes involved in glycogen biosynthesis (glgA, glgB, glgC) and trehalose biosynthesis (treY and treZ) were highly expressed when Ppe8 was grown in aconitic acid relative to other carbon sources, while otsA expression (trehalose biosynthesis) was reduced with all carbon sources. In addition, the expression level of the ORF_6066 (gluconolactonase) gene was reduced on sugarcane juice. The results confirmed the stability of the three selected reference genes (lpxC, gyrB, and recA) during the RT-qPCR and also their robustness by evaluating the relative expression of genes involved in glycogen and trehalose biosynthesis when strain Ppe8 was grown on different carbon sources and sugarcane juice.
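    To make the role of multiple reference genes concrete, the sketch below computes a relative expression value with the common 2^(-ΔΔCt) approach, normalizing the target gene against the mean Ct of several reference genes. It assumes 100% amplification efficiency, in which case averaging Ct values corresponds to taking the geometric mean of the reference quantities. The abstract does not give the authors' exact calculation, and all Ct values and names here are hypothetical.

```python
def relative_expression(target_ct_test, ref_cts_test,
                        target_ct_control, ref_cts_control):
    """Fold change of a target gene (test vs. control) by 2^(-ddCt).

    Each condition is normalized to the mean Ct of its reference genes
    (assumes 100% PCR efficiency for every primer pair).
    """
    ref_test = sum(ref_cts_test) / len(ref_cts_test)
    ref_control = sum(ref_cts_control) / len(ref_cts_control)
    delta_test = target_ct_test - ref_test
    delta_control = target_ct_control - ref_control
    return 2 ** -(delta_test - delta_control)


# Hypothetical Ct values: three reference genes (e.g. lpxC, gyrB, recA)
# and one target gene measured on a test and a control carbon source.
fold = relative_expression(
    target_ct_test=22.0, ref_cts_test=[18.0, 19.0, 20.0],
    target_ct_control=25.0, ref_cts_control=[18.0, 19.0, 20.0],
)
print(fold)  # 8.0, i.e. eight-fold higher expression in the test condition
```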

  14. The effects of shared information on semantic calculations in the gene ontology.

    PubMed

    Bible, Paul W; Sun, Hong-Wei; Morasso, Maria I; Loganantharaj, Rasiah; Wei, Lai

    2017-01-01

    The structured vocabulary that describes gene function, the gene ontology (GO), serves as a powerful tool in biological research. One application of GO in computational biology calculates semantic similarity between two concepts to make inferences about the functional similarity of genes. A class of term similarity algorithms explicitly calculates the shared information (SI) between concepts then substitutes this calculation into traditional term similarity measures such as Resnik, Lin, and Jiang-Conrath. Alternative SI approaches, when combined with ontology choice and term similarity type, lead to many gene-to-gene similarity measures. No thorough investigation has been made into the behavior, complexity, and performance of semantic methods derived from distinct SI approaches. We apply bootstrapping to compare the generalized performance of 57 gene-to-gene semantic measures across six benchmarks. Considering the number of measures, we additionally evaluate whether these methods can be leveraged through ensemble machine learning to improve prediction performance. Results showed that the choice of ontology type most strongly influenced performance across all evaluations. Combining measures into an ensemble classifier reduces cross-validation error beyond any individual measure for protein interaction prediction. This improvement resulted from information gained through the combination of ontology types as ensemble methods within each GO type offered no improvement. These results demonstrate that multiple SI measures can be leveraged for machine learning tasks such as automated gene function prediction by incorporating methods from across the ontologies. To facilitate future research in this area, we developed the GO Graph Tool Kit (GGTK), an open source C++ library with Python interface (github.com/paulbible/ggtk).
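    The way a shared information (SI) value feeds into the Resnik, Lin, and Jiang-Conrath measures can be sketched as follows. This sketch uses the information content of the most informative common ancestor as SI, which is only one of the SI approaches the abstract alludes to, and it does not use the GGTK API. The toy ontology, the term probabilities, and the function names are hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to its ancestors (itself included).
ANCESTORS = {
    "root": {"root"},
    "A": {"A", "root"},
    "B": {"B", "A", "root"},
    "C": {"C", "A", "root"},
}
# Hypothetical annotation probabilities p(term).
PROB = {"root": 1.0, "A": 0.5, "B": 0.125, "C": 0.25}


def ic(term):
    """Information content: IC(t) = -log2 p(t)."""
    return -math.log2(PROB[term])


def shared_information(t1, t2):
    """SI as the IC of the most informative common ancestor."""
    return max(ic(t) for t in ANCESTORS[t1] & ANCESTORS[t2])


def resnik(t1, t2):
    return shared_information(t1, t2)


def lin(t1, t2):
    denom = ic(t1) + ic(t2)
    # Undefined when both terms carry no information; 0.0 is a convention
    # chosen for this sketch only.
    return 2 * shared_information(t1, t2) / denom if denom > 0 else 0.0


def jiang_conrath_distance(t1, t2):
    return ic(t1) + ic(t2) - 2 * shared_information(t1, t2)


print(resnik("B", "C"), lin("B", "C"), jiang_conrath_distance("B", "C"))
```

    Swapping in a different `shared_information` function while keeping the three measures unchanged is what produces the family of gene-to-gene measures compared in the study.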

  15. Horizontal Dissemination of Antimicrobial Resistance Determinants in Multiple Salmonella Serotypes following Isolation from the Commercial Swine Operation Environment after Manure Application.

    PubMed

    Pornsukarom, Suchawan; Thakur, Siddhartha

    2017-10-15

    The aim of this study was to characterize the plasmids carrying antimicrobial resistance (AMR) determinants in multiple Salmonella serotypes recovered from the commercial swine farm environment after manure application on land. Manure and soil samples were collected on day 0 before and after manure application on six farms in North Carolina, and sequential soil samples were recollected on days 7, 14, and 21 from the same plots. All environmental samples were processed for Salmonella, and their plasmid contents were further characterized. A total of 14 isolates including Salmonella enterica serotypes Johannesburg (n = 2), Ohio (n = 2), Rissen (n = 1), Typhimurium var5- (n = 5), Worthington (n = 3), and 4,12:i:- (n = 1), representing different farms, were selected for plasmid analysis. Antimicrobial susceptibility testing was done by broth microdilution against a panel of 14 antimicrobials on the 14 confirmed transconjugants after conjugation assays. The plasmids were isolated by modified alkaline lysis, and PCRs were performed on purified plasmid DNA to identify the AMR determinants and the plasmid replicon types. The plasmids were sequenced for further analysis and to compare profiles and create phylogenetic trees. A class 1 integron with an ANT(2″)-Ia-aadA2 cassette was detected in the 50-kb IncN plasmids identified in S. Worthington isolates. We identified 100-kb and 90-kb IncI1 plasmids in S. Johannesburg and S. Rissen isolates carrying the blaCMY-2 and tet(A) genes, respectively. An identical 95-kb IncF plasmid was widely disseminated among the different serotypes and across different farms. Our study provides evidence on the importance of horizontal dissemination of resistance determinants through plasmids of multiple Salmonella serotypes distributed across commercial swine farms after manure application. 
IMPORTANCE The horizontal gene transfer of antimicrobial resistance (AMR) determinants located on plasmids is considered to be the main reason for the rapid proliferation and spread of drug resistance. The deposition of manure generated in swine production systems into the environment is identified as a potential source of AMR dissemination. In this study, AMR gene-carrying plasmids were detected in multiple Salmonella serotypes across different commercial swine farms in North Carolina. The plasmid profiles were characterized based on Salmonella serotype donors and incompatibility (Inc) groups. We found that different Inc plasmids showed evidence of AMR gene transfer in multiple Salmonella serotypes. We detected an identical 95-kb plasmid that was widely distributed across swine farms in North Carolina. These conjugable resistance plasmids were able to persist on land after swine manure application. Our study provides strong evidence of AMR determinant dissemination present in plasmids of multiple Salmonella serotypes in the environment after manure application. Copyright © 2017 American Society for Microbiology.

  16. Stable carbon isotope fractionation of chlorinated ethenes by a microbial consortium containing multiple dechlorinating genes.

    PubMed

    Liu, Na; Ding, Longzhen; Li, Haijun; Zhang, Pengpeng; Zheng, Jixing; Weng, Chih-Huang

    2018-08-01

    The study aimed to determine the possible contribution of specific growth conditions and community structures to variable carbon enrichment factor (Ɛ-carbon) values for the degradation of chlorinated ethenes (CEs) by a bacterial consortium with multiple dechlorinating genes. Ɛ-carbon values for trichloroethylene, cis-1,2-dichloroethylene, and vinyl chloride were -7.24% ± 0.59%, -14.6% ± 1.71%, and -21.1% ± 1.14%, respectively, during their degradation by a microbial consortium containing multiple dechlorinating genes including tceA and vcrA. The Ɛ-carbon values of all CEs were not greatly affected by changes in growth conditions and community structures, which directly or indirectly affected reductive dechlorination of CEs by this consortium. Stability analysis provided evidence that the presence of multiple dechlorinating genes within a microbial consortium had little effect on carbon isotope fractionation, as long as the genes have definite, non-overlapping functions. Copyright © 2018 Elsevier Ltd. All rights reserved.

  17. Aberrant gene promoter methylation associated with sporadic multiple colorectal cancer.

    PubMed

    Gonzalo, Victoria; Lozano, Juan José; Muñoz, Jenifer; Balaguer, Francesc; Pellisé, Maria; Rodríguez de Miguel, Cristina; Andreu, Montserrat; Jover, Rodrigo; Llor, Xavier; Giráldez, M Dolores; Ocaña, Teresa; Serradesanferm, Anna; Alonso-Espinaco, Virginia; Jimeno, Mireya; Cuatrecasas, Miriam; Sendino, Oriol; Castellví-Bel, Sergi; Castells, Antoni

    2010-01-19

    Colorectal cancer (CRC) multiplicity has been mainly related to polyposis and non-polyposis hereditary syndromes. In sporadic CRC, aberrant gene promoter methylation has been shown to play a key role in carcinogenesis, although little is known about its involvement in multiplicity. To assess the effect of methylation in tumor multiplicity in sporadic CRC, hypermethylation of key tumor suppressor genes was evaluated in patients with both multiple and solitary tumors, as a proof-of-concept of an underlying epigenetic defect. We examined a total of 47 synchronous/metachronous primary CRC from 41 patients, and 41 gender, age (5-year intervals) and tumor location-paired patients with solitary tumors. Exclusion criteria were polyposis syndromes, Lynch syndrome and inflammatory bowel disease. DNA methylation at the promoter region of the MGMT, CDKN2A, SFRP1, TMEFF2, HS3ST2 (3OST2), RASSF1A and GATA4 genes was evaluated by quantitative methylation specific PCR in both tumor and corresponding normal appearing colorectal mucosa samples. Overall, patients with multiple lesions exhibited a higher degree of methylation in tumor samples than those with solitary tumors regarding all evaluated genes. After adjusting for age and gender, binomial logistic regression analysis identified methylation of MGMT2 (OR, 1.48; 95% CI, 1.10 to 1.97; p = 0.008) and RASSF1A (OR, 2.04; 95% CI, 1.01 to 4.13; p = 0.047) as variables independently associated with tumor multiplicity, being the risk related to methylation of any of these two genes 4.57 (95% CI, 1.53 to 13.61; p = 0.006). Moreover, in six patients in whom both tumors were available, we found a correlation in the methylation levels of MGMT2 (r = 0.64, p = 0.17), SFRP1 (r = 0.83, p = 0.06), HPP1 (r = 0.64, p = 0.17), 3OST2 (r = 0.83, p = 0.06) and GATA4 (r = 0.6, p = 0.24). Methylation in normal appearing colorectal mucosa from patients with multiple and solitary CRC showed no relevant difference in any evaluated gene.
These results provide a proof-of-concept that gene promoter methylation is associated with tumor multiplicity. This underlying epigenetic defect may have noteworthy implications in the prevention of patients with sporadic CRC.

  18. i-ADHoRe 2.0: an improved tool to detect degenerated genomic homology using genomic profiles.

    PubMed

    Simillion, Cedric; Janssens, Koen; Sterck, Lieven; Van de Peer, Yves

    2008-01-01

    i-ADHoRe is a software tool that combines gene content and gene order information of homologous genomic segments into profiles to detect highly degenerated homology relations within and between genomes. The new version offers, besides a significant increase in performance, several optimizations to the algorithm, most importantly to the profile alignment routine. As a result, the annotations of multiple genomes, or parts thereof, can be fed simultaneously into the program, after which it will report all regions of homology, both within and between genomes. The i-ADHoRe 2.0 package contains the C++ source code for the main program as well as various Perl scripts and a fully documented Perl API to facilitate post-processing. The software runs on any Linux- or UNIX-based platform. The package is freely available for academic users and can be downloaded from http://bioinformatics.psb.ugent.be/

  19. Origins and Evolution of Antibiotic Resistance

    PubMed Central

    Davies, Julian; Davies, Dorothy

    2010-01-01

    Summary: Antibiotics have always been considered one of the wonder discoveries of the 20th century. This is true, but the real wonder is the rise of antibiotic resistance in hospitals, communities, and the environment concomitant with their use. The extraordinary genetic capacities of microbes have benefitted from man's overuse of antibiotics to exploit every source of resistance genes and every means of horizontal gene transmission to develop multiple mechanisms of resistance for each and every antibiotic introduced into practice clinically, agriculturally, or otherwise. This review presents the salient aspects of antibiotic resistance development over the past half-century, with the oft-restated conclusion that it is time to act. To achieve complete restitution of therapeutic applications of antibiotics, there is a need for more information on the role of environmental microbiomes in the rise of antibiotic resistance. In particular, creative approaches to the discovery of novel antibiotics and their expedited and controlled introduction to therapy are obligatory. PMID:20805405

  20. BIG: a large-scale data integration tool for renal physiology

    PubMed Central

    Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya

    2016-01-01

    Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: “How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?” This is the type of problem that has motivated the “Big-Data” revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/. PMID:27279488

  1. A novel expression system for intracellular production and purification of recombinant affinity-tagged proteins in Aspergillus niger.

    PubMed

    Roth, Andreas H F J; Dersch, Petra

    2010-03-01

    A set of different integrative expression vectors for the intracellular production of recombinant proteins with or without affinity tag in Aspergillus niger was developed. Target genes can be expressed under the control of the highly efficient, constitutive pkiA promoter or the novel sucrose-inducible promoter of the beta-fructofuranosidase (sucA) gene of A. niger in the presence or absence of alternative carbon sources. All expression plasmids contain an identical multiple cloning sequence that allows parallel construction of N- or C-terminally His6- and StrepII-tagged versions of the target proteins. Production of two heterologous model proteins, the green fluorescence protein and the Thermobifida fusca hydrolase, proved the functionality of the vector system. Efficient production and easy detection of the target proteins as well as their fast purification by a one-step affinity chromatography, using the His6- or StrepII-tag sequence, was demonstrated.

  2. Evolutionary transitions between beneficial and phytopathogenic Rhodococcus challenge disease management

    PubMed Central

    Thomas, William J; Gordon, Michael I; Stevens, Danielle M; Creason, Allison L; Belcher, Michael S; Serdani, Maryna; Wiseman, Michele S; Grünwald, Niklaus J; Putnam, Melodie L

    2017-01-01

    Understanding how bacteria affect plant health is crucial for developing sustainable crop production systems. We coupled ecological sampling and genome sequencing to characterize the population genetic history of Rhodococcus and the distribution patterns of virulence plasmids in isolates from nurseries. Analysis of chromosome sequences shows that plants host multiple lineages of Rhodococcus, and suggested that these bacteria are transmitted due to independent introductions, reservoir populations, and point source outbreaks. We demonstrate that isolates lacking virulence genes promote beneficial plant growth, and that the acquisition of a virulence plasmid is sufficient to transition beneficial symbionts to phytopathogens. This evolutionary transition, along with the distribution patterns of plasmids, reveals the impact of horizontal gene transfer in rapidly generating new pathogenic lineages and provides an alternative explanation for pathogen transmission patterns. Results also uncovered a misdiagnosed epidemic that implicated beneficial Rhodococcus bacteria as pathogens of pistachio. The misdiagnosis perpetuated the unnecessary removal of trees and exacerbated economic losses. PMID:29231813

  3. Evolutionary transitions between beneficial and phytopathogenic Rhodococcus challenge disease management.

    PubMed

    Savory, Elizabeth A; Fuller, Skylar L; Weisberg, Alexandra J; Thomas, William J; Gordon, Michael I; Stevens, Danielle M; Creason, Allison L; Belcher, Michael S; Serdani, Maryna; Wiseman, Michele S; Grünwald, Niklaus J; Putnam, Melodie L; Chang, Jeff H

    2017-12-12

    Understanding how bacteria affect plant health is crucial for developing sustainable crop production systems. We coupled ecological sampling and genome sequencing to characterize the population genetic history of Rhodococcus and the distribution patterns of virulence plasmids in isolates from nurseries. Analysis of chromosome sequences shows that plants host multiple lineages of Rhodococcus, and suggested that these bacteria are transmitted due to independent introductions, reservoir populations, and point source outbreaks. We demonstrate that isolates lacking virulence genes promote beneficial plant growth, and that the acquisition of a virulence plasmid is sufficient to transition beneficial symbionts to phytopathogens. This evolutionary transition, along with the distribution patterns of plasmids, reveals the impact of horizontal gene transfer in rapidly generating new pathogenic lineages and provides an alternative explanation for pathogen transmission patterns. Results also uncovered a misdiagnosed epidemic that implicated beneficial Rhodococcus bacteria as pathogens of pistachio. The misdiagnosis perpetuated the unnecessary removal of trees and exacerbated economic losses.

  4. PhamDB: a web-based application for building Phamerator databases.

    PubMed

    Lamine, James G; DeJong, Randall J; Nelesen, Serita M

    2016-07-01

    PhamDB is a web application which creates databases of bacteriophage genes, grouped by gene similarity. It is backwards compatible with the existing Phamerator desktop software while providing an improved database creation workflow. Key features include a graphical user interface, validation of uploaded GenBank files, and abilities to import phages from existing databases, modify existing databases and queue multiple jobs. Source code and installation instructions for Linux, Windows and Mac OSX are freely available at https://github.com/jglamine/phage. PhamDB is also distributed as a docker image which can be managed via Kitematic. This docker image contains the application and all third party software dependencies as a pre-configured system, and is freely available via the installation instructions provided. Contact: snelesen@calvin.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  5. Dietary Intervention by Phytochemicals and Their Role in Modulating Coding and Non-Coding Genes in Cancer

    PubMed Central

    Budisan, Liviuta; Gulei, Diana; Zanoaga, Oana Mihaela; Irimie, Alexandra Iulia; Chira, Sergiu; Braicu, Cornelia; Gherman, Claudia Diana; Berindan-Neagoe, Ioana

    2017-01-01

    Phytochemicals are natural compounds synthesized as secondary metabolites in plants, representing an important source of molecules with a wide range of therapeutic applications. These natural agents are important regulators of key pathological processes/conditions, including cancer, as they are able to modulate the expression of coding and non-coding transcripts with an oncogenic or tumour suppressor role. These natural agents are currently exploited for the development of therapeutic strategies alone or in tandem with conventional treatments for cancer. The aim of this paper is to review the recent studies regarding the role of these natural phytochemicals in different processes related to cancer inhibition, including apoptosis activation, angiogenesis and metastasis suppression. From the large palette of phytochemicals we selected epigallocatechin gallate (EGCG), caffeic acid phenethyl ester (CAPE), genistein, morin and kaempferol, due to their increased activity in modulating multiple coding and non-coding genes, targeting the main hallmarks of cancer. PMID:28587155

  6. Dietary Intervention by Phytochemicals and Their Role in Modulating Coding and Non-Coding Genes in Cancer.

    PubMed

    Budisan, Liviuta; Gulei, Diana; Zanoaga, Oana Mihaela; Irimie, Alexandra Iulia; Sergiu, Chira; Braicu, Cornelia; Gherman, Claudia Diana; Berindan-Neagoe, Ioana

    2017-06-01

    Phytochemicals are natural compounds synthesized as secondary metabolites in plants, representing an important source of molecules with a wide range of therapeutic applications. These natural agents are important regulators of key pathological processes/conditions, including cancer, as they are able to modulate the expression of coding and non-coding transcripts with an oncogenic or tumour suppressor role. These natural agents are currently exploited for the development of therapeutic strategies alone or in tandem with conventional treatments for cancer. The aim of this paper is to review the recent studies regarding the role of these natural phytochemicals in different processes related to cancer inhibition, including apoptosis activation, angiogenesis and metastasis suppression. From the large palette of phytochemicals we selected epigallocatechin gallate (EGCG), caffeic acid phenethyl ester (CAPE), genistein, morin and kaempferol, due to their increased activity in modulating multiple coding and non-coding genes, targeting the main hallmarks of cancer.

  7. Self-excising Cre/mutant lox marker recycling system for multiple gene integrations and consecutive gene deletions in Aspergillus oryzae.

    PubMed

    Zhang, Silai; Ban, Akihiko; Ebara, Naoki; Mizutani, Osamu; Tanaka, Mizuki; Shintani, Takahiro; Gomi, Katsuya

    2017-04-01

    In this study, we developed a self-excising Cre/loxP-mediated marker recycling system with mutated lox sequences to introduce a number of biosynthetic genes into Aspergillus oryzae. To construct the self-excising marker cassette, both the selectable marker, the Aspergillus nidulans adeA gene, and the Cre recombinase gene (cre), conditionally expressed by the xylanase-encoding gene promoter, were designed to be located between the mutant lox sequences, lox66 and lox71. However, construction of the plasmid failed, possibly owing to a slight expression of cre downstream of the fungal gene promoter in Escherichia coli. Hence, to avoid the excision of the cassette in E. coli, a 71-bp intron of the A. oryzae xynG2 gene was inserted into the cre gene. The A. oryzae adeA deletion mutant was transformed with the resulting plasmid in the presence of glucose, and the transformants were cultured in medium containing xylose as the sole carbon source. PCR analysis of genomic DNA from resultant colonies revealed the excision of both the marker and Cre expression construct, indicating that the self-excising marker cassette was efficient at removing the selectable marker. Using the marker recycling system, hyperproduction of kojic acid could be achieved in A. oryzae by the introduction of two genes that encode oxidoreductase and transporter. Furthermore, we also constructed an alternative marker recycling cassette bearing the A. nidulans pyrithiamine resistant gene (ptrA) as a dominant selectable marker. Copyright © 2017 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  8. Interleukin 35 and Hepatocyte Growth Factor; as a novel combined immune gene therapy for Multiple Sclerosis disease.

    PubMed

    Moghadam, Samira; Erfanmanesh, Maryam; Esmaeilzadeh, Abdolreza

    2017-11-01

    Multiple Sclerosis (MS), an autoimmune demyelinating disease of the Central Nervous System, is a chronic inflammatory condition that mostly affects young adults. Affected people face functional loss with severe pain. Most current MS treatments focus on suppressing the immune response. Approved drugs suppress the inflammatory process but, in fact, there is no definitive cure for Multiple Sclerosis. Recent work has shown gene and cell therapy to be a promising approach in tissue regeneration. The authors propose a novel combined immune gene therapy for Multiple Sclerosis that uses the anti-inflammatory properties of Interleukin-35 and the remyelinating properties of Hepatocyte Growth Factor. In this hypothesis, the Interleukin-35 and Hepatocyte Growth Factor genes are introduced into Mesenchymal Stem Cells (MSCs) of the EAE mouse model via an adenovirus-based vector. It is expected that Interleukin-35 and Hepatocyte Growth Factor genes expressed from MSCs could perform effectively in immunotherapy of Multiple Sclerosis. Copyright © 2017. Published by Elsevier Ltd.

  9. [Gene deletion and functional analysis of the heptyl glycosyltransferase (waaF) gene in Vibrio parahaemolyticus O-antigen cluster].

    PubMed

    Zhao, Feng; Meng, Songsong; Zhou, Deqing

    2016-02-04

    To construct a heptyl glycosyltransferase II (waaF) gene deletion mutant of Vibrio parahaemolyticus and explore the function of the waaF gene in Vibrio parahaemolyticus. The waaF deletion mutant was constructed from clinical isolates by chitin-based transformation, and its growth rate, morphology and serotype were then characterized. Complementation constructs carrying waaF from different sources (O3, O5 and O10) were made by conjugative transfer from E. coli S17λpir strains to Vibrio parahaemolyticus, and the function of the waaF gene was further verified by serotyping. The waaF deletion mutant was successfully constructed and grew normally. Its growth rate and morphology were similar to those of the wild-type (WT) strains, but the mutant did not agglutinate with O antisera. Strains complemented with waaF from the O3 and O5 sources agglutinated with O antisera, whereas the strain complemented with waaF from the O10 source did not. The waaF gene is related to O-antigen synthesis and is the key gene of the O-antigen synthesis pathway in Vibrio parahaemolyticus. The function of waaF genes from different sources is not the same.

  10. Comparative genomics of duplicate γ-glutamyl transferase genes in teleosts: medaka (Oryzias latipes), stickleback (Gasterosteus aculeatus), green spotted pufferfish (Tetraodon nigroviridis), fugu (Takifugu rubripes), and zebrafish (Danio rerio).

    PubMed

    Law, Sheran Hiu Wan; Redelings, Benjamin David; Kullman, Seth William

    2012-01-15

    The availability of multiple teleost (bony fish) genomes is providing unprecedented opportunities to understand the diversity and function of gene duplication events using comparative genomics. Here we examine multiple paralogous genes of γ-glutamyl transferase (GGT) in several distantly related teleost species including medaka, stickleback, green spotted pufferfish, fugu, and zebrafish. Through mining genome databases, we have identified multiple GGT orthologs. Duplicate (paralogous) GGT sequences for GGT1 (GGT1 a and b), GGTL1 (GGTL1 a and b), and GGTL3 (GGTL3 a and b) were identified for each species. Phylogenetic analysis suggests that GGTs are ancient proteins conserved across most metazoan phyla and those paralogous GGTs in teleosts likely arose from the serial 3R genome duplication events. A third GGTL1 gene (GGTL1c) was found in green spotted pufferfish; however, this gene is not present in medaka, stickleback, or fugu. Similarly, one or both paralogs of GGTL3 appear to have been lost in green spotted pufferfish, fugu, and zebrafish. Syntenic relationships were highly maintained between duplicated teleost chromosomes, among teleosts and across ray-finned (Actinopterygii) and lobe-finned (Sarcopterygii) species. To assess subfunction partitioning, six medaka GGT genes were cloned and assessed for developmental and tissue-specific expression. On the basis of these data, we propose a modification of the "duplication-degeneration-complementation" model of subfunction partitioning where quantitative differences rather than absolute differences in gene expression are observed between gene paralogs. Our results demonstrate that multiple GGT genes have been retained within teleost genomes. Questions remain, however, regarding the functional roles of multiple GGTs in these species. Copyright © 2011 Wiley Periodicals, Inc., A Wiley Company.

  11. Enabling a systems biology knowledgebase with gaggle and firegoose

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baliga, Nitin S.

    The overall goal of this project was to extend the existing Gaggle and Firegoose systems to develop an open-source technology that runs over the web and links desktop applications with many databases and software applications. This technology would enable researchers to incorporate workflows for data analysis that can be executed from this interface to other online applications. The four specific aims were to (1) provide one-click mapping of genes, proteins, and complexes across databases and species; (2) enable multiple simultaneous workflows; (3) expand sophisticated data analysis for online resources; and (4) enhance open-source development of the Gaggle-Firegoose infrastructure. Gaggle is an open-source Java software system that integrates existing bioinformatics programs and data sources into a user-friendly, extensible environment to allow interactive exploration, visualization, and analysis of systems biology data. Firegoose is an extension to the Mozilla Firefox web browser that enables data transfer between websites and desktop tools including Gaggle. In the last phase of this funding period, we have made substantial progress on development and application of the Gaggle integration framework. We implemented the workspace to the Network Portal. Users can capture data from Firegoose and save them to the workspace. Users can create workflows to start multiple software components programmatically and pass data between them. Results of analysis can be saved to the cloud so that they can be easily restored on any machine. We also developed the Gaggle Chrome Goose, a plugin for the Google Chrome browser in tandem with an opencpu server in the Amazon EC2 cloud. This allows users to interactively perform data analysis on a single web page using the R packages deployed on the opencpu server. The cloud-based framework facilitates collaboration between researchers from multiple organizations.
We have made a number of enhancements to the cmonkey2 application to enable and improve the integration within different environments, and we have created a new tools pipeline for generating EGRIN2 models in a largely automated way.

  12. Usefulness of Housekeeping Genes for the Diagnosis of Helicobacter pylori Infection, Strain Discrimination and Detection of Multiple Infection.

    PubMed

    Palau, Montserrat; Kulmann, Marcos; Ramírez-Lázaro, María José; Lario, Sergio; Quilez, María Elisa; Campo, Rafael; Piqué, Núria; Calvet, Xavier; Miñana-Galbis, David

    2016-12-01

    Helicobacter pylori infects the stomachs of over half the world's population, evades the immune response and establishes a chronic infection. Although most people remain asymptomatic, duodenal and gastric ulcers, MALT lymphoma and progression to gastric cancer can develop. Several virulence factors such as flagella, lipopolysaccharide, adhesins and especially the vacuolating cytotoxin VacA and the oncoprotein CagA have been described for H. pylori. Despite the extensive published data on H. pylori, more research is needed to determine new virulence markers, the exact mode of transmission or the role of multiple infection. Amplification and sequencing of six housekeeping genes (amiA, cgt, cpn60, cpn70, dnaJ, and luxS) related to H. pylori pathogenesis have been performed in order to evaluate their usefulness for the specific detection of H. pylori, the genetic discrimination at strain level and the detection of multiple infection. A total of 52 H. pylori clones, isolated from 14 gastric biopsies from 11 patients, were analyzed for this purpose. All genes were specifically amplified for H. pylori and all clones isolated from different patients were discriminated, with gene distances ranging from 0.9 to 7.8%. Although most clones isolated from the same patient showed identical gene sequences, an event of multiple infection was detected in all the genes and microevolution events were shown for the amiA and cpn60 genes. These results suggest that housekeeping genes could be useful for H. pylori detection and to elucidate the mode of transmission and the relevance of multiple infection. © 2016 John Wiley & Sons Ltd.

  13. Stressing "Escherichia coli" to Educate Students about Research: A CURE to Investigate Multiple Levels of Gene Regulation

    ERIC Educational Resources Information Center

    McDonough, Janet; Goudsouzian, Lara K.; Papaj, Agllai; Maceli, Ashley R.; Klepac-Ceraj, Vanja; Peterson, Celeste N.

    2017-01-01

    Course-based undergraduate research experiences (CUREs) have been shown to increase student retention and learning in the biological sciences. Most CUREs cover only one aspect of gene regulation, such as transcriptional control. Here we present a new inquiry-based lab that engages understanding of gene expression from multiple perspectives.…

  14. Germline mutations in candidate predisposition genes in individuals with cutaneous melanoma and at least two independent additional primary cancers.

    PubMed

    Pritchard, Antonia L; Johansson, Peter A; Nathan, Vaishnavi; Howlie, Madeleine; Symmons, Judith; Palmer, Jane M; Hayward, Nicholas K

    2018-01-01

    While a number of autosomal dominant and autosomal recessive cancer syndromes have an associated spectrum of cancers, the prevalence and variety of cancer predisposition mutations in patients with multiple primary cancers have not been extensively investigated. An understanding of the variants predisposing to more than one cancer type could improve patient care, including screening and genetic counselling, as well as advancing the understanding of tumour development. A cohort of 57 patients ascertained due to their cutaneous melanoma (CM) diagnosis and with a history of two or more additional non-cutaneous independent primary cancer types were recruited for this study. Patient blood samples were assessed by whole exome or whole genome sequencing. We focussed on variants in 525 pre-selected genes, including 65 autosomal dominant and 31 autosomal recessive cancer predisposition genes, 116 genes involved in the DNA repair pathway, and 313 commonly somatically mutated in cancer. The same genes were analysed in exome sequence data from 1358 control individuals collected as part of non-cancer studies (UK10K). The identified variants were classified for pathogenicity using online databases, literature and in silico prediction tools. No known pathogenic autosomal dominant or previously described compound heterozygous mutations in autosomal recessive genes were observed in the multiple cancer cohort. Variants typically found somatically in haematological malignancies (in JAK1, JAK2, SF3B1, SRSF2, TET2 and TYK2) were present in lymphocyte DNA of patients with multiple primary cancers, all of whom had a history of haematological malignancy and cutaneous melanoma, as well as colorectal cancer and/or prostate cancer. Other potentially pathogenic variants were discovered in BUB1B, POLE2, ROS1 and DNMT3A. 
Compared to controls, multiple cancer cases had significantly more likely damaging mutations (nonsense, frameshift ins/del) in tumour suppressor and tyrosine kinase genes and higher overall burden of mutations in all cancer genes. We identified several pathogenic variants that likely predispose to at least one of the tumours in patients with multiple cancers. We additionally present evidence that there may be a higher burden of variants of unknown significance in 'cancer genes' in patients with multiple cancer types. Further screens of this nature need to be carried out to build evidence to show if the cancers observed in these patients form part of a cancer spectrum associated with single germline variants in these genes, whether multiple layers of susceptibility exist (oligogenic or polygenic), or if the occurrence of multiple different cancers is due to random chance.

  15. dbCPG: A web resource for cancer predisposition genes.

    PubMed

    Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng

    2016-06-21

    Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes.

  16. Molecular epidemiology and phylogenetic distribution of the Escherichia coli pks genomic island.

    PubMed

    Johnson, James R; Johnston, Brian; Kuskowski, Michael A; Nougayrede, Jean-Philippe; Oswald, Eric

    2008-12-01

    Epidemiological and phylogenetic associations of the pks genomic island of extraintestinal pathogenic Escherichia coli (ExPEC), which encodes the genotoxin colibactin, are incompletely defined. clbB and clbN (as markers for the 5' and 3' regions of the pks island, respectively), clbA and clbQ (as supplemental pks island markers), and 12 other putative ExPEC virulence genes were newly sought by PCR among 131 published E. coli isolates from hospitalized veterans (62 blood isolates and 69 fecal isolates). Blood and fecal isolates and clbB-positive and -negative isolates were compared for 66 newly and previously assessed traits. Among the 14 newly sought traits, clbB and clbN (colibactin polyketide synthesis system), hra (heat-resistant agglutinin), and vat (vacuolating toxin) were significantly associated with bacteremia. clbB and clbN identified a subset within phylogenetic group B2 with extremely high virulence scores and a high proportion of blood isolates. However, by multivariable analysis, other traits were more predictive of blood source than clbB and clbN were; indeed, among the newly sought traits, only pic significantly predicted bacteremia (negative association). By correspondence analysis, clbB and clbN were closely associated with group B2 and multiple B2-associated traits; by principal coordinate analysis, clbB and clbN partitioned the data set better than did blood versus fecal source. Thus, the pks island was significantly associated with bacteremia, multiple ExPEC-associated virulence genes, and group B2, and within group B2, it identified an especially high-virulence subset. This extends previous work regarding the pks island and supports investigation of the colibactin system as a potential therapeutic target.

  17. Differential diagnosis of Mendelian and mitochondrial disorders in patients with suspected multiple sclerosis

    PubMed Central

    Katz Sand, Ilana B.; Honce, Justin M.; Lublin, Fred D.

    2015-01-01

    Several single gene disorders share clinical and radiologic characteristics with multiple sclerosis and have the potential to be overlooked in the differential diagnostic evaluation of both adult and paediatric patients with multiple sclerosis. This group includes lysosomal storage disorders, various mitochondrial diseases, other neurometabolic disorders, and several other miscellaneous disorders. Recognition of a single-gene disorder as causal for a patient’s ‘multiple sclerosis-like’ phenotype is critically important for accurate direction of patient management, and evokes broader genetic counselling implications for affected families. Here we review single gene disorders that have the potential to mimic multiple sclerosis, provide an overview of clinical and investigational characteristics of each disorder, and present guidelines for when clinicians should suspect an underlying heritable disorder that requires diagnostic confirmation in a patient with a definite or probable diagnosis of multiple sclerosis. PMID:25636970

  18. Methods for simultaneous control of lignin content and composition, and cellulose content in plants

    DOEpatents

    Chiang, Vincent Lee C.; Li, Laigeng

    2005-02-15

    A method of concurrently introducing multiple genes into plants and trees is provided. The method includes simultaneous transformation of plants with multiple genes from the phenylpropanoid pathways including 4CL, CAld5H, AldOMT, SAD and CAD genes and combinations thereof to produce various lines of transgenic plants displaying altered agronomic traits. The agronomic traits of the plants are regulated by the orientation of the specific genes and the selected gene combinations, which are incorporated into the plant genome.

  19. 46 CFR 111.10-5 - Multiple energy sources.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 46 Shipping 4 2010-10-01 2010-10-01 false Multiple energy sources. 111.10-5 Section 111.10-5...-GENERAL REQUIREMENTS Power Supply § 111.10-5 Multiple energy sources. Failure of any single generating set energy source such as a boiler, diesel, gas turbine, or steam turbine must not cause all generating sets...

  20. 46 CFR 111.10-5 - Multiple energy sources.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 46 Shipping 4 2011-10-01 2011-10-01 false Multiple energy sources. 111.10-5 Section 111.10-5...-GENERAL REQUIREMENTS Power Supply § 111.10-5 Multiple energy sources. Failure of any single generating set energy source such as a boiler, diesel, gas turbine, or steam turbine must not cause all generating sets...

  1. 46 CFR 111.10-5 - Multiple energy sources.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 46 Shipping 4 2013-10-01 2013-10-01 false Multiple energy sources. 111.10-5 Section 111.10-5...-GENERAL REQUIREMENTS Power Supply § 111.10-5 Multiple energy sources. Failure of any single generating set energy source such as a boiler, diesel, gas turbine, or steam turbine must not cause all generating sets...

  2. 46 CFR 111.10-5 - Multiple energy sources.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 46 Shipping 4 2014-10-01 2014-10-01 false Multiple energy sources. 111.10-5 Section 111.10-5...-GENERAL REQUIREMENTS Power Supply § 111.10-5 Multiple energy sources. Failure of any single generating set energy source such as a boiler, diesel, gas turbine, or steam turbine must not cause all generating sets...

  3. 46 CFR 111.10-5 - Multiple energy sources.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 46 Shipping 4 2012-10-01 2012-10-01 false Multiple energy sources. 111.10-5 Section 111.10-5...-GENERAL REQUIREMENTS Power Supply § 111.10-5 Multiple energy sources. Failure of any single generating set energy source such as a boiler, diesel, gas turbine, or steam turbine must not cause all generating sets...

  4. Whole gene expression profile in blood reveals multiple pathways deregulation in R6/2 mouse model

    PubMed Central

    2013-01-01

    Background Huntington Disease (HD) is a progressive neurological disorder, with pathological manifestations in brain areas and in periphery caused by the ubiquitous expression of mutant Huntingtin protein. Transcriptional dysregulation is considered a key molecular mechanism responsible for HD pathogenesis but, although numerous studies investigated mRNA alterations in HD, so far none evaluated a whole gene expression profile in blood of the R6/2 mouse model. Findings To discover novel pathogenic mechanisms and potential peripheral biomarkers useful to monitor disease progression or drug efficacy, a microarray study was performed in blood of R6/2 at manifest stage and wild type littermate mice. This approach allowed us to propose new peripheral molecular processes involved in HD and to suggest different panels of candidate biomarkers. Among the discovered deregulated processes, we focused on specific ones: complement and coagulation cascades, PPAR signaling, cardiac muscle contraction, and dilated cardiomyopathy pathways. Selected genes derived from these pathways were additionally investigated in other accessible tissues to validate these matrices as a source of biomarkers, and in brain, to link central and peripheral disease manifestations. Conclusions Our findings validated the skeletal muscle as a suitable source to investigate peripheral transcriptional alterations in HD and supported the hypothesis that immunological alteration may contribute to neurological degeneration. Moreover, the identification of altered signaling in mouse blood reinforces the R6/2 transgenic mouse as a powerful HD model while suggesting novel disease biomarkers for pre-clinical investigation. PMID:24252798

  5. Precipitation of alacranite (As8S9) by a novel As(V)-respiring anaerobe strain MPA-C3.

    PubMed

    Mumford, Adam C; Yee, Nathan; Young, Lily Y

    2013-10-01

    Strain MPA-C3 was isolated by incubating arsenic-bearing sediments under anaerobic, mesophilic conditions in minimal media with acetate as the sole source of energy and carbon, and As(V) as the sole electron acceptor. Following growth and the respiratory reduction of As(V) to As(III), a yellow precipitate formed in active cultures, while no precipitate was observed in autoclaved controls, or in uninoculated media supplemented with As(III). The precipitate was identified by X-ray diffraction as alacranite, As8S9, a mineral previously only identified in hydrothermal environments. Sequencing of the 16S rRNA gene indicated that strain MPA-C3 is a member of the Deferribacteres family, with relatively low (90%) identity to Denitrovibrio acetiphilus DSM 12809. The arsenate respiratory reductase gene, arrA, was sequenced, showing high homology to the arrA gene of Desulfitobacterium hafniense. In addition to As(V), strain MPA-C3 utilizes NO3(-), Se(VI), Se(IV), fumarate and Fe(III) as electron acceptors, and acetate, pyruvate, fructose and benzoate as sources of carbon and energy. Analysis of a draft genome sequence revealed multiple pathways for respiration and carbon utilization. The results of this work demonstrate that alacranite, a mineral previously thought to be formed only chemically under hydrothermal conditions, is precipitated under mesophilic conditions by the metabolically versatile strain MPA-C3. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  6. Multiple introductions of a reassortant H5N1 avian influenza virus of clade 2.3.2.1c with PB2 gene of H9N2 subtype into Indian poultry.

    PubMed

    Tosh, Chakradhar; Nagarajan, Shanmugasundaram; Kumar, Manoj; Murugkar, Harshad V; Venkatesh, Govindarajulu; Shukla, Shweta; Mishra, Amit; Mishra, Pranav; Agarwal, Sonam; Singh, Bharati; Dubey, Prashant; Tripathi, Sushil; Kulkarni, Diwakar D

    2016-09-01

    Highly pathogenic avian influenza (HPAI) H5N1 viruses are a threat to poultry in Asia, Europe, Africa and North America. Here, we report isolation and characterization of H5N1 viruses isolated from ducks and turkeys in Kerala, Chandigarh and Uttar Pradesh, India between November 2014 and March 2015. Genetic and phylogenetic analyses of the haemagglutinin gene identified that the virus belonged to a new clade 2.3.2.1c which has not been detected earlier in Indian poultry. The virus possessed a molecular signature for high pathogenicity to chickens, which was corroborated by an intravenous pathogenicity index of 2.96. The virus was a reassortant which derives its PB2 gene from H9N2 virus isolated in China during 2007-2013. However, the neuraminidase and internal genes are of H5N1 subtype. Phylogenetic and network analysis revealed that after detection in China in 2013/2014, the virus moved to Europe, West Africa and other Asian countries including India. The analyses further indicated multiple introductions of H5N1 virus in Indian poultry and internal spread in Kerala. One of the outbreaks in ducks in Kerala is linked to the H5N1 virus isolated from wild birds in Dubai, suggesting movement of the virus probably through migration of wild birds. However, the outbreaks in ducks in Chandigarh and Uttar Pradesh were from an unknown source in Asia which also contributed gene pools to the outbreaks in Europe and West Africa. The widespread incidence of the novel H5N1 HPAI is similar to the spread of clade 2.2 ("Qinghai-like") virus in 2005, and should be monitored to avoid threat to animal and public health. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Tumor suppressor NDRG2 inhibits glycolysis and glutaminolysis in colorectal cancer cells by repressing c-Myc expression

    PubMed Central

    Chu, Dake; Wei, Li; Li, Xia; Yang, Guodong; Liu, Xinping; Yao, Libo; Zhang, Jian; Shen, Lan

    2015-01-01

    Cancer cells use glucose and glutamine as the major sources of energy and precursor intermediates, and enhanced glycolysis and glutaminolysis are the major hallmarks of metabolic reprogramming in cancer. Oncogene activation and tumor suppressor gene inactivation alter multiple intracellular signaling pathways that affect glycolysis and glutaminolysis. N-Myc downstream regulated gene 2 (NDRG2) is a tumor suppressor gene inhibiting cancer growth, metastasis and invasion. However, the role and molecular mechanism of NDRG2 in cancer metabolism remain unclear. In this study, we discovered the role of the tumor suppressor gene NDRG2 in aerobic glycolysis and glutaminolysis of cancer cells. NDRG2 inhibited glucose consumption and lactate production, glutamine consumption and glutamate production in colorectal cancer cells. Analysis of glucose transporters and the catalytic enzymes involved in glycolysis revealed that glucose transporter 1 (GLUT1), hexokinase 2 (HK2), pyruvate kinase M2 isoform (PKM2) and lactate dehydrogenase A (LDHA) were significantly suppressed by NDRG2. Analysis of the glutamine transporter and the catalytic enzymes involved in glutaminolysis revealed that glutamine transporter ASC amino-acid transporter 2 (ASCT2) and glutaminase 1 (GLS1) were also significantly suppressed by NDRG2. Transcription factor c-Myc mediated inhibition of glycolysis and glutaminolysis by NDRG2. More importantly, NDRG2 inhibited the expression of c-Myc by suppressing the expression of β-catenin, which can transcriptionally activate the C-MYC gene in the nucleus. In addition, the growth and proliferation of colorectal cancer cells were suppressed significantly by NDRG2 through inhibition of glycolysis and glutaminolysis. Taken together, these findings indicate that NDRG2 functions as an essential regulator in glycolysis and glutaminolysis via repression of c-Myc, and acts as a suppressor of carcinogenesis by coordinately targeting glucose and glutamine transporters and multiple catalytic enzymes involved in glycolysis and glutaminolysis, which supply the bioenergy and biomaterials needed for cancer proliferation and progression. PMID:26317652

  8. Gene transcription in sea otters (Enhydra lutris); development of a diagnostic tool for sea otter and ecosystem health

    USGS Publications Warehouse

    Bowen, Lizabeth; Miles, A. Keith; Murray, Michael; Haulena, Martin; Tuttle, Judy; van Bonn, William; Adams, Lance; Bodkin, James L.; Ballachey, Brenda E.; Estes, James A.; Tinker, M. Tim; Keister, Robin; Stott, Jeffrey L.

    2012-01-01

    Gene transcription analysis for diagnosing or monitoring wildlife health requires the ability to distinguish pathophysiological change from natural variation. Herein, we describe methodology for the development of quantitative real-time polymerase chain reaction (qPCR) assays to measure differential transcript levels of multiple immune function genes in the sea otter (Enhydra lutris); sea otter-specific qPCR primer sequences for the genes of interest are defined. We establish a ‘reference’ range of transcripts for each gene in a group of clinically healthy captive and free-ranging sea otters. The 10 genes of interest represent multiple physiological systems that play a role in immuno-modulation, inflammation, cell protection, tumour suppression, cellular stress response, xenobiotic metabolizing enzymes, antioxidant enzymes and cell–cell adhesion. The cycle threshold (CT) measures for most genes were normally distributed; the complement cytolysis inhibitor was the exception. The relative enumeration of multiple gene transcripts in simple peripheral blood samples expands the diagnostic capability currently available to assess the health of sea otters in situ and provides a better understanding of the state of their environment.
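    The abstract above does not say how the per-gene reference range was derived. As a minimal sketch only, the Python below assumes a range of mean ± 2 sample standard deviations, which is a common convention for approximately normally distributed measures and would not suit the one gene reported as non-normal. The function names, the multiplier k and the CT values are all hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def reference_range(ct_values, k=2.0):
    """Return (low, high) as mean +/- k sample standard deviations.

    Assumption: the CT values are roughly normally distributed.
    """
    m = mean(ct_values)
    s = stdev(ct_values)
    return (m - k * s, m + k * s)


def is_within(ct, ref):
    """True if a single CT measurement falls inside the reference range."""
    low, high = ref
    return low <= ct <= high


# Hypothetical CT values for one gene in clinically healthy animals.
healthy_ct = [22.0, 23.0, 24.0, 23.0, 23.0]
ref = reference_range(healthy_ct)
```

    A new sample would then be flagged for that gene when is_within returns False; whether a two-sided interval of this width is appropriate is a design choice the source does not settle.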

  9. Color-deficient cone mosaics associated with Xq28 opsin mutations: A stop codon versus gene deletions

    PubMed Central

    Wagner-Schuman, Melissa; Neitz, Jay; Rha, Jungtae; Williams, David R.; Neitz, Maureen; Carroll, Joseph

    2010-01-01

    Our understanding of the etiology of red-green color vision defects is evolving. While missense mutations within the long- (L-) and middle-wavelength sensitive (M-) photopigments and gross rearrangements within the L/M-opsin gene array are commonly associated with red-green defects, recent work using adaptive optics retinal imaging has shown that different genotypes can have distinct consequences for the cone mosaic. Here we examined the cone mosaic in red-green color deficient individuals with multiple X-chromosome opsin genes that encode L opsin, as well as individuals with a single X-chromosome opsin gene that encodes L opsin and a single patient with a novel premature termination codon in his M-opsin gene and a normal L-opsin gene. We observed no difference in cone density between normal trichromats and multiple or single gene dichromats. In addition, we demonstrate different phenotypic effects of a nonsense mutation versus the previously described deleterious polymorphism, (LIAVA), both of which differ from multiple and single gene dichromats. Our results help refine the relationship between opsin genotype and cone photoreceptor mosaic phenotype. PMID:20854834

  10. Effect of CO2 on NADH production of denitrifying microbes via inhibiting carbon source transport and its metabolism.

    PubMed

    Wan, Rui; Chen, Yinguang; Zheng, Xiong; Su, Yinglong; Huang, Haining

    2018-06-15

    The potential effect of CO2 on environmental microbes has drawn much attention recently. As an important section of the nitrogen cycle, biological denitrification requires an electron donor to reduce nitrogen oxide. Nicotinamide adenine dinucleotide (NADH), which is formed during carbon source metabolism, is a widely reported electron donor for denitrification. Here we studied the effect of CO2 on NADH production and carbon source utilization in the denitrifying microbe Paracoccus denitrificans. We observed that NADH level was decreased by 45.5% with the increase of CO2 concentration from 0 to 30,000 ppm, which was attributed to the significantly decreased utilization of carbon source (i.e., acetate). Further study showed that CO2 inhibited carbon source utilization because of multiple negative influences: (1) suppressing the growth and viability of denitrifier cells, (2) weakening the driving force for carbon source transport by decreasing bacterial membrane potential, and (3) downregulating the expression of genes encoding key enzymes involved in intracellular carbon metabolism, such as citrate synthase, aconitate hydratase, isocitrate dehydrogenase, succinate dehydrogenase, and fumarate reductase. This study suggests that the inhibitory effect of CO2 on NADH production in denitrifiers might deteriorate the denitrification performance in an elevated CO2 climate scenario. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Homology-integrated CRISPR-Cas (HI-CRISPR) system for one-step multigene disruption in Saccharomyces cerevisiae.

    PubMed

    Bao, Zehua; Xiao, Han; Liang, Jing; Zhang, Lu; Xiong, Xiong; Sun, Ning; Si, Tong; Zhao, Huimin

    2015-05-15

    One-step multiple gene disruption in the model organism Saccharomyces cerevisiae is a highly useful tool for both basic and applied research, but it remains a challenge. Here, we report a rapid, efficient, and potentially scalable strategy based on the type II Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR associated proteins (Cas) system to generate multiple gene disruptions simultaneously in S. cerevisiae. A 100 bp dsDNA mutagenizing homologous recombination donor is inserted between two direct repeats for each target gene in a CRISPR array consisting of multiple donor and guide sequence pairs. An ultrahigh copy number plasmid carrying iCas9, a variant of wild-type Cas9, trans-encoded RNA (tracrRNA), and a homology-integrated crRNA cassette is designed to greatly increase the gene disruption efficiency. As proof of concept, three genes, CAN1, ADE2, and LYP1, were simultaneously disrupted in 4 days with an efficiency ranging from 27 to 87%. Another three genes involved in an artificial hydrocortisone biosynthetic pathway, ATF2, GCY1, and YPR1, were simultaneously disrupted in 6 days with 100% efficiency. This homology-integrated CRISPR (HI-CRISPR) strategy represents a powerful tool for creating yeast strains with multiple gene knockouts.

  12. Rice DB: an Oryza Information Portal linking annotation, subcellular location, function, expression, regulation, and evolutionary information for rice and Arabidopsis

    PubMed Central

    Narsai, Reena; Devenish, James; Castleden, Ian; Narsai, Kabir; Xu, Lin; Shou, Huixia; Whelan, James

    2013-01-01

    Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or ‘expressology’, thus using and combining available information from two pre-eminent plant models. To establish Rice DB, gene identifiers (more than 40 types) and annotations from a variety of sources were compiled, functional information based on large-scale and individual studies was manually collated, hundreds of microarrays were analysed to generate expression annotations, and the occurrences of potential functional regulatory motifs in promoter regions were calculated. A range of computational subcellular localization predictions were also run for all putative proteins encoded in the rice genome, and experimentally confirmed protein localizations have been collated, curated and linked to functional studies in rice. A single search box allows anything from gene identifiers (for rice and/or Arabidopsis), motif sequences, subcellular location, to keyword searches to be entered, with the capability of Boolean searches (such as AND/OR). To demonstrate the utility of Rice DB, several examples are presented including a rice mitochondrial proteome, which draws on a variety of sources for subcellular location data within Rice DB. Comparisons of subcellular location, functional annotations, as well as transcript expression in parallel with Arabidopsis reveals examples of conservation between rice and Arabidopsis, using Rice DB (http://ricedb.plantenergy.uwa.edu.au). PMID:24147765
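    The single search box with Boolean AND/OR described above can be illustrated with a toy keyword filter. The Python sketch below is not Rice DB's implementation or API; the record identifiers, the annotation strings and the search function are hypothetical placeholders chosen only to show how AND and OR combine terms.

```python
def search(records, terms, mode="AND"):
    """Return ids of records whose annotation text matches the terms.

    mode="AND" requires every term to be present; mode="OR" requires at
    least one. Matching is a case-insensitive substring test.
    """
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    combine = all if mode == "AND" else any
    lowered = [t.lower() for t in terms]
    return [
        rid
        for rid, text in records.items()
        if combine(t in text.lower() for t in lowered)
    ]


# Hypothetical annotation records keyed by placeholder identifiers.
records = {
    "gene_a": "mitochondrion transporter",
    "gene_b": "chloroplast transporter",
    "gene_c": "mitochondrion kinase",
}

both = search(records, ["mitochondrion", "transporter"], mode="AND")
either = search(records, ["mitochondrion", "transporter"], mode="OR")
```

    Here AND narrows the result to the one record carrying both keywords, while OR returns every record carrying either.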

  13. Rice DB: an Oryza Information Portal linking annotation, subcellular location, function, expression, regulation, and evolutionary information for rice and Arabidopsis.

    PubMed

    Narsai, Reena; Devenish, James; Castleden, Ian; Narsai, Kabir; Xu, Lin; Shou, Huixia; Whelan, James

    2013-12-01

    Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or 'expressology', thus using and combining available information from two pre-eminent plant models. To establish Rice DB, gene identifiers (more than 40 types) and annotations from a variety of sources were compiled, functional information based on large-scale and individual studies was manually collated, hundreds of microarrays were analysed to generate expression annotations, and the occurrences of potential functional regulatory motifs in promoter regions were calculated. A range of computational subcellular localization predictions were also run for all putative proteins encoded in the rice genome, and experimentally confirmed protein localizations have been collated, curated and linked to functional studies in rice. A single search box allows anything from gene identifiers (for rice and/or Arabidopsis), motif sequences, subcellular location, to keyword searches to be entered, with the capability of Boolean searches (such as AND/OR). To demonstrate the utility of Rice DB, several examples are presented including a rice mitochondrial proteome, which draws on a variety of sources for subcellular location data within Rice DB. Comparisons of subcellular location, functional annotations, as well as transcript expression in parallel with Arabidopsis reveals examples of conservation between rice and Arabidopsis, using Rice DB (http://ricedb.plantenergy.uwa.edu.au). © 2013 The Authors The Plant Journal © 2013 John Wiley & Sons Ltd.

  14. The detection and phylogenetic analysis of the alkane 1-monooxygenase gene of members of the genus Rhodococcus.

    PubMed

    Táncsics, András; Benedek, Tibor; Szoboszlay, Sándor; Veres, Péter G; Farkas, Milán; Máthé, István; Márialigeti, Károly; Kukolya, József; Lányi, Szabolcs; Kriszt, Balázs

    2015-02-01

    Naturally occurring and anthropogenic petroleum hydrocarbons are potential carbon sources for many bacteria. The AlkB-related alkane hydroxylases, which are integral membrane non-heme iron enzymes, play a key role in the microbial degradation of many of these hydrocarbons. Several members of the genus Rhodococcus are well-known alkane degraders and are known to harbor multiple alkB genes encoding different alkane 1-monooxygenases. In the present study, 48 Rhodococcus strains, representing 35 species of the genus, were investigated to find out whether there was a dominant type of alkB gene widespread among species of the genus that could be used as a phylogenetic marker. Phylogenetic analysis of rhodococcal alkB gene sequences indicated that a certain type of alkB gene was present in almost every member of the genus Rhodococcus. These alkB genes shared a unique nucleotide sequence stretch, absent from other types of rhodococcal alkB genes, that encoded a conserved amino acid motif: WLG(I/V/L)D(G/D)GL. The sequence identity of the targeted alkB gene in Rhodococcus ranged from 78.5 to 99.2% and showed higher nucleotide sequence variation at the inter-species level compared to the 16S rRNA gene (93.9-99.8%). The results indicated that the alkB gene type investigated might be applicable for: (i) differentiating closely related Rhodococcus species, (ii) properly assigning environmental isolates to existing Rhodococcus species, and finally (iii) assessing whether a new Rhodococcus isolate represents a novel species of the genus. Copyright © 2014 Elsevier GmbH. All rights reserved.
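    The conserved motif WLG(I/V/L)D(G/D)GL quoted above maps directly onto a regular expression, and the reported identity ranges rest on a pairwise percent-identity calculation. The Python sketch below illustrates both under stated assumptions: the helper names and test sequences are hypothetical, and percent_identity expects two already aligned sequences of equal length with no gap handling, which is simpler than whatever alignment pipeline the authors used.

```python
import re

# WLG(I/V/L)D(G/D)GL written as a regex character-class pattern.
MOTIF = re.compile(r"WLG[IVL]D[GD]GL")


def has_motif(protein_seq):
    """True if the protein sequence contains the conserved motif."""
    return MOTIF.search(protein_seq) is not None


def percent_identity(a, b):
    """Percent of identical positions in two pre-aligned sequences.

    Assumption: a and b have equal, non-zero length; gaps are not
    treated specially.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

    For example, has_motif("MKWLGIDGGLA") is True because the embedded stretch WLGIDGGL fits the pattern, while a sequence with any other residue at the fourth motif position would not match.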

  15. dbWFA: a web-based database for functional annotation of Triticum aestivum transcripts

    PubMed Central

    Vincent, Jonathan; Dai, Zhanwu; Ravel, Catherine; Choulet, Frédéric; Mouzeyar, Said; Bouzidi, M. Fouad; Agier, Marie; Martre, Pierre

    2013-01-01

    The functional annotation of genes based on sequence homology with genes from model species genomes is time-consuming because it is necessary to mine several unrelated databases. The aim of the present work was to develop a functional annotation database for common wheat Triticum aestivum (L.). The database, named dbWFA, is based on the reference NCBI UniGene set, an expressed gene catalogue built by expressed sequence tag clustering, and on full-length coding sequences retrieved from the TriFLDB database. Information from good-quality heterogeneous sources, including annotations for model plant species Arabidopsis thaliana (L.) Heynh. and Oryza sativa L., was gathered and linked to T. aestivum sequences through BLAST-based homology searches. Even though the complexity of the transcriptome cannot yet be fully appreciated, we developed a tool to easily and promptly obtain information from multiple functional annotation systems (Gene Ontology, MapMan bin codes, MIPS Functional Categories, PlantCyc pathway reactions and TAIR gene families). The use of dbWFA is illustrated here with several query examples. We were able to assign a putative function to 45% of the UniGenes and 81% of the full-length coding sequences from TriFLDB. Moreover, comparison of the annotation of the whole T. aestivum UniGene set along with curated annotations of the two model species assessed the accuracy of the annotation provided by dbWFA. To further illustrate the use of dbWFA, genes specifically expressed during the early cell division or late storage polymer accumulation phases of T. aestivum grain development were identified using a clustering analysis and then annotated using dbWFA. The annotation of these two sets of genes was consistent with previous analyses of T. aestivum grain transcriptomes and proteomes. Database URL: urgi.versailles.inra.fr/dbWFA/ PMID:23660284

  16. MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers

    PubMed Central

    Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier

    2017-01-01

    Background The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. Objective MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. Methods MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. Results MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user’s specific interests and provides an efficient way to share information with collaborators. Furthermore, the user’s behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. Conclusions We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends. PMID:28623182

  17. Combining lipophilic dye, in situ hybridization, immunohistochemistry, and histology.

    PubMed

    Duncan, Jeremy; Kersigo, Jennifer; Gray, Brian; Fritzsch, Bernd

    2011-03-17

    Going beyond single gene function to cut deeper into gene regulatory networks requires multiple mutations combined in a single animal. Such analysis of two or more genes needs to be complemented with in situ hybridization of other genes, or immunohistochemistry of their proteins, both in whole mounted developing organs or sections for detailed resolution of the cellular and tissue expression alterations. Combining multiple gene alterations requires the use of Cre or flippase to conditionally delete genes and avoid embryonic lethality. Required breeding schemes dramatically increase effort and cost in proportion to the number of genes mutated, with an outcome of very few animals with the full repertoire of genetic modifications desired. Amortizing the vast amount of effort and time to obtain these few precious specimens that are carrying multiple mutations necessitates tissue optimization. Moreover, investigating a single animal with multiple techniques makes it easier to correlate gene deletion defects with expression profiles. We have developed a technique to obtain a more thorough analysis of a given animal, with the ability to analyze several different histologically recognizable structures as well as gene and protein expression all from the same specimen in both whole mounted organs and sections. Although mice have been utilized to demonstrate the effectiveness of this technique, it can be applied to a wide array of animals. To do this we combine lipophilic dye tracing, whole mount in situ hybridization, immunohistochemistry, and histology to extract the maximal possible amount of data.

  18. Combining Lipophilic dye, in situ Hybridization, Immunohistochemistry, and Histology

    PubMed Central

    Duncan, Jeremy; Kersigo, Jennifer; Gray, Brian; Fritzsch, Bernd

    2011-01-01

    Going beyond single gene function to cut deeper into gene regulatory networks requires multiple mutations combined in a single animal. Such analysis of two or more genes needs to be complemented with in situ hybridization of other genes, or immunohistochemistry of their proteins, both in whole mounted developing organs or sections for detailed resolution of the cellular and tissue expression alterations. Combining multiple gene alterations requires the use of Cre or flippase to conditionally delete genes and avoid embryonic lethality. Required breeding schemes dramatically enhance effort and cost proportional to the number of genes mutated, with an outcome of very few animals with the full repertoire of genetic modifications desired. Amortizing the vast amount of effort and time to obtain these few precious specimens that are carrying multiple mutations necessitates tissue optimization. Moreover, investigating a single animal with multiple techniques makes it easier to correlate gene deletion defects with expression profiles. We have developed a technique to obtain a more thorough analysis of a given animal; with the ability to analyze several different histologically recognizable structures as well as gene and protein expression all from the same specimen in both whole mounted organs and sections. Although mice have been utilized to demonstrate the effectiveness of this technique, it can be applied to a wide array of animals. To do this we combine lipophilic dye tracing, whole mount in situ hybridization, immunohistochemistry, and histology to extract the maximal possible amount of data. PMID:21445047

  19. Catabolite and Oxygen Regulation of Enterohemorrhagic Escherichia coli Virulence.

    PubMed

    Carlson-Banning, Kimberly M; Sperandio, Vanessa

    2016-11-22

    The biogeography of the gut is diverse in its longitudinal axis, as well as within specific microenvironments. Differential oxygenation and nutrient composition drive the membership of microbial communities in these habitats. Moreover, enteric pathogens can orchestrate further modifications to gain a competitive advantage toward host colonization. These pathogens are versatile and adept when exploiting the human colon. They expertly navigate complex environmental cues and interkingdom signaling to colonize and infect their hosts. Here we demonstrate how enterohemorrhagic Escherichia coli (EHEC) uses three sugar-sensing transcription factors, Cra, KdpE, and FusR, to exquisitely regulate the expression of virulence factors associated with its type III secretion system (T3SS) when exposed to various oxygen concentrations. We also explored the effect of mucin-derived nonpreferred carbon sources on EHEC growth and expression of virulence genes. Taken together, the results show that EHEC represses the expression of its T3SS when oxygen is absent, mimicking the largely anaerobic lumen, and activates its T3SS when oxygen is available through Cra. In addition, when EHEC senses mucin-derived sugars heavily present in the O-linked and N-linked glycans of the large intestine, virulence gene expression is initiated. Sugars derived from pectin, a complex plant polysaccharide digested in the large intestine, also increased virulence gene expression. Not only does EHEC sense host- and microbiota-derived interkingdom signals, it also uses oxygen availability and mucin-derived sugars liberated by the microbiota to stimulate expression of the T3SS. This precision in gene regulation allows EHEC to be an efficient pathogen with an extremely low infectious dose. Enteric pathogens have to be crafty when interpreting multiple environmental cues to successfully establish themselves within complex and diverse gut microenvironments. 
    Differences in oxygen tension and nutrient composition determine the biogeography of the gut microbiota and provide unique niches that can be exploited by enteric pathogens. EHEC is an enteric pathogen that colonizes the colon and causes outbreaks of bloody diarrhea and hemolytic-uremic syndrome worldwide. It has a very low infectious dose, which requires it to be an extremely effective pathogen. Hence, here we show that EHEC senses multiple sugar sources and oxygen levels to optimally control the expression of its virulence repertoire. This exquisite regulatory control equips EHEC to sense different intestinal compartments to colonize the host. Copyright © 2016 Carlson-Banning and Sperandio.

  20. Learning style and concept acquisition of community college students in introductory biology

    NASA Astrophysics Data System (ADS)

    Bobick, Sandra Burin

    This study investigated the influence of learning style on concept acquisition within a sample of community college students in a general biology course. There are two subproblems within the larger problem: (1) the influence of demographic variables (age, gender, number of college credits, prior exposure to scientific information) on learning style, and (2) the correlations between prior scientific knowledge, learning style and student understanding of the concept of the gene. The sample included all students enrolled in an introductory general biology course during two consecutive semesters at an urban community college. Initial data was gathered during the first week of the semester, at which time students filled in a short questionnaire (age, gender, number of college credits, prior exposure to science information either through reading/visual sources or a prior biology course). Subjects were then given the Inventory of Learning Processes-Revised (ILP-R), which measures general preferences in five learning styles: Deep Learning, Elaborative Learning, Agentic Learning, Methodical Learning and Literal Memorization. Subjects were then given the Gene Conceptual Knowledge pretest: a 15-question objective section and an essay section. Subjects were exposed to specific concepts during lecture and laboratory exercises. At the last lab, students were given the Genetics Conceptual Knowledge Posttest. Pretest/posttest gains were correlated with demographic variables and learning styles were analyzed for significant correlations. Learning styles, as the independent variable in a simultaneous multiple regression, were significant predictors of results on the gene assessment tests, including pretest, posttest and gain. Of the learning styles, Deep Learning accounted for the greatest positive predictive value of pretest essay and pretest objective results. Literal Memorization was a significant negative predictor for posttest essay, essay gain and objective gain. Simultaneous multiple regression indicated that demographic variables were significant positive predictors for Methodical, Deep and Elaborative Learning Styles. Stepwise multiple regression resulted in number of credits, Read Science and gender (female) as significant predictors of learning styles. The findings of this study emphasize the importance of learning styles in conceptual understanding of the gene and the correlation of nonformal exposure to science information with learning style and conceptual understanding.

  1. Array data extractor (ADE): a LabVIEW program to extract and merge gene array data

    PubMed Central

    2013-01-01

    Background Large data sets from gene expression array studies are publicly available offering information highly valuable for research across many disciplines ranging from fundamental to clinical research. Highly advanced bioinformatics tools have been made available to researchers, but a demand for user-friendly software allowing researchers to quickly extract expression information for multiple genes from multiple studies persists. Findings Here, we present a user-friendly LabVIEW program to automatically extract gene expression data for a list of genes from multiple normalized microarray datasets. Functionality was tested for 288 class A G protein-coupled receptors (GPCRs) and expression data from 12 studies comparing normal and diseased human hearts. Results confirmed known regulation of a beta 1 adrenergic receptor and further indicate novel research targets. Conclusions Although existing software allows for complex data analyses, the LabVIEW based program presented here, “Array Data Extractor (ADE)”, provides users with a tool to retrieve meaningful information from multiple normalized gene expression datasets in a fast and easy way. Further, the graphical programming language used in LabVIEW allows applying changes to the program without the need of advanced programming knowledge. PMID:24289243
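
    As a rough illustration of the extract-and-merge step described above (this is not the authors' LabVIEW implementation), the Python sketch below pulls values for a gene list out of several normalized datasets and merges them into one table. The function name `extract_genes`, the study names, and all expression values are hypothetical placeholders chosen for the example.

```python
from typing import Dict, List, Optional

# Hypothetical layout: study name -> {gene symbol -> normalized expression value}.
Dataset = Dict[str, float]


def extract_genes(
    datasets: Dict[str, Dataset], genes: List[str]
) -> Dict[str, Dict[str, Optional[float]]]:
    """Return gene -> {study -> value}; None marks a gene missing from a study."""
    merged: Dict[str, Dict[str, Optional[float]]] = {}
    for gene in genes:
        merged[gene] = {study: data.get(gene) for study, data in datasets.items()}
    return merged


# Toy input with made-up numbers (not taken from any real study).
studies = {
    "study_A": {"ADRB1": 7.2, "ADRB2": 5.1, "GAPDH": 12.0},
    "study_B": {"ADRB1": 6.8, "GAPDH": 11.7},
}
table = extract_genes(studies, ["ADRB1", "ADRB2"])

for gene, row in table.items():
    print(gene, row)
```

    Keeping missing values as `None` rather than dropping the gene makes it visible which studies did not measure a given gene.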

  2. Egg Case Silk Gene Sequences from Argiope Spiders: Evidence for Multiple Loci and a Loss of Function Between Paralogs

    PubMed Central

    Chaw, R. Crystal; Collin, Matthew; Wimmer, Marjorie; Helmrick, Kara-Leigh; Hayashi, Cheryl Y.

    2017-01-01

    Spiders swath their eggs with silk to protect developing embryos and hatchlings. Egg case silks, like other fibrous spider silks, are primarily composed of proteins called spidroins (spidroin = spider-fibroin). Silks, and thus spidroins, are important throughout the lives of spiders, yet the evolution of spidroin genes has been relatively understudied. Spidroin genes are notoriously difficult to sequence because they are typically very long (≥ 10 kb of coding sequence) and highly repetitive. Here, we investigate the evolution of spider silk genes through long-read sequencing of Bacterial Artificial Chromosome (BAC) clones. We demonstrate that the silver garden spider Argiope argentata has multiple egg case spidroin loci with a loss of function at one locus. We also use degenerate PCR primers to search the genomic DNA of congeneric species and find evidence for multiple egg case spidroin loci in other Argiope spiders. Comparative analyses show that these multiple loci are more similar at the nucleotide level within a species than between species. This pattern is consistent with concerted evolution homogenizing gene copies within a genome. More complicated explanations include convergent evolution or recent independent gene duplications within each species. PMID:29127108

  3. Escherichia coli O157:H7 Strain EDL933 Harbors Multiple Functional Prophage-Associated Genes Necessary for the Utilization of 5-N-Acetyl-9-O-Acetyl Neuraminic Acid as a Growth Substrate

    PubMed Central

    Saile, Nadja; Voigt, Anja; Kessler, Sarah; Stressler, Timo; Fischer, Lutz

    2016-01-01

    ABSTRACT Enterohemorrhagic Escherichia coli (EHEC) O157:H7 strain EDL933 harbors multiple prophage-associated open reading frames (ORFs) in its genome which are highly homologous to the chromosomal nanS gene. The latter is part of the nanCMS operon, which is present in most E. coli strains and encodes an esterase which is responsible for the monodeacetylation of 5-N-acetyl-9-O-acetyl neuraminic acid (Neu5,9Ac2). Whereas one prophage-borne ORF (z1466) has been characterized in previous studies, the functions of the other nanS-homologous ORFs are unknown. In the current study, the nanS-homologous ORFs of EDL933 were initially studied in silico. Due to their homology to the chromosomal nanS gene and their location in prophage genomes, we designated them nanS-p and numbered the different nanS-p alleles consecutively from 1 to 10. The two alleles nanS-p2 and nanS-p4 were selected for production of recombinant proteins, their enzymatic activities were investigated, and differences in their temperature optima were found. Furthermore, a function of these enzymes in substrate utilization could be demonstrated using an E. coli C600ΔnanS mutant in a growth medium with Neu5,9Ac2 as the carbon source and supplementation with the different recombinant NanS-p proteins. Moreover, generation of sequential deletions of all nanS-p alleles in strain EDL933 and subsequent growth experiments demonstrated a gene dose effect on the utilization of Neu5,9Ac2. Since Neu5,9Ac2 is an important component of human and animal gut mucus and since the nutrient availability in the large intestine is limited, we hypothesize that the presence of multiple Neu5,9Ac2 esterases provides them a nutrient supply under certain conditions in the large intestine, even if particular prophages are lost. IMPORTANCE In this study, a group of homologous prophage-borne nanS-p alleles and two of the corresponding enzymes of enterohemorrhagic E. coli (EHEC) O157:H7 strain EDL933 that may be important to provide alternative genes for substrate utilization were characterized. PMID:27474715

  4. Phosphite, an analog of phosphate, suppresses the coordinated expression of genes under phosphate starvation.

    PubMed

    Varadarajan, Deepa K; Karthikeyan, Athikkattuvalasu S; Matilda, Paino Durzo; Raghothama, Kashchandra G

    2002-07-01

    Phosphate (Pi) and its analog phosphite (Phi) are acquired by plants via Pi transporters. Although the uptake and mobility of Phi and Pi are similar, there is no evidence suggesting that plants can utilize Phi as a sole source of phosphorus. Phi is also known to interfere with many of the Pi starvation responses in plants and yeast (Saccharomyces cerevisiae). In this study, effects of Phi on plant growth and coordinated expression of genes induced by Pi starvation were analyzed. Phi suppressed many of the Pi starvation responses that are commonly observed in plants. Enhanced root growth and root to shoot ratio, a hallmark of Pi stress response, was strongly inhibited by Phi. The negative effects of Phi were not obvious in plants supplemented with Pi. The expression of Pi starvation-induced genes such as LePT1, LePT2, AtPT1, and AtPT2 (high-affinity Pi transporters); LePS2 (a novel acid phosphatase); LePS3 and TPSI1 (novel genes); and PAP1 (purple acid phosphatase) was suppressed by Phi in plants and cell cultures. Expression of luciferase reporter gene driven by the Pi starvation-induced AtPT2 promoter was also suppressed by Phi. These analyses showed that suppression of Pi starvation-induced genes is an early response to addition of Phi. These data also provide evidence that Phi interferes with gene expression at the level of transcription. Synchronized suppression of multiple Pi starvation-induced genes by Phi points to its action on the early molecular events, probably signal transduction, in Pi starvation response.

  5. Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining.

    PubMed

    Kreula, Sanna M; Kaewphan, Suwisa; Ginter, Filip; Jones, Patrik R

    2018-01-01

    The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious, though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.
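
    The two filter rules in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' algorithm: "independent evidence" is modeled simply as distinct source labels on an edge, and the function name `novel_candidates`, the gene names, and the source labels are all hypothetical.

```python
from collections import defaultdict


def novel_candidates(edges, known_pairs, target, min_sources=2):
    """Return partner -> sorted evidence sources for novel, well-supported links.

    edges: iterable of (gene_a, gene_b, source) tuples (undirected).
    known_pairs: set of frozenset({gene_a, gene_b}) already established.
    """
    support = defaultdict(set)
    for a, b, source in edges:
        pair = frozenset((a, b))
        if len(pair) != 2 or target not in pair:
            continue  # skip self-links and edges not touching the target
        if pair in known_pairs:
            continue  # rule 1: ignore relationships that are already known
        support[pair].add(source)

    result = {}
    for pair, sources in support.items():
        if len(sources) >= min_sources:  # rule 2: require multiple sources
            (partner,) = pair - {target}
            result[partner] = sorted(sources)
    return result


# Toy network with placeholder names.
edges = [
    ("geneT", "geneX", "text_mining"),
    ("geneX", "geneT", "coexpression"),
    ("geneT", "geneY", "text_mining"),
    ("geneT", "geneZ", "text_mining"),
    ("geneT", "geneZ", "coexpression"),
]
known = {frozenset(("geneT", "geneZ"))}
candidates = novel_candidates(edges, known, "geneT")
print(candidates)
```

    In this toy input, geneZ is dropped because the link is already known and geneY is dropped because only one source supports it.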

  6. A Novel Tightly Regulated Gene Expression System for the Human Intestinal Symbiont Bacteroides thetaiotaomicron

    PubMed Central

    Horn, Nikki; Carvalho, Ana L.; Overweg, Karin; Wegmann, Udo; Carding, Simon R.; Stentz, Régis

    2016-01-01

    There is considerable interest in studying the function of Bacteroides species resident in the human gastrointestinal (GI) tract and the contribution they make to host health. Reverse genetics and protein expression techniques, such as those developed for well-characterized Escherichia coli, cannot be applied to Bacteroides species as they and other members of the Bacteroidetes phylum have unique promoter structures. The availability of useful Bacteroides-specific genetic tools is therefore limited. Here we describe the development of an effective mannan-controlled gene expression system for Bacteroides thetaiotaomicron containing the mannan-inducible promoter region of an α-1,2-mannosidase gene (BT_3784), a ribosomal binding site designed to modulate expression, a multiple cloning site to facilitate the cloning of genes of interest, and a transcriptional terminator. Using the Lactobacillus pepI as a reporter gene, mannan induction resulted in an increase of reporter activity in a time- and concentration-dependent manner with a wide range of activity. The endogenous BtcepA cephalosporinase gene was used to demonstrate the suitability of this novel expression system, enabling the isolation of a His-tagged version of BtCepA. We have also shown with experiments performed in mice that the system can be induced in vivo in the presence of an exogenous source of mannan. By enabling the controlled expression of endogenous and exogenous genes in B. thetaiotaomicron this novel inducer-dependent expression system will aid in defining the physiological role of individual genes and the functional analyses of their products. PMID:27468280

  7. Gene prioritization and clustering by multi-view text mining

    PubMed Central

    2010-01-01

    Background Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. Results We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than other comparing methods. Conclusions In practical research, the relevance of specific vocabulary pertaining to the task is usually unknown. In such case, multi-view text mining is a superior and promising strategy for text-based disease gene identification. PMID:20074336

  8. Regulation of nitrogen metabolism by GATA zinc finger transcription factors in Yarrowia lipolytica

    DOE PAGES

    Pomraning, Kyle R.; Bredeweg, Erin L.; Baker, Scott E.; ...

    2017-02-15

    Here, fungi accumulate lipids in a manner dependent on the quantity and quality of the nitrogen source on which they are growing. In the oleaginous yeast Yarrowia lipolytica, growth on a complex source of nitrogen enables rapid growth and limited accumulation of neutral lipids, while growth on a simple nitrogen source promotes lipid accumulation in large lipid droplets. Here we examined the roles of nitrogen catabolite repression and its regulation by GATA zinc finger transcription factors on lipid metabolism in Y. lipolytica. Deletion of the GATA transcription factor genes gzf3 and gzf2 resulted in nitrogen source-specific growth defects and greater accumulation of lipids when the cells were growing on a simple nitrogen source. Deletion of gzf1, which is most similar to activators of genes repressed by nitrogen catabolite repression in filamentous ascomycetes, did not affect growth on the nitrogen sources tested. We examined gene expression of wild-type and GATA transcription factor mutants on simple and complex nitrogen sources and found that expression of enzymes involved in malate metabolism, beta-oxidation, and ammonia utilization are strongly upregulated on a simple nitrogen source. Deletion of gzf3 results in overexpression of genes with GATAA sites in their promoters, suggesting that it acts as a repressor, while gzf2 is required for expression of ammonia utilization genes but does not grossly affect the transcription level of genes predicted to be controlled by nitrogen catabolite repression. Both GATA transcription factor mutants exhibit decreased expression of genes controlled by carbon catabolite repression via the repressor mig1, including genes for beta-oxidation, highlighting the complex interplay between regulation of carbon, nitrogen, and lipid metabolism.

  9. Regulation of nitrogen metabolism by GATA zinc finger transcription factors in Yarrowia lipolytica

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pomraning, Kyle R.; Bredeweg, Erin L.; Baker, Scott E.

    Here, fungi accumulate lipids in a manner dependent on the quantity and quality of the nitrogen source on which they are growing. In the oleaginous yeast Yarrowia lipolytica, growth on a complex source of nitrogen enables rapid growth and limited accumulation of neutral lipids, while growth on a simple nitrogen source promotes lipid accumulation in large lipid droplets. Here we examined the roles of nitrogen catabolite repression and its regulation by GATA zinc finger transcription factors on lipid metabolism in Y. lipolytica. Deletion of the GATA transcription factor genes gzf3 and gzf2 resulted in nitrogen source-specific growth defects and greater accumulation of lipids when the cells were growing on a simple nitrogen source. Deletion of gzf1, which is most similar to activators of genes repressed by nitrogen catabolite repression in filamentous ascomycetes, did not affect growth on the nitrogen sources tested. We examined gene expression of wild-type and GATA transcription factor mutants on simple and complex nitrogen sources and found that expression of enzymes involved in malate metabolism, beta-oxidation, and ammonia utilization are strongly upregulated on a simple nitrogen source. Deletion of gzf3 results in overexpression of genes with GATAA sites in their promoters, suggesting that it acts as a repressor, while gzf2 is required for expression of ammonia utilization genes but does not grossly affect the transcription level of genes predicted to be controlled by nitrogen catabolite repression. Both GATA transcription factor mutants exhibit decreased expression of genes controlled by carbon catabolite repression via the repressor mig1, including genes for beta-oxidation, highlighting the complex interplay between regulation of carbon, nitrogen, and lipid metabolism.

  10. Divergent gene copies in the asexual class Bdelloidea (Rotifera) separated before the bdelloid radiation or within bdelloid families.

    PubMed

    Mark Welch, David B; Cummings, Michael P; Hillis, David M; Meselson, Matthew

    2004-02-10

    Rotifers of the asexual class Bdelloidea are unusual in possessing two or more divergent copies of every gene that has been examined. Phylogenetic analysis of the heat-shock gene hsp82 and the TATA-box-binding protein gene tbp in multiple bdelloid species suggested that for each gene, each copy belonged to one of two lineages that began to diverge before the bdelloid radiation. Such gene trees are consistent with the two lineages having descended from former alleles that began to diverge after meiotic segregation ceased or from subgenomes of an alloploid ancestor of the bdelloids. However, the original analyses of bdelloid gene-copy divergence used only a single outgroup species and were based on parsimony and neighbor joining. We have now used maximum likelihood and Bayesian inference methods and, for hsp82, multiple outgroups in an attempt to produce more robust gene trees. Here we report that the available data do not unambiguously discriminate between gene trees that root the origin of hsp82 and tbp copy divergence before the bdelloid radiation and those which indicate that the gene copies began to diverge within bdelloid families. The remarkable presence of multiple diverged gene copies in individual genomes is nevertheless consistent with the loss of sex in an ancient ancestor of bdelloids.

  11. Gene panel testing for hereditary breast cancer.

    PubMed

    Winship, Ingrid; Southey, Melissa C

    2016-03-21

    Inherited predisposition to breast cancer is explained only in part by mutations in the BRCA1 and BRCA2 genes. Most families with an apparent familial clustering of breast cancer who are investigated through Australia's network of genetic services and familial cancer centres do not have mutations in either of these genes. More recently, additional breast cancer predisposition genes, such as PALB2, have been identified. New genetic technology allows a panel of multiple genes to be tested for mutations in a single test. This enables more women and their families to have risk assessment and risk management, in a preventive approach to predictable breast cancer. Predictive testing for a known family-specific mutation in a breast cancer predisposition gene provides personalised risk assessment and evidence-based risk management. Breast cancer predisposition gene panel tests have a greater diagnostic yield than conventional testing of only the BRCA1 and BRCA2 genes. The clinical validity and utility of some of the putative breast cancer predisposition genes is not yet clear. Ethical issues warrant consideration, as multiple gene panel testing has the potential to identify secondary findings not originally sought by the test requested. Multiple gene panel tests may provide an affordable and effective way to investigate the heritability of breast cancer.

  12. ePlant and the 3D data display initiative: integrative systems biology on the world wide web.

    PubMed

    Fucile, Geoffrey; Di Biase, David; Nahal, Hardeep; La, Garon; Khodabandeh, Shokoufeh; Chen, Yani; Easley, Kante; Christendat, Dinesh; Kelley, Lawrence; Provart, Nicholas J

    2011-01-10

    Visualization tools for biological data are often limited in their ability to interactively integrate data at multiple scales. These computational tools are also typically limited by two-dimensional displays and programmatic implementations that require separate configurations for each of the user's computing devices and recompilation for functional expansion. Towards overcoming these limitations we have developed "ePlant" (http://bar.utoronto.ca/eplant) - a suite of open-source world wide web-based tools for the visualization of large-scale data sets from the model organism Arabidopsis thaliana. These tools display data spanning multiple biological scales on interactive three-dimensional models. Currently, ePlant consists of the following modules: a sequence conservation explorer that includes homology relationships and single nucleotide polymorphism data, a protein structure model explorer, a molecular interaction network explorer, a gene product subcellular localization explorer, and a gene expression pattern explorer. The ePlant's protein structure explorer module represents experimentally determined and theoretical structures covering >70% of the Arabidopsis proteome. The ePlant framework is accessed entirely through a web browser, and is therefore platform-independent. It can be applied to any model organism. To facilitate the development of three-dimensional displays of biological data on the world wide web we have established the "3D Data Display Initiative" (http://3ddi.org).

  13. Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

    PubMed

    Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

    2018-03-01

    Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.

  14. ePlant: Visualizing and Exploring Multiple Levels of Data for Hypothesis Generation in Plant Biology

    PubMed Central

    Waese, Jamie; Fan, Jim; Yu, Hans; Fucile, Geoffrey; Shi, Ruian; Cumming, Matthew; Town, Chris; Stuerzlinger, Wolfgang

    2017-01-01

    A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust research platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring multiple levels of Arabidopsis thaliana data through a zoomable user interface. ePlant connects to several publicly available web services to download genome, proteome, interactome, transcriptome, and 3D molecular structure data for one or more genes or gene products of interest. Data are displayed with a set of visualization tools that are presented using a conceptual hierarchy from big to small, and many of the tools combine information from more than one data type. We describe the development of ePlant in this article and present several examples illustrating its integrative features for hypothesis generation. We also describe the process of deploying ePlant as an “app” on Araport. Building on readily available web services, the code for ePlant is freely available for any other biological species research. PMID:28808136

  15. Bacteriophage P2 ogr and P4 delta genes act independently and are essential for P4 multiplication.

    PubMed Central

    Halling, C; Calendar, R

    1990-01-01

    Satellite bacteriophage P4 requires the products of the late genes of a helper phage such as P2 for lytic growth. Expression of the P2 late genes is positively regulated by the P2 ogr gene in a process requiring P2 DNA replication. Transactivation of P2 late gene expression by P4 requires the P4 delta gene product and works even in the absence of P2 DNA replication. We have made null mutants of the P2 ogr and P4 delta genes. In the absence of the P4 delta gene product, P4 multiplication required both the P2 ogr protein and P2 DNA replication. In the absence of the P2 ogr gene product, P4 multiplication required the P4 delta protein. In complementation experiments, we found that the P2 ogr protein was made in the absence of P2 DNA replication but could not function unless P2 DNA replicated. We produced P4 delta protein from a plasmid and found that it complemented the null P4 delta and P2 ogr mutants. PMID:2193911

  16. A new fast method for inferring multiple consensus trees using k-medoids.

    PubMed

    Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

    2018-04-05

    Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. 
The main advantage of our method is that it is much faster than the existing tree clustering approaches, while providing similar or better clustering results in most cases. This makes it particularly well suited for the analysis of large genomic and phylogenetic datasets.
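
    The clustering step described in this abstract can be illustrated with a minimal k-medoids sketch. It is not the authors' implementation: it assumes a precomputed matrix of pairwise tree distances (for example Robinson-Foulds distances, which is an assumption here), uses hypothetical toy values, and omits the adapted Silhouette and Caliński-Harabasz indices used to choose the number of clusters.

```python
import random


def assign(dist, medoids):
    """Assign every item to its nearest medoid (ties go to the first medoid)."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100, seed=0):
    """Partition items 0..n-1 into k clusters from a symmetric distance matrix."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(dist)), k)
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to the other members of that cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return sorted(medoids), assign(dist, medoids)


# Hypothetical distances between six gene trees forming two obvious groups.
toy_dist = [
    [0, 2, 2, 10, 10, 10],
    [2, 0, 2, 10, 10, 10],
    [2, 2, 0, 10, 10, 10],
    [10, 10, 10, 0, 2, 2],
    [10, 10, 10, 2, 0, 2],
    [10, 10, 10, 2, 2, 0],
]
toy_medoids, toy_clusters = k_medoids(toy_dist, k=2)
toy_groups = sorted(sorted(members) for members in toy_clusters.values())
```

    In this toy case each cluster would then yield its own consensus tree, one per group of gene trees with a similar evolutionary history.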

  17. Nitrogen Cycle Evaluation (NiCE) Chip for the Simultaneous Analysis of Multiple N-Cycle Associated Genes.

    PubMed

    Oshiki, Mamoru; Segawa, Takahiro; Ishii, Satoshi

    2018-02-02

    Various microorganisms play key roles in the nitrogen (N) cycle. Quantitative PCR (qPCR) and PCR-amplicon sequencing of the N cycle functional genes allow us to analyze the abundance and diversity of microbes responsible for the N-transforming reactions in various environmental samples. However, analysis of multiple target genes can be cumbersome and expensive. PCR-independent analysis, such as metagenomics and metatranscriptomics, is useful but expensive, especially when we analyze multiple samples and try to detect N cycle functional genes present at relatively low abundance. Here, we present the application of microfluidic qPCR chip technology to simultaneously quantify and prepare amplicon sequence libraries for multiple N cycle functional genes as well as taxon-specific 16S rRNA gene markers for many samples. This approach, named the N cycle evaluation (NiCE) chip, was evaluated by using DNA from pure and artificially mixed bacterial cultures and by comparing the results with those obtained by conventional qPCR and amplicon sequencing methods. Quantitative results obtained by the NiCE chip were comparable to those obtained by conventional qPCR. In addition, the NiCE chip was successfully applied to examine abundance and diversity of N cycle functional genes in wastewater samples. Although non-specific amplification was detected on the NiCE chip, this could be overcome by optimizing the primer sequences in the future. As the NiCE chip can provide a high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes, this tool should advance our ability to explore N cycling in various samples. Importance. We report a novel approach, namely the Nitrogen Cycle Evaluation (NiCE) chip, which uses microfluidic qPCR chip technology. By sequencing the amplicons recovered from the NiCE chip, we can assess the diversity of the N cycle functional genes. 
The NiCE chip technology can be applied to analyze the temporal dynamics of N cycle gene transcription in wastewater treatment bioreactors. The NiCE chip can provide a high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes. While there is room for future improvement, this tool should significantly advance our ability to explore the N cycle in various environmental samples. Copyright © 2018 American Society for Microbiology.

  18. Metabolic Coevolution in the Bacterial Symbiosis of Whiteflies and Related Plant Sap-Feeding Insects.

    PubMed

    Luan, Jun-Bo; Chen, Wenbo; Hasegawa, Daniel K; Simmons, Alvin M; Wintermantel, William M; Ling, Kai-Shu; Fei, Zhangjun; Liu, Shu-Sheng; Douglas, Angela E

    2015-09-15

    Genomic decay is a common feature of intracellular bacteria that have entered into symbiosis with plant sap-feeding insects. This study of the whitefly Bemisia tabaci and two bacteria (Portiera aleyrodidarum and Hamiltonella defensa) cohoused in each host cell investigated whether the decay of Portiera metabolism genes is complemented by host and Hamiltonella genes, and compared the metabolic traits of the whitefly symbiosis with other sap-feeding insects (aphids, psyllids, and mealybugs). Parallel genomic and transcriptomic analysis revealed that the host genome contributes multiple metabolic reactions that complement or duplicate Portiera function, and that Hamiltonella may contribute multiple cofactors and one essential amino acid, lysine. Homologs of the Bemisia metabolism genes of insect origin have also been implicated in essential amino acid synthesis in other sap-feeding insect hosts, indicative of parallel coevolution of shared metabolic pathways across multiple symbioses. Further metabolism genes coded in the Bemisia genome are of bacterial origin, but phylogenetically distinct from Portiera, Hamiltonella and horizontally transferred genes identified in other sap-feeding insects. Overall, 75% of the metabolism genes of bacterial origin are functionally unique to one symbiosis, indicating that the evolutionary history of metabolic integration in these symbioses is strongly contingent on the pattern of horizontally acquired genes. Our analysis, further, shows that bacteria with genomic decay enable host acquisition of complex metabolic pathways by multiple independent horizontal gene transfers from exogenous bacteria. Specifically, each horizontally acquired gene can function with other genes in the pathway coded by the symbiont, while facilitating the decay of the symbiont gene coding the same reaction. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. Overlooked Short Toxin-Like Proteins: A Shortcut to Drug Design

    PubMed Central

    Linial, Michal

    2017-01-01

    Short stable peptides have huge potential for novel therapies and biosimilars. Cysteine-rich short proteins are characterized by multiple disulfide bridges in a compact structure. Many of these metazoan proteins are processed, folded, and secreted as soluble stable folds. These properties are shared by both marine and terrestrial animal toxins. These stable short proteins are promising sources for new drug development. We developed ClanTox (classifier of animal toxins) to identify toxin-like proteins (TOLIPs) using machine learning models trained on a large-scale proteomic database. Insects proteomes provide a rich source for protein innovations. Therefore, we seek overlooked toxin-like proteins from insects (coined iTOLIPs). Out of 4180 short (<75 amino acids) secreted proteins, 379 were predicted as iTOLIPs with high confidence, with as many as 30% of the genes marked as uncharacterized. Based on bioinformatics, structure modeling, and data-mining methods, we found that the most significant group of predicted iTOLIPs carry antimicrobial activity. Among the top predicted sequences were 120 termicin genes from termites with antifungal properties. Structural variations of insect antimicrobial peptides illustrate the similarity to a short version of the defensin fold with antifungal specificity. We also identified 9 proteins that strongly resemble ion channel inhibitors from scorpion and conus toxins. Furthermore, we assigned functional fold to numerous uncharacterized iTOLIPs. We conclude that a systematic approach for finding iTOLIPs provides a rich source of peptides for drug design and innovative therapeutic discoveries. PMID:29109389

  20. Genetic Diversity of Bactrocera dorsalis (Diptera: Tephritidae) on the Hawaiian Islands: Implications for an Introduction Pathway Into California.

    PubMed

    Barr, Norman B; Ledezma, Lisa A; Leblanc, Luc; San Jose, Michael; Rubinoff, Daniel; Geib, Scott M; Fujita, Brian; Bartels, David W; Garza, Daniel; Kerr, Peter; Hauser, Martin; Gaimari, Stephen

    2014-10-01

    Population genetic diversity of the oriental fruit fly, Bactrocera dorsalis (Hendel), on the Hawaiian islands of Oahu, Maui, Kauai, and Hawaii (the Big Island) was estimated using DNA sequences of the mitochondrial cytochrome c oxidase subunit I gene. In total, 932 flies representing 36 sampled sites across the four islands were sequenced for a 1,500-bp fragment of the gene named the C1500 marker. Genetic variation was low on the Hawaiian Islands with >96% of flies having just two haplotypes: C1500-Haplotype 1 (63.2%) or C1500-Haplotype 2 (33.3%). The other 33 flies (3.5%) had haplotypes similar to the two dominant haplotypes. No population structure was detected among the islands or within islands. The two haplotypes were present at similar frequencies at each sample site, suggesting that flies on the various islands can be considered one population. Comparison of the Hawaiian data set to DNA sequences of 165 flies from outbreaks in California between 2006 and 2012 indicates that a single-source introduction pathway of Hawaiian origin cannot explain many of the flies in California. Hawaii, however, could not be excluded as a maternal source for 69 flies. There was no clear geographic association for Hawaiian or non-Hawaiian haplotypes in the Bay Area or Los Angeles Basin over time. This suggests that California experienced multiple, independent introductions from different sources. © 2014 Entomological Society of America.

  1. The Role of Multiple Transcription Factors In Archaeal Gene Expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Charles J. Daniels

    2008-09-23

    Since the inception of this research program, the project has focused on two central questions: What is the relationship between the 'eukaryal-like' transcription machinery of archaeal cells and its counterparts in eukaryal cells? And, how does the archaeal cell control gene expression using its mosaic of eukaryal core transcription machinery and its bacterial-like transcription regulatory proteins? During the grant period we have addressed these questions using a variety of in vivo approaches and have sought to specifically define the roles of the multiple TATA binding protein (TBP) and TFIIB-like (TFB) proteins in controlling gene expression in Haloferax volcanii. H. volcanii was initially chosen as a model for the Archaea based on the availability of suitable genetic tools; however, later studies showed that all haloarchaea possessed multiple tbp and tfb genes, which led to the proposal that multiple TBP and TFB proteins may function in a manner similar to alternative sigma factors in bacterial cells. In vivo transcription and promoter analysis established a clear relationship between the promoter requirements of haloarchaeal genes and those of the eukaryal RNA polymerase II promoter. Studies on heat shock gene promoters, and the demonstration that specific tfb genes were induced by heat shock, provided the first indication that TFB proteins may direct expression of specific gene families. The construction of strains lacking tbp or tfb genes, coupled with the finding that many of these genes are differentially expressed under varying growth conditions, provided further support for this model. Genetic tools were also developed that led to the construction of insertion and deletion mutants, and a novel gene expression scheme was designed that allowed the controlled expression of these genes in vivo. 
More recent studies have used a whole genome array to examine the expression of these genes and we have established a linkage between the expression of specific tfb genes and the regulation of nitrogen metabolism and other global cellular responses.

  2. The genome and phenome of the green alga Chloroidium sp. UTEX 3007 reveal adaptive traits for desert acclimatization.

    PubMed

    Nelson, David R; Khraiwesh, Basel; Fu, Weiqi; Alseekh, Saleh; Jaiswal, Ashish; Chaiboonchoe, Amphun; Hazzouri, Khaled M; O'Connor, Matthew J; Butterfoss, Glenn L; Drou, Nizar; Rowe, Jillian D; Harb, Jamil; Fernie, Alisdair R; Gunsalus, Kristin C; Salehi-Ashtiani, Kourosh

    2017-06-17

    To investigate the phenomic and genomic traits that allow green algae to survive in deserts, we characterized a ubiquitous species, Chloroidium sp. UTEX 3007, which we isolated from multiple locations in the United Arab Emirates (UAE). Metabolomic analyses of Chloroidium sp. UTEX 3007 indicated that the alga accumulates a broad range of carbon sources, including several desiccation tolerance-promoting sugars and unusually large stores of palmitate. Growth assays revealed capacities to grow in salinities from zero to 60 g/L and to grow heterotrophically on >40 distinct carbon sources. Assembly and annotation of genomic reads yielded a 52.5 Mbp genome with 8153 functionally annotated genes. Comparison with other sequenced green algae revealed unique protein families involved in osmotic stress tolerance and saccharide metabolism that support phenomic studies. Our results reveal the robust and flexible biology utilized by a green alga to successfully inhabit a desert coastline.

  3. Microbial ecology of extreme environments: Antarctic dry valley yeasts and growth in substrate limited habitats

    NASA Technical Reports Server (NTRS)

    Vishniac, H. S.

    1981-01-01

    The multiple stresses of the Dry Valley Antarctica soils (temperature, moisture, and, for chemoheterotrophs, sources of carbon and energy) allow at best depauperate communities, low in species diversity and population density. The nature of community structure, the operation of biogeochemical cycles, and the evolution and mechanisms of adaptation to this habitat are of interest in informing speculations upon life on other planets as well as in modeling the limits of gene life. Yeasts of the Cryptococcus vishniacii complex (Basidioblastomycetes) are investigated, as the only known indigenes of the most hostile, lichen-free parts of the Dry Valleys. Methods were developed for isolating these yeasts (methods which do not exclude the recovery of other microbiota). The definition of the complex was refined, and the importance of nitrogen sources, as well as of substrate competition, in fitness to the Dry Valley habitats was established.

  4. A Robust CRISPR/Cas9 System for Convenient, High-Efficiency Multiplex Genome Editing in Monocot and Dicot Plants.

    PubMed

    Ma, Xingliang; Zhang, Qunyu; Zhu, Qinlong; Liu, Wei; Chen, Yan; Qiu, Rong; Wang, Bin; Yang, Zhongfang; Li, Heying; Lin, Yuru; Xie, Yongyao; Shen, Rongxin; Chen, Shuifu; Wang, Zhi; Chen, Yuanling; Guo, Jingxin; Chen, Letian; Zhao, Xiucai; Dong, Zhicheng; Liu, Yao-Guang

    2015-08-01

    CRISPR/Cas9 genome targeting systems have been applied to a variety of species. However, most CRISPR/Cas9 systems reported for plants can only modify one or a few target sites. Here, we report a robust CRISPR/Cas9 vector system, utilizing a plant codon optimized Cas9 gene, for convenient and high-efficiency multiplex genome editing in monocot and dicot plants. We designed PCR-based procedures to rapidly generate multiple sgRNA expression cassettes, which can be assembled into the binary CRISPR/Cas9 vectors in one round of cloning by Golden Gate ligation or Gibson Assembly. With this system, we edited 46 target sites in rice with an average 85.4% rate of mutation, mostly in biallelic and homozygous status. We reasoned that about 16% of the homozygous mutations in rice were generated through the non-homologous end-joining mechanism followed by homologous recombination-based repair. We also obtained uniform biallelic, heterozygous, homozygous, and chimeric mutations in Arabidopsis T1 plants. The targeted mutations in both rice and Arabidopsis were heritable. We provide examples of loss-of-function gene mutations in T0 rice and T1 Arabidopsis plants by simultaneous targeting of multiple (up to eight) members of a gene family, multiple genes in a biosynthetic pathway, or multiple sites in a single gene. This system has provided a versatile toolbox for studying functions of multiple genes and gene families in plants for basic research and genetic improvement. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.

  5. Fractional populations in multiple gene inheritance.

    PubMed

    Chung, Myung-Hoon; Kim, Chul Koo; Nahm, Kyun

    2003-01-22

    With complete knowledge of the human genome sequence, one of the most interesting tasks remaining is to understand the functions of individual genes and how they communicate. Using the information about genes (locus, allele, mutation rate, fitness, etc.), we attempt to explain population demographic data. This population evolution study could complement and enhance biologists' understanding about genes. We present a general approach to study population genetics in complex situations. In the present approach, multiple allele inheritance, multiple loci inheritance, natural selection and mutations are allowed simultaneously in order to consider a more realistic situation. A simulation program is presented so that readers can readily carry out studies with their own parameters. It is shown that the multiplicity of the loci greatly affects the demographic results of fractional population ratios. Furthermore, the study indicates that some high infant mortality rates due to congenital anomalies can be attributed to multiple loci inheritance. The simulation program can be downloaded from http://won.hongik.ac.kr/~mhchung/index_files/yapop.htm. In order to run this program, one needs Visual Studio.NET platform, which can be downloaded from http://msdn.microsoft.com/netframework/downloads/default.asp.
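
    The kind of recursion such a simulation iterates can be sketched as follows. This is not the authors' program: it is a deliberately reduced, deterministic model assuming a single recessive deleterious allele per locus, fitnesses 1, 1 and 1 - s, one-way mutation at rate mu, and independent loci with identical allele frequencies. All function names and parameter values are hypothetical.

```python
def next_allele_freq(q, s, mu):
    """Advance the frequency q of a recessive deleterious allele by one generation.

    Assumed genotype fitnesses: AA = 1, Aa = 1, aa = 1 - s.
    Assumed mutation: A -> a at rate mu per generation (no back mutation).
    """
    p = 1.0 - q
    mean_fitness = 1.0 - s * q * q
    q_after_selection = (p * q + (1.0 - s) * q * q) / mean_fitness
    return q_after_selection + mu * (1.0 - q_after_selection)


def equilibrium_freq(s, mu, q0=0.01, generations=20000):
    """Iterate the recursion long enough to approach mutation-selection balance."""
    q = q0
    for _ in range(generations):
        q = next_allele_freq(q, s, mu)
    return q


def affected_fraction(q, n_loci):
    """Fraction of individuals homozygous recessive at one or more of n_loci
    independent loci, each carrying the allele at frequency q."""
    return 1.0 - (1.0 - q * q) ** n_loci


# Hypothetical parameters: lethal recessive (s = 1), mutation rate 1e-6.
q_eq = equilibrium_freq(s=1.0, mu=1e-6)
one_locus = affected_fraction(q_eq, 1)
ten_loci = affected_fraction(q_eq, 10)
```

    Under these assumptions the affected fraction grows roughly in proportion to the number of loci, which is one simple way to see why the multiplicity of loci changes the fractional population ratios.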

  6. Fusagene vectors: a novel strategy for the expression of multiple genes from a single cistron.

    PubMed

    Gäken, J; Jiang, J; Daniel, K; van Berkel, E; Hughes, C; Kuiper, M; Darling, D; Tavassoli, M; Galea-Lauri, J; Ford, K; Kemeny, M; Russell, S; Farzaneh, F

    2000-12-01

    Transduction of cells with multiple genes, allowing their stable and co-ordinated expression, is difficult with the available methodologies. A method has been developed for expression of multiple gene products, as fusion proteins, from a single cistron. The encoded proteins are post-synthetically cleaved and processed into each of their constituent proteins as individual, biologically active factors. Specifically, linkers encoding cleavage sites for the Golgi-expressed endoprotease, furin, have been incorporated between in-frame cDNA sequences encoding different secreted or membrane-bound proteins. With this strategy we have developed expression vectors encoding multiple proteins (IL-2 and B7.1, IL-4 and B7.1, IL-4 and IL-2, IL-12 p40 and p35, and IL-12 p40, p35 and IL-2). Transduction and analysis of over 100 individual clones, derived from murine and human tumour cell lines, demonstrate the efficient expression and biological activity of each of the encoded proteins. Fusagene vectors enable the co-ordinated expression of multiple gene products from a single, monocistronic, expression cassette.

  7. Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa

    PubMed Central

    Doddapaneni, Harshavardhan; Yao, Jiqiang; Lin, Hong; Walker, M Andrew; Civerolo, Edwin L

    2006-01-01

    Background The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10⁻² per base pair of DNA and the average INDEL frequency was 2.06 × 10⁻² per base pair of DNA. On average, 60.33% of the SNPs were of the synonymous type while 39.67% were of the non-synonymous type. The mutation frequency, primarily in the form of external INDELs, was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A subset of the strain-specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage-related functions. 
Conclusion INDELs and strain specific genes have been identified as the main source of variations among strains, with individual strains showing different rates of genome evolution. Based on these genome comparisons, it appears that the Pierce's disease strain Temecula-1 genome represents the ancestral genome of the X. fastidiosa. Results of this analysis are publicly available in the form of a web database. PMID:16948851

  8. An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

    PubMed

    Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

    2014-06-01

    In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. 
Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
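
    Of the algorithms named in this abstract, random walk with restart is the easiest to illustrate. The sketch below is a generic textbook version, not the authors' code: the five-gene chain network, the restart probability of 0.3 and all names are hypothetical, and it assumes every gene has at least one network neighbour.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score all genes by their steady-state visiting probability.

    adj   : symmetric adjacency matrix (list of lists) of a functional network
    seeds : set of indices of genes already associated with the disease
    """
    n = len(adj)
    # Column-normalise so that trans[i][j] is the probability of stepping j -> i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # The walker restarts uniformly over the known disease genes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical chain network gene0 - gene1 - gene2 - gene3 - gene4,
# with gene0 as the only known disease gene.
toy_adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_adj, seeds={0})
# Candidate genes (non-seeds) ranked from most to least promising.
ranking = sorted((g for g in range(5) if g != 0), key=lambda g: -scores[g])
```

    Candidates closer to the seed gene in the network receive higher scores, which is the guilt-by-association intuition that the kernelized score functions in the abstract generalize.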

  9. Genetic redundancy and persistence of plasmid-mediated trimethoprim/sulfamethoxazole resistant effluent and stream water Escherichia coli.

    PubMed

    Suhartono, Suhartono; Savin, Mary; Gbur, Edward E

    2016-10-15

    Antibiotic resistant bacteria may persist in effluent-receiving surface water in the presence of low (sub-inhibitory) antibiotic concentrations if the bacteria possess multiple genes encoding resistance to the same antibiotic. This redundancy of antibiotic resistance genes may occur in plasmids harboring conjugation and mobilization (mob) and integrase (intI) genes. Plasmids extracted from 76 sulfamethoxazole-trimethoprim resistant Escherichia coli originally isolated from effluent and an effluent-receiving stream were used as DNA template to identify sulfamethoxazole (sul) and trimethoprim (dfr) resistance genes and to detect the presence of intI and mob genes using PCR. Sulfamethoxazole and trimethoprim resistance was plasmid-mediated, with three sul genes (sul1, sul2, and sul3) and four dfr genes (dfrA12, dfrA8, dfrA17, and dfrA1) the most prevalently detected. Approximately half of the plasmids carried class 1 and/or 2 integrons and, although unrelated, half were also transmissible. Sampling site in relationship to effluent input significantly affected the number of intI and mob but not the number of sul and dfr genes. In the presence of low (sub-inhibitory) sulfamethoxazole concentration, isolates persisted regardless of integron and mobilization gene designation, whereas in the presence of trimethoprim, the presence of both integron and mobilization genes made isolates less persistent than in the absence of both or the presence of a gene from either group individually. Regardless, isolates persisted in large concentrations throughout the experiment. Treated effluent containing antibiotic resistant bacteria may be an important source of integrase and mobilization genes into the stream environment. Sulfamethoxazole-trimethoprim resistant bacteria may have a high degree of genetic redundancy and diversity carrying resistance to each antibiotic, although the role of integrase and mobilization genes towards persistence is unclear. 
Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Simultaneous knockdown of six non-family genes using a single synthetic RNAi fragment in Arabidopsis thaliana

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Czarnecki, Olaf; Bryan, Anthony C.; Jawdy, Sara S.

    Genetic engineering of plants that results in successful establishment of new biochemical or regulatory pathways requires stable introduction of one or more genes into the plant genome. It might also be necessary to down-regulate or turn off expression of endogenous genes in order to reduce activity of competing pathways. An established way to knock down gene expression in plants is expressing a hairpin-RNAi construct, eventually leading to degradation of a specifically targeted mRNA. Knockdown of multiple genes that do not share homologous sequences is still challenging and involves either sophisticated cloning strategies to create vectors with different serial expression constructs or multiple transformation events, which are often restricted by a lack of available transformation markers. Synthetic RNAi fragments were assembled in yeast carrying homologous sequences to six or seven non-family genes and introduced into pAGRIKOLA. Transformation of Arabidopsis thaliana and subsequent expression analysis of targeted genes proved efficient knockdown of all target genes. In conclusion, we present a simple and cost-effective method to create constructs to simultaneously knock down multiple non-family genes or genes that do not share sequence homology. The presented method can be applied in plant and animal synthetic biology as well as traditional plant and animal genetic engineering.

  11. dbCPG: A web resource for cancer predisposition genes

    PubMed Central

    Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng

    2016-01-01

    Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the understanding of the CPGs data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes. PMID:27192119

  12. Simultaneous knockdown of six non-family genes using a single synthetic RNAi fragment in Arabidopsis thaliana

    DOE PAGES

    Czarnecki, Olaf; Bryan, Anthony C.; Jawdy, Sara S.; ...

    2016-02-17

    Genetic engineering of plants that results in successful establishment of new biochemical or regulatory pathways requires stable introduction of one or more genes into the plant genome. It might also be necessary to down-regulate or turn off expression of endogenous genes in order to reduce activity of competing pathways. An established way to knockdown gene expression in plants is expressing a hairpin-RNAi construct, eventually leading to degradation of a specifically targeted mRNA. Knockdown of multiple genes that do not share homologous sequences is still challenging and involves either sophisticated cloning strategies to create vectors with different serial expression constructs or multiple transformation events that is often restricted by a lack of available transformation markers. Synthetic RNAi fragments were assembled in yeast carrying homologous sequences to six or seven non-family genes and introduced into pAGRIKOLA. Transformation of Arabidopsis thaliana and subsequent expression analysis of targeted genes proved efficient knockdown of all target genes. In conclusion, we present a simple and cost-effective method to create constructs to simultaneously knockdown multiple non-family genes or genes that do not share sequence homology. The presented method can be applied in plant and animal synthetic biology as well as traditional plant and animal genetic engineering.

  13. Simultaneous gut colonisation and infection by ESBL-producing Escherichia coli in hospitalised patients.

    PubMed

    Asir, Johny; Nair, Shashikala; Devi, Sheela; Prashanth, Kenchappa; Saranathan, Rajagopalan; Kanungo, Reba

    2015-01-01

    Extended spectrum betalactamase (ESBL)-producing organisms are a major cause of hospital-acquired infections. ESBL-producing Escherichia coli (E. coli) have been recovered from the hospital environment. These drug-resistant organisms have also been found to be present in humans as commensals. The present investigation intended to isolate ESBL-producing E. coli from the gut of already infected patients; to date, only a few studies have shown evidence of the gut microflora as a major source of infection. This study aimed to detect the presence of ESBL genes in E. coli that are isolated from the gut of patients who have already been infected with the same organism. A total of 70 non-repetitive faecal samples were collected from in-patients of our hospital. These in-patients were clinically diagnosed and were culture-positive for ESBL-producing E. coli either from blood, urine, or pus. Standard microbiological methods were used to detect ESBL from clinical and gut isolates. Genes coding for major betalactamase enzymes such as bla CTX-M, bla TEM, and bla SHV were investigated by polymerase chain reaction (PCR). ESBL-producing E. coli was isolated from 15 (21 per cent) faecal samples of the 70 samples that were cultured. PCR revealed that out of these 15 isolates, the bla CTX-M gene was found in 13 (86.6 per cent) isolates, the bla TEM was present in 11 (73.3 per cent) isolates, and bla SHV only in eight (53.3 per cent) isolates. All 15 clinical and gut isolates had similar phenotypic characters and eight of the 15 patients had similar pattern of genes (bla TEM, bla CTX-M, and bla SHV) in their clinical and gut isolates. Strains with multiple betalactamase genes that colonise the gut of hospitalised patients are a potential threat and may be a potential source of infection.
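The proportions quoted in this abstract can be rechecked directly from the reported counts. The following is a minimal sketch in Python; the helper name `percent` and the variable names are illustrative and do not come from the study.

```python
def percent(count, total):
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total

# Counts as reported in the abstract.
FAECAL_SAMPLES = 70   # faecal samples cultured
ESBL_ISOLATES = 15    # samples yielding ESBL-producing E. coli
gene_counts = {"bla CTX-M": 13, "bla TEM": 11, "bla SHV": 8}

# Share of faecal samples that yielded an ESBL-producing isolate.
isolation_rate = percent(ESBL_ISOLATES, FAECAL_SAMPLES)

# Share of the 15 isolates carrying each gene.
gene_rates = {gene: percent(n, ESBL_ISOLATES) for gene, n in gene_counts.items()}

for gene, rate in gene_rates.items():
    print(f"{gene}: {rate:.1f} per cent of isolates")
```

Note that 13 of 15 is about 86.67 per cent, so the abstract's 86.6 appears to be truncated rather than rounded; the other figures match to one decimal place.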

  14. Single Cell Genome Amplification Accelerates Identification of the Apratoxin Biosynthetic Pathway from a Complex Microbial Assemblage

    PubMed Central

    Grindberg, Rashel V.; Ishoey, Thomas; Brinza, Dumitru; Esquenazi, Eduardo; Coates, R. Cameron; Liu, Wei-ting; Gerwick, Lena; Dorrestein, Pieter C.; Pevzner, Pavel; Lasken, Roger; Gerwick, William H.

    2011-01-01

    Filamentous marine cyanobacteria are extraordinarily rich sources of structurally novel, biomedically relevant natural products. To understand their biosynthetic origins as well as produce increased supplies and analog molecules, access to the clustered biosynthetic genes that encode for the assembly enzymes is necessary. Complicating these efforts is the universal presence of heterotrophic bacteria in the cell wall and sheath material of cyanobacteria obtained from the environment and those grown in uni-cyanobacterial culture. Moreover, the high similarity in genetic elements across disparate secondary metabolite biosynthetic pathways renders imprecise current gene cluster targeting strategies and contributes sequence complexity resulting in partial genome coverage. Thus, it was necessary to use a dual-method approach of single-cell genomic sequencing based on multiple displacement amplification (MDA) and metagenomic library screening. Here, we report the identification of the putative biosynthetic gene cluster for apratoxin A, a potent cancer cell cytotoxin with promise for medicinal applications. The roughly 58 kb biosynthetic gene cluster is composed of 12 open reading frames and has a type I modular mixed polyketide synthase/nonribosomal peptide synthetase (PKS/NRPS) organization and features loading and off-loading domain architecture never previously described. Moreover, this work represents the first successful isolation of a complete biosynthetic gene cluster from Lyngbya bouillonii, a tropical marine cyanobacterium renowned for its production of diverse bioactive secondary metabolites. PMID:21533272

  15. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

    PubMed

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H

    2015-07-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. NF-κB-Dependent Lymphoid Enhancer Co-option Promotes Renal Carcinoma Metastasis.

    PubMed

    Rodrigues, Paulo; Patel, Saroor A; Harewood, Louise; Olan, Ioana; Vojtasova, Erika; Syafruddin, Saiful E; Zaini, M Nazhif; Richardson, Emma K; Burge, Johanna; Warren, Anne Y; Stewart, Grant D; Saeb-Parsy, Kourosh; Samarajiwa, Shamith A; Vanharanta, Sakari

    2018-06-06

    Metastases, the spread of cancer cells to distant organs, cause the majority of cancer-related deaths. Few metastasis-specific driver mutations have been identified, suggesting aberrant gene regulation as a source of metastatic traits. However, how metastatic gene expression programs arise is poorly understood. Here, using human-derived metastasis models of renal cancer, we identify transcriptional enhancers that promote metastatic carcinoma progression. Specific enhancers and enhancer clusters are activated in metastatic cancer cell populations, and the associated gene expression patterns are predictive of poor patient outcome in clinical samples. We find that the renal cancer metastasis-associated enhancer complement consists of multiple coactivated tissue-specific enhancer modules. Specifically, we identify and functionally characterize a coregulatory enhancer cluster, activated by the renal cancer driver HIF2A and an NF-κB-driven lymphoid element, as a mediator of metastasis in vivo. We conclude that oncogenic pathways can acquire metastatic phenotypes through cross-lineage co-option of physiologic epigenetic enhancer states. SIGNIFICANCE: Renal cancer is associated with significant mortality due to metastasis. We show that in metastatic renal cancer, functionally important metastasis genes are activated via co-option of gene regulatory enhancer modules from distant developmental lineages, thus providing clues to the origins of metastatic cancer. Cancer Discov; 8(7); 1-16. ©2018 AACR. ©2018 American Association for Cancer Research.

  17. Statistical mechanical model of coupled transcription from multiple promoters due to transcription factor titration

    PubMed Central

    Rydenfelt, Mattias; Cox, Robert Sidney; Garcia, Hernan; Phillips, Rob

    2014-01-01

    Transcription factors (TFs) with regulatory action at multiple promoter targets are the rule rather than the exception, with examples ranging from the cAMP receptor protein (CRP) in E. coli that regulates hundreds of different genes simultaneously to situations involving multiple copies of the same gene, such as plasmids, retrotransposons, or highly replicated viral DNA. When the number of TFs heavily exceeds the number of binding sites, TF binding to each promoter can be regarded as independent. However, when the number of TF molecules is comparable to the number of binding sites, TF titration will result in correlation (“promoter entanglement”) between transcription of different genes. We develop a statistical mechanical model which takes the TF titration effect into account and use it to predict both the level of gene expression for a general set of promoters and the resulting correlation in transcription rates of different genes. Our results show that the TF titration effect could be important for understanding gene expression in many regulatory settings. PMID:24580252
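The titration effect described in this abstract can be illustrated with a small counting model. The sketch below is a toy built on stated assumptions, not the authors' published equations, which the abstract does not give. It distributes `n_tf` TF molecules between `n_sites` identical specific binding sites and a nonspecific background of `n_ns` sites, with a binding energy `delta_eps` (in units of kT, negative meaning favourable) for the specific sites. All function names and parameter values are hypothetical.

```python
import math

def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def mean_occupancy(n_tf, n_sites, n_ns, delta_eps):
    """Mean fraction of specific sites occupied in the toy model.

    A state with k specific sites bound has statistical weight
    C(n_sites, k) * C(n_ns, n_tf - k) * exp(-k * delta_eps):
    ways to choose the bound sites, ways to place the remaining
    TFs on the nonspecific background, and the Boltzmann factor.
    """
    k_max = min(n_tf, n_sites)
    log_w = [
        log_binom(n_sites, k) + log_binom(n_ns, n_tf - k) - k * delta_eps
        for k in range(k_max + 1)
    ]
    shift = max(log_w)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(lw - shift) for lw in log_w]
    partition = sum(weights)
    mean_bound = sum(k * w for k, w in enumerate(weights)) / partition
    return mean_bound / n_sites

# Hypothetical parameters: 10 TF copies, 5 million nonspecific sites.
single_site = mean_occupancy(10, 1, 5_000_000, -15.0)
many_sites = mean_occupancy(10, 50, 5_000_000, -15.0)
print(f"occupancy with 1 site:   {single_site:.3f}")
print(f"occupancy with 50 sites: {many_sites:.3f}")
```

With ten TF copies and fifty competing sites, no more than a fifth of the sites can be occupied at once, so per-site occupancy falls below the single-site value. This is the regime in which binding at one promoter is no longer independent of the others.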

  18. Systems Biophysics of Gene Expression

    PubMed Central

    Vilar, Jose M.G.; Saiz, Leonor

    2013-01-01

    Gene expression is a process central to any form of life. It involves multiple temporal and functional scales that extend from specific protein-DNA interactions to the coordinated regulation of multiple genes in response to intracellular and extracellular changes. This diversity in scales poses fundamental challenges to the use of traditional approaches to fully understand even the simplest gene expression systems. Recent advances in computational systems biophysics have provided promising avenues to reliably integrate the molecular detail of biophysical process into the system behavior. Here, we review recent advances in the description of gene regulation as a system of biophysical processes that extend from specific protein-DNA interactions to the combinatorial assembly of nucleoprotein complexes. There is now basic mechanistic understanding on how promoters controlled by multiple, local and distal, DNA binding sites for transcription factors can actively control transcriptional noise, cell-to-cell variability, and other properties of gene regulation, including precision and flexibility of the transcriptional responses. PMID:23790365

  19. Activation of silenced cytokine gene promoters by the synergistic effect of TBP-TALE and VP64-TALE activators.

    PubMed

    Anthony, Kim; More, Abhijit; Zhang, Xiaoliu

    2014-01-01

    Recent work has shown that the combinatorial use of multiple TALE activators can selectively activate certain cellular genes in inaccessible chromatin regions. In this study, we aimed to interrogate the activation potential of TALEs upon transcriptionally silenced immune genes in the context of non-immune cells. We designed a unique strategy, in which a single TALE fused to the TATA-box binding protein (TBP-TALE) is coupled with multiple VP64-TALE activators. We found that our strategy is significantly more potent than multiple TALE activators alone in activating expression of IL-2 and GM-CSF in diverse cell origins in which both genes are otherwise completely silenced. Chromatin analysis revealed that the gene activation was due in part to displacement of a distinctly positioned nucleosome. These studies provide a novel epigenetic mechanism for artificial gene induction and have important implications for targeted cancer immunotherapy, DNA vaccine development, as well as rational design of TALE activators.

  20. Activation of Silenced Cytokine Gene Promoters by the Synergistic Effect of TBP-TALE and VP64-TALE Activators

    PubMed Central

    Anthony, Kim; More, Abhijit; Zhang, Xiaoliu

    2014-01-01

    Recent work has shown that the combinatorial use of multiple TALE activators can selectively activate certain cellular genes in inaccessible chromatin regions. In this study, we aimed to interrogate the activation potential of TALEs upon transcriptionally silenced immune genes in the context of non-immune cells. We designed a unique strategy, in which a single TALE fused to the TATA-box binding protein (TBP-TALE) is coupled with multiple VP64-TALE activators. We found that our strategy is significantly more potent than multiple TALE activators alone in activating expression of IL-2 and GM-CSF in diverse cell origins in which both genes are otherwise completely silenced. Chromatin analysis revealed that the gene activation was due in part to displacement of a distinctly positioned nucleosome. These studies provide a novel epigenetic mechanism for artificial gene induction and have important implications for targeted cancer immunotherapy, DNA vaccine development, as well as rational design of TALE activators. PMID:24755922

  1. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline

    PubMed Central

    2013-01-01

    Background: As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. Results: We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further apply multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. Conclusions: The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application.
All source files for simulation and real data are available on the author’s publication website. PMID:24359104

  2. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline.

    PubMed

    Chang, Lun-Ching; Lin, Hui-Min; Sibille, Etienne; Tseng, George C

    2013-12-21

    As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further apply multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. 
All source files for simulation and real data are available on the author's publication website.
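The difference between the hypothesis settings can be made concrete with two standard p-value combination rules. The abstract does not name the 12 methods compared, so the choice of Fisher's method and the max-P rule here is an illustration only, and the function names are hypothetical. Fisher's method responds to a strong signal in any one study (in the spirit of HS(B)), while max-P requires every study to show an effect (in the spirit of HS(A)). Both assume independent studies.

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function for an even number of degrees of freedom.

    For df = 2k the closed form is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = df // 2
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) follows chi-square with 2k df under the null."""
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(stat, 2 * len(p_values))

def max_p_combined(p_values):
    """Max-P rule: under the null, P(max p <= t) = t^k for k independent studies."""
    return max(p_values) ** len(p_values)

# Hypothetical gene: significant in one study, not in the other.
study_p_values = [0.01, 0.5]
fisher_p = fisher_combined_p(study_p_values)
max_p = max_p_combined(study_p_values)
print(f"Fisher combined p: {fisher_p:.4f}")
print(f"max-P combined p:  {max_p:.4f}")
```

For this made-up gene, Fisher's method yields a small combined p-value while max-P does not, which is why the choice of method depends on whether an effect in one study or in all studies is the target.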

  3. Gene expression profiling of bone marrow mesenchymal stem cells from Osteogenesis Imperfecta patients during osteoblast differentiation.

    PubMed

    Kaneto, Carla Martins; Pereira Lima, Patrícia S; Prata, Karen Lima; Dos Santos, Jane Lima; de Pina Neto, João Monteiro; Panepucci, Rodrigo Alexandre; Noushmehr, Houtan; Covas, Dimas Tadeu; de Paula, Francisco José Alburquerque; Silva, Wilson Araújo

    2017-06-01

    Mesenchymal stem cells (MSCs) are precursors present in adult bone marrow that are able to differentiate into osteoblasts, adipocytes and chondroblasts that have gained great importance as a source for cell therapy. Recently, a number of studies involving the analysis of gene expression of undifferentiated MSCs and of MSCs in the differentiation into multiple lineage processes were observed but there is no information concerning the gene expression of MSCs from Osteogenesis Imperfecta (OI) patients. Osteogenesis Imperfecta is characterized as a genetic disorder in which a generalized osteopenia leads to excessive bone fragility and severe bone deformities. The aim of this study was to analyze gene expression profile during osteogenic differentiation from BMMSCs (Bone Marrow Mesenchymal Stem Cells) obtained from patients with Osteogenesis Imperfecta and from control subjects. Bone marrow samples were collected from three normal subjects and five patients with OI. Mononuclear cells were isolated for obtaining mesenchymal cells that had been expanded until osteogenic differentiation was induced. RNA was harvested at seven time points during the osteogenic differentiation period (D0, D+1, D+2, D+7, D+12, D+17 and D+21). Gene expression analysis was performed by the microarray technique and identified several differentially expressed genes. Some important genes for osteoblast differentiation had lower expression in OI patients, suggesting a smaller commitment of these patients' MSCs with the osteogenic lineage. Other genes also had their differential expression confirmed by RT-qPCR. An increase in the expression of genes related to adipocytes was observed, suggesting an increase of adipogenic differentiation at the expense of osteogenic differentiation. Copyright © 2017. Published by Elsevier Masson SAS.

  4. Understanding phylogenetic incongruence: lessons from phyllostomid bats

    PubMed Central

    Dávalos, Liliana M; Cirranello, Andrea L; Geisler, Jonathan H; Simmons, Nancy B

    2012-01-01

    All characters and trait systems in an organism share a common evolutionary history that can be estimated using phylogenetic methods. However, differential rates of change and the evolutionary mechanisms driving those rates result in pervasive phylogenetic conflict. These drivers need to be uncovered because mismatches between evolutionary processes and phylogenetic models can lead to high confidence in incorrect hypotheses. Incongruence between phylogenies derived from morphological versus molecular analyses, and between trees based on different subsets of molecular sequences has become pervasive as datasets have expanded rapidly in both characters and species. For more than a decade, evolutionary relationships among members of the New World bat family Phyllostomidae inferred from morphological and molecular data have been in conflict. Here, we develop and apply methods to minimize systematic biases, uncover the biological mechanisms underlying phylogenetic conflict, and outline data requirements for future phylogenomic and morphological data collection. We introduce new morphological data for phyllostomids and outgroups and expand previous molecular analyses to eliminate methodological sources of phylogenetic conflict such as taxonomic sampling, sparse character sampling, or use of different algorithms to estimate the phylogeny. We also evaluate the impact of biological sources of conflict: saturation in morphological changes and molecular substitutions, and other processes that result in incongruent trees, including convergent morphological and molecular evolution. Methodological sources of incongruence play some role in generating phylogenetic conflict, and are relatively easy to eliminate by matching taxa, collecting more characters, and applying the same algorithms to optimize phylogeny. 
The evolutionary patterns uncovered are consistent with multiple biological sources of conflict, including saturation in morphological and molecular changes, adaptive morphological convergence among nectar-feeding lineages, and incongruent gene trees. Applying methods to account for nucleotide sequence saturation reduces, but does not completely eliminate, phylogenetic conflict. We ruled out paralogy, lateral gene transfer, and poor taxon sampling and outgroup choices among the processes leading to incongruent gene trees in phyllostomid bats. Uncovering and countering the possible effects of introgression and lineage sorting of ancestral polymorphism on gene trees will require great leaps in genomic and allelic sequencing in this species-rich mammalian family. We also found evidence for adaptive molecular evolution leading to convergence in mitochondrial proteins among nectar-feeding lineages. In conclusion, the biological processes that generate phylogenetic conflict are ubiquitous, and overcoming incongruence requires better models and more data than have been collected even in well-studied organisms such as phyllostomid bats. PMID:22891620

  5. Non-Syndromic Recurrent Multiple Odontogenic Keratocysts: A Case Report

    PubMed Central

    Bartake, AR.; Shreekanth, NG.; Prabhu, S.; Gopalkrishnan, K.

    2011-01-01

    Odontogenic keratocysts (OKCs) are one of the most frequent features of nevoid basal cell carcinoma syndrome (NBS). It is linked with mutation in the PTCH gene. Partial expression of the gene may result in occurrence of only multiple recurring OKC. Our patient presented with nine cysts with multiple recurrences over a period of 11 years without any other manifestation of the syndrome. PMID:21998815

  6. Multiple abiotic stimuli are integrated in the regulation of rice gene expression under field conditions.

    PubMed

    Plessis, Anne; Hafemeister, Christoph; Wilkins, Olivia; Gonzaga, Zennia Jean; Meyer, Rachel Sarah; Pires, Inês; Müller, Christian; Septiningsih, Endang M; Bonneau, Richard; Purugganan, Michael

    2015-11-26

    Plants rely on transcriptional dynamics to respond to multiple climatic fluctuations and contexts in nature. We analyzed the genome-wide gene expression patterns of rice (Oryza sativa) growing in rainfed and irrigated fields during two distinct tropical seasons and determined simple linear models that relate transcriptomic variation to climatic fluctuations. These models combine multiple environmental parameters to account for patterns of expression in the field of co-expressed gene clusters. We examined the similarities of our environmental models between tropical and temperate field conditions, using previously published data. We found that field type and macroclimate had broad impacts on transcriptional responses to environmental fluctuations, especially for genes involved in photosynthesis and development. Nevertheless, variation in solar radiation and temperature at the timescale of hours had reproducible effects across environmental contexts. These results provide a basis for broad-based predictive modeling of plant gene expression in the field.

  7. Role of the horizontal gene exchange in evolution of pathogenic Mycobacteria.

    PubMed

    Reva, Oleg; Korotetskiy, Ilya; Ilin, Aleksandr

    2015-01-01

    Mycobacterium tuberculosis is one of the most dangerous human pathogens, the causative agent of tuberculosis. While this pathogen is considered as extremely clonal and resistant to horizontal gene exchange, there are many facts supporting the hypothesis that in the early stages of evolution the development of pathogenicity of ancestral Mtb has started with a horizontal acquisition of virulence factors. Episodes of infections caused by non-tuberculosis Mycobacteria reported worldwide may suggest a potential for new pathogens to appear. If so, what is the role of horizontal gene transfer in this process? Availing of accessibility of complete genome sequences of multiple pathogenic, conditionally pathogenic and saprophytic Mycobacteria, a genome comparative study was performed to investigate the distribution of genomic islands among bacteria and identify ontological links between these mobile elements. It was shown that the ancient genomic islands from M. tuberculosis still may be rooted to the pool of mobile genetic vectors distributed among Mycobacteria. A frequent exchange of genes was observed between M. marinum and several saprophytic and conditionally pathogenic species. Among them M. avium was the most promiscuous species acquiring genetic materials from diverse origins. Recent activation of genetic vectors circulating among Mycobacteria potentially may lead to emergence of new pathogens from environmental and conditionally pathogenic Mycobacteria. The species which require monitoring are M. marinum and M. avium as they eagerly acquire genes from different sources and may become donors of virulence gene cassettes to other micro-organisms.

  8. A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes

    PubMed Central

    2013-01-01

    Background: Ribonucleotide reductase (RNR), the enzyme responsible for the formation of deoxyribonucleotides from ribonucleotides, is found in all domains of life and many viral genomes. RNRs are also amongst the most abundant genes identified in environmental metagenomes. This study focused on understanding the distribution, diversity, and evolution of RNRs in phages (viruses that infect bacteria). Hidden Markov Model profiles were used to analyze the proteins encoded by 685 completely sequenced double-stranded DNA phages and 22 environmental viral metagenomes to identify RNR homologs in cultured phages and uncultured viral communities, respectively. Results: RNRs were identified in 128 phage genomes, nearly tripling the number of phages known to encode RNRs. Class I RNR was the most common RNR class observed in phages (70%), followed by class II (29%) and class III (28%). Twenty-eight percent of the phages contained genes belonging to multiple RNR classes. RNR class distribution varied according to phage type, isolation environment, and the host’s ability to utilize oxygen. The majority of the phages containing RNRs are Myoviridae (65%), followed by Siphoviridae (30%) and Podoviridae (3%). The phylogeny and genomic organization of phage and host RNRs reveal several distinct evolutionary scenarios involving horizontal gene transfer, co-evolution, and differential selection pressure. Several putative split RNR genes interrupted by self-splicing introns or inteins were identified, providing further evidence for the role of frequent genetic exchange. Finally, viral metagenomic data indicate that RNRs are prevalent and highly dynamic in uncultured viral communities, necessitating future research to determine the environmental conditions under which RNRs provide a selective advantage. Conclusions: This comprehensive study describes the distribution, diversity, and evolution of RNRs in phage genomes and environmental viral metagenomes.
The distinct distributions of specific RNR classes amongst phages, combined with the various evolutionary scenarios predicted from RNR phylogenies suggest multiple inheritance sources and different selective forces for RNRs in phages. This study significantly improves our understanding of phage RNRs, providing insight into the diversity and evolution of this important auxiliary metabolic gene as well as the evolution of phages in response to their bacterial hosts and environments. PMID:23391036

  9. A phylotranscriptomic backbone of the orb-weaving spider family Araneidae (Arachnida, Araneae) supported by multiple methodological approaches.

    PubMed

    Kallal, Robert J; Fernández, Rosa; Giribet, Gonzalo; Hormiga, Gustavo

    2018-04-07

    The orb-weaving spider family Araneidae is extremely diverse (>3100 spp.) and its members can be charismatic terrestrial arthropods, many of them recognizable by their iconic orbicular snare web, such as the common garden spiders. Despite considerable effort to better understand their backbone relationships based on multiple sources of data (morphological, behavioral and molecular), pervasive low support remains in recent studies. In addition, no overarching phylogeny of araneids is available to date, hampering further comparative work. In this study, we analyze the transcriptomes of 33 taxa, including 19 araneids - 12 of them new to this study - representing most of the core family lineages, to examine the relationships within the family using genomic-scale datasets resulting from various methodological treatments, namely ortholog selection and gene occupancy as a measure of matrix completion. Six matrices were constructed to assess these effects by varying orthology inference method and gene occupancy threshold. Orthology methods used are the benchmarking tool BUSCO and the tree-based method UPhO; three gene occupancy thresholds (45%, 65%, 85%) were used to assess the effect of missing data. Gene tree and species tree-based methods (including multi-species coalescent and concatenation approaches, as well as maximum likelihood and Bayesian inference) were used totalling 17 analytical treatments. The monophyly of Araneidae and the placement of core araneid lineages were supported, together with some previously unsound backbone divergences; these include high support for Zygiellinae as the earliest diverging subfamily (followed by Nephilinae), the placement of Gasteracanthinae as sister group to Cyclosa and close relatives, and close relationships between the Araneus + Neoscona clade and Cyrtophorinae + Argiopinae clade. Incongruences were relegated to short branches in the clade comprising Cyclosa and its close relatives. 
We found congruence between most of the completed analyses, with minimal topological effects from occupancy/missing data and orthology assessment. The resulting number of genes by certain combinations of orthology and occupancy thresholds being analyzed had the greatest effect on the resulting trees, with anomalous outcomes recovered from analysis of lower numbers of genes. Copyright © 2018 Elsevier Inc. All rights reserved.
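
    The gene occupancy thresholds described in this record can be illustrated with a minimal sketch. It assumes occupancy means the fraction of taxa in which an ortholog was recovered and that a gene is kept when its occupancy is at least the threshold; the function name, the toy ortholog and taxon labels, and the inclusive comparison are illustrative assumptions, not the authors' pipeline.

```python
def filter_by_occupancy(presence, n_taxa, threshold):
    """Return genes whose occupancy (fraction of taxa with a sequence)
    is at least `threshold`. Inclusive comparison is an assumption."""
    kept = []
    for gene, taxa in presence.items():
        occupancy = len(taxa) / n_taxa
        if occupancy >= threshold:
            kept.append(gene)
    return sorted(kept)


# Hypothetical toy data: ortholog -> set of taxa in which it was recovered.
presence = {
    "og1": {"t1", "t2", "t3", "t4"},  # occupancy 1.00
    "og2": {"t1", "t2", "t3"},        # occupancy 0.75
    "og3": {"t1", "t2"},              # occupancy 0.50
    "og4": {"t1"},                    # occupancy 0.25
}

# One gene list per occupancy threshold named in the abstract.
matrices = {t: filter_by_occupancy(presence, 4, t) for t in (0.45, 0.65, 0.85)}
```

    Raising the threshold reduces missing data but also shrinks the gene count, which is the trade-off the abstract reports as having the largest effect on the resulting trees.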

  10. Changing the Game: Using Integrative Genomics to Probe Virulence Mechanisms of the Stem Rust Pathogen Puccinia graminis f. sp. tritici.

    PubMed

    Figueroa, Melania; Upadhyaya, Narayana M; Sperschneider, Jana; Park, Robert F; Szabo, Les J; Steffenson, Brian; Ellis, Jeff G; Dodds, Peter N

    2016-01-01

    The recent resurgence of wheat stem rust caused by new virulent races of Puccinia graminis f. sp. tritici (Pgt) poses a threat to food security. These concerns have catalyzed an extensive global effort toward controlling this disease. Substantial research and breeding programs target the identification and introduction of new stem rust resistance (Sr) genes in cultivars for genetic protection against the disease. Such resistance genes typically encode immune receptor proteins that recognize specific components of the pathogen, known as avirulence (Avr) proteins. A significant drawback to deploying cultivars with single Sr genes is that they are often overcome by evolution of the pathogen to escape recognition through alterations in Avr genes. Thus, a key element in achieving durable rust control is the deployment of multiple effective Sr genes in combination, either through conventional breeding or transgenic approaches, to minimize the risk of resistance breakdown. In this situation, evolution of pathogen virulence would require changes in multiple Avr genes in order to bypass recognition. However, choosing the optimal Sr gene combinations to deploy is a challenge that requires detailed knowledge of the pathogen Avr genes with which they interact and the virulence phenotypes of Pgt existing in nature. Identifying specific Avr genes from Pgt will provide screening tools to enhance pathogen virulence monitoring, assess heterozygosity and propensity for mutation in pathogen populations, and confirm individual Sr gene functions in crop varieties carrying multiple effective resistance genes. Toward this goal, much progress has been made in assembling a high quality reference genome sequence for Pgt, as well as a Pan-genome encompassing variation between multiple field isolates with diverse virulence spectra. 
In turn this has allowed prediction of Pgt effector gene candidates based on known features of Avr genes in other plant pathogens, including the related flax rust fungus. Upregulation of gene expression in haustoria and evidence for diversifying selection are two useful parameters to identify candidate Avr genes. Recently, we have also applied machine learning approaches to agnostically predict candidate effectors. Here, we review progress in stem rust pathogenomics and approaches currently underway to identify Avr genes recognized by wheat Sr genes.

  11. Accurate and fast multiple-testing correction in eQTL studies.

    PubMed

    Sul, Jae Hoon; Raj, Towfique; de Jong, Simone; de Bakker, Paul I W; Raychaudhuri, Soumya; Ophoff, Roel A; Stranger, Barbara E; Eskin, Eleazar; Han, Buhm

    2015-06-04

    In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. For performing multiple-testing correction, a permutation test is widely used. Because of growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck in eQTL studies. In this paper, we propose an efficient approach for correcting for multiple testing and assess eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We have shown that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
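
    The general idea of replacing permutations with draws from a multivariate normal distribution can be sketched as follows. This is a plain Monte Carlo illustration under stated assumptions, not the authors' method: it assumes two-sided z statistics whose null correlation equals the LD correlation matrix, and it omits the small-sample correction the abstract mentions. The function name and parameters are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def mvn_gene_level_pvalue(min_p, ld_corr, n_samples=100_000, seed=0):
    """Estimate a gene-level p value for the smallest cis-variant p value.

    Null z statistics are drawn from N(0, ld_corr); the corrected p value is
    the fraction of draws whose largest |z| reaches the observed one.
    """
    ld_corr = np.asarray(ld_corr, dtype=float)
    rng = np.random.default_rng(seed)
    # Two-sided z score corresponding to the observed minimum p value.
    z_obs = NormalDist().inv_cdf(1.0 - min_p / 2.0)
    z = rng.multivariate_normal(np.zeros(ld_corr.shape[0]), ld_corr, size=n_samples)
    max_abs = np.abs(z).max(axis=1)
    exceed = int((max_abs >= z_obs).sum())
    # Add-one smoothing keeps the estimate strictly positive.
    return (exceed + 1) / (n_samples + 1)
```

    The sampling cost depends on the number of variants and draws rather than on the number of individuals, which is consistent with the abstract's statement that the time complexity is independent of sample size. With uncorrelated variants the estimate approaches 1 - (1 - p)^m for m variants.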

  12. Constitutive Expression of Short Hairpin RNA in Vivo Triggers Buildup of Mature Hairpin Molecules

    PubMed Central

    Ahn, M.; Witting, S.R.; Ruiz, R.; Saxena, R.

    2011-01-01

    RNA interference (RNAi) has become the cornerstone technology for studying gene function in mammalian cells. In addition, it is a promising therapeutic treatment for multiple human diseases. Virus-mediated constitutive expression of short hairpin RNA (shRNA) has the potential to provide a permanent source of silencing molecules to tissues, and it is being devised as a strategy for the treatment of liver conditions such as hepatitis B and hepatitis C virus infection. Unintended interaction between silencing molecules and cellular components, leading to toxic effects, has been described in vitro. Despite the enormous interest in using the RNAi technology for in vivo applications, little is known about the safety of constitutively expressing shRNA for multiple weeks. Here we report the effects of in vivo shRNA expression, using helper-dependent adenoviral vectors. We show that gene-specific knockdown is maintained for at least 6 weeks after injection of 1 × 10^11 viral particles. Nonetheless, accumulation of mature shRNA molecules was observed up to weeks 3 and 4, and then declined gradually, suggesting the buildup of mature shRNA molecules induced cell death with concomitant loss of viral DNA and shRNA expression. No evidence of well-characterized innate immunity activation (such as interferon production) or saturation of the exportin-5 pathway was observed. Overall, our data suggest constitutive expression of shRNA results in accumulation of mature shRNA molecules, inducing cellular toxicity at late time points, despite the presence of gene silencing. PMID:21780944

  13. Breed relationships facilitate fine-mapping studies: A 7.8-kb deletion cosegregates with Collie eye anomaly across multiple dog breeds

    PubMed Central

    Parker, Heidi G.; Kukekova, Anna V.; Akey, Dayna T.; Goldstein, Orly; Kirkness, Ewen F.; Baysac, Kathleen C.; Mosher, Dana S.; Aguirre, Gustavo D.; Acland, Gregory M.; Ostrander, Elaine A.

    2007-01-01

    The features of modern dog breeds that increase the ease of mapping common diseases, such as reduced heterogeneity and extensive linkage disequilibrium, may also increase the difficulty associated with fine mapping and identifying causative mutations. One way to address this problem is by combining data from multiple breeds segregating the same trait after initial linkage has been determined. The multibreed approach increases the number of potentially informative recombination events and reduces the size of the critical haplotype by taking advantage of shortened linkage disequilibrium distances found across breeds. In order to identify breeds that likely share a trait inherited from the same ancestral source, we have used cluster analysis to divide 132 breeds of dog into five primary breed groups. We then use the multibreed approach to fine-map Collie eye anomaly (cea), a complex disorder of ocular development that was initially mapped to a 3.9-cM region on canine chromosome 37. Combined genotypes from affected individuals from four breeds of a single breed group significantly narrowed the candidate gene region to a 103-kb interval spanning only four genes. Sequence analysis revealed that all affected dogs share a homozygous deletion of 7.8 kb in the NHEJ1 gene. This intronic deletion spans a highly conserved binding domain to which several developmentally important proteins bind. This work both establishes that the primary cea mutation arose as a single disease allele in a common ancestor of herding breeds as well as highlights the value of comparative population analysis for refining regions of linkage. PMID:17916641

  14. Boosting Probabilistic Graphical Model Inference by Incorporating Prior Knowledge from Multiple Sources

    PubMed Central

    Praveen, Paurush; Fröhlich, Holger

    2013-01-01

    Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available. PMID:23826291
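
    The Noisy-OR combination mentioned in this record can be illustrated with the textbook noisy-OR formula, P = 1 - prod(1 - c_i). This is a minimal sketch that assumes each information source contributes an independent confidence c_i in [0, 1] for a given interaction; the function name is hypothetical and the authors' exact parameterization may differ.

```python
def noisy_or(confidences):
    """Combine per-source confidences for one edge with the standard noisy-OR:
    the edge is unsupported only if every source fails to support it."""
    prob_no_support = 1.0
    for c in confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        prob_no_support *= 1.0 - c
    return 1.0 - prob_no_support


# Hypothetical confidences for one interaction from three sources.
edge_prior = noisy_or([0.9, 0.1, 0.0])
```

    Because the result can never fall below the largest single confidence, the strongest source dominates the combined prior, which matches the abstract's description of picking up the strongest support.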

  15. Gene-gene-environment interactions between drugs, transporters, receptors, and metabolizing enzymes: Statins, SLCO1B1, and CYP3A4 as an example.

    PubMed

    Sadee, Wolfgang

    2013-09-01

    Pharmacogenetic biomarker tests include mostly specific single gene-drug pairs, capable of accounting for a portion of interindividual variability in drug response and toxicity. However, multiple genes are likely to contribute, either acting independently or epistatically, with the CYP2C9-VKORC1-warfarin test panel being an example of a clinically used gene-gene-drug interaction. I discuss here further instances of gene-gene-drug interactions, including a proposed dynamic effect on statin therapy by genetic variants in both a transporter (SLCO1B1) and a metabolizing enzyme (CYP3A4) in liver cells, the main target site where statins block cholesterol synthesis. These examples set a conceptual framework for developing diagnostic panels involving multiple gene-drug combinations. Copyright © 2013 Wiley Periodicals, Inc.

  16. The construction of cDNA library and the screening of related antigen of ascitic tumor cells of ovarian cancer.

    PubMed

    Hou, Q; Chen, K; Shan, Z

    2015-01-01

    To construct the cDNA library of the ascites tumor cells of ovarian cancer, which can be used to screen the related antigen for the early diagnosis of ovarian cancer and therapeutic targets of immune treatment. Ascitic tumor cells from four cases of ovarian serous cystadenocarcinoma, two cases of ovarian mucinous cystadenocarcinoma, and two cases of ovarian endometrial carcinoma were used to construct the cDNA library. To screen the ovarian cancer antigen gene, evaluate the enzyme, and analyze nucleotide sequence, serological analysis of recombinant tumor cDNA expression libraries (SEREX) and suppression subtractive hybridization (SSH) techniques were utilized. The detection method of recombinant expression-based serological mini-arrays (SMARTA) was used to detect the ovarian cancer antigen and the positive reaction of 105 cases of ovarian cancer patients and 105 normal women's autoantibodies correspondingly in serum. After two rounds of serologic screening and glycosides sequencing analysis, 59 candidates of ovarian cancer antigen gene fragments were finally identified, which corresponded to 50 genes. They were then divided into six categories: (1) the homologous genes which related to the known ovarian cancer genes, such as BARD1 gene, etc; (2) the homologous genes which were associated with other tumors, such as TM4SF1 gene, etc; (3) the genes which were expressed in specific tissues, such as ILF3, FXR1 gene, etc; (4) the genes which were the same with some protein genes of special function, such as TIZ, C1D gene; (5) the homologous genes which possessed the same source with embryonic genes, such as PKHD1 gene, etc; (6) the remaining genes were the unknown genes without the homologous sequence in the gene pool, such as OV-189 genes. 
SEREX technology combined with the SSH method is an effective research strategy which can filter tumor antigens with high specificity; the corresponding serum autoantibodies to the recombinant antigens of the TM4SF1, C1D, TIZ, BARD1, FXR1, and OV-189 genes can be regarded as biomarkers for diagnosing ovarian cancer. The combination of multiple antigen detection can improve diagnostic efficiency.

  17. TGFβ Pathway Inhibition Redifferentiates Human Pancreatic Islet β Cells Expanded In Vitro

    PubMed Central

    Toren-Haritan, Ginat; Efrat, Shimon

    2015-01-01

    In-vitro expansion of insulin-producing cells from adult human pancreatic islets could provide an abundant cell source for diabetes therapy. However, proliferation of β-cell-derived (BCD) cells is associated with loss of phenotype and epithelial-mesenchymal transition (EMT). Nevertheless, BCD cells maintain open chromatin structure at β-cell genes, suggesting that they could be readily redifferentiated. The transforming growth factor β (TGFβ) pathway has been implicated in EMT in a range of cell types. Here we show that human islet cell expansion in vitro involves upregulation of the TGFβ pathway. Blocking TGFβ pathway activation using short hairpin RNA (shRNA) against TGFβ Receptor 1 (TGFBR1, ALK5) transcripts inhibits BCD cell proliferation and dedifferentiation. Treatment of expanded BCD cells with ALK5 shRNA results in their redifferentiation, as judged by expression of β-cell genes and decreased cell proliferation. These effects, which are reproducible in cells from multiple human donors, are mediated, at least in part, by AKT-FOXO1 signaling. ALK5 inhibition synergizes with a soluble factor cocktail to promote BCD cell redifferentiation. The combined treatment may offer a therapeutically applicable way for generating an abundant source of functional insulin-producing cells following ex-vivo expansion. PMID:26418361

  18. Carbon: Nitrogen Interaction Regulates Expression of Genes Involved in N-Uptake and Assimilation in Brassica juncea L.

    PubMed Central

    Goel, Parul; Bhuria, Monika; Kaushal, Mamta

    2016-01-01

    In plants, several cellular and metabolic pathways interact with each other to regulate processes that are vital for their growth and development. Carbon (C) and Nitrogen (N) are two main nutrients for plants and coordination of C and N pathways is an important factor for maintaining plant growth and development. In the present work, influence of nitrogen and sucrose (C source) on growth parameters and expression of genes involved in nitrogen transport and assimilatory pathways was studied in B. juncea seedlings. For this, B. juncea seedlings were treated with four combinations of C and N source viz., N source alone (-Suc+N), C source alone (+Suc-N), with N and C source (+Suc+N) or without N and C source (-Suc-N). Cotyledon size and shoot length were found to be increased in seedlings, when nitrogen alone was present in the medium. Distinct expression pattern of genes in both, root and shoot tissues was observed in response to exogenously supplied N and C. The presence or depletion of nitrogen alone in the medium leads to severe up- or down-regulation of key genes involved in N-uptake and transport (BjNRT1.1, BjNRT1.8) in root tissue and genes involved in nitrate reduction (BjNR1 and BjNR2) in shoot tissue. Moreover, expression of several genes, like BjAMT1.2, BjAMT2 and BjPK in root and two genes BjAMT2 and BjGS1.1 in shoot were found to be regulated only when C source was present in the medium. Majority of genes were found to respond in root and shoot tissues, when both C and N source were present in the medium, thus reflecting their importance as a signal in regulating expression of genes involved in N-uptake and assimilation. The present work provides insight into the regulation of genes of N-uptake and assimilatory pathway in B. juncea by interaction of both carbon and nitrogen. PMID:27637072

  19. Constraints on genes shape long-term conservation of macro-synteny in metazoan genomes.

    PubMed

    Lv, Jie; Havlak, Paul; Putnam, Nicholas H

    2011-10-05

    Many metazoan genomes conserve chromosome-scale gene linkage relationships ("macro-synteny") from the common ancestor of multicellular animal life [1-4], but the biological explanation for this conservation is still unknown. Double cut and join (DCJ) is a simple, well-studied model of neutral genome evolution amenable to both simulation and mathematical analysis [5], but as we show here, it is not sufficient to explain long-term macro-synteny conservation. We examine a family of simple (one-parameter) extensions of DCJ to identify models and choices of parameters consistent with the levels of macro- and micro-synteny conservation observed among animal genomes. Our software implements a flexible strategy for incorporating various types of genomic context into the DCJ model ("DCJ-[C]"), and is available as open source software from http://github.com/putnamlab/dcj-c. A simple model of genome evolution, in which DCJ moves are allowed only if they maintain chromosomal linkage among a set of constrained genes, can simultaneously account for the level of macro-synteny conservation and for correlated conservation among multiple pairs of species. Simulations under this model indicate that a constraint on approximately 7% of metazoan genes is sufficient to constrain genome rearrangement to an average rate of 25 inversions and 1.7 translocations per million years.
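
    The constrained model described in this record (a rearrangement is accepted only if it keeps constrained genes on the same chromosomes) can be sketched for one kind of DCJ move, a reciprocal translocation. This is a simplified illustration, not the DCJ-[C] software: the genome representation (lists of signed gene ids), the function names, and the restriction to translocations between two distinct linear chromosomes are all assumptions made for the example.

```python
def linkage_groups(genome, constrained):
    """Set of constrained-gene groups, one group per chromosome that has any."""
    groups = set()
    for chrom in genome:
        members = frozenset(abs(g) for g in chrom if abs(g) in constrained)
        if members:
            groups.add(members)
    return groups


def try_translocation(genome, a, b, i, j, constrained):
    """Exchange the tails of chromosomes a and b (a != b) at cut points i and j.

    Returns the rearranged genome, or None when the move would change which
    constrained genes share a chromosome.
    """
    new_a = genome[a][:i] + genome[b][j:]
    new_b = genome[b][:j] + genome[a][i:]
    candidate = list(genome)
    candidate[a], candidate[b] = new_a, new_b
    if linkage_groups(candidate, constrained) != linkage_groups(genome, constrained):
        return None  # rejected: constrained linkage would be broken
    return candidate


# Toy genome: two chromosomes; genes 1, 2, 5 and 6 are constrained.
genome = [[1, 2, 3, 4], [5, 6, 7, 8]]
constrained = {1, 2, 5, 6}
accepted = try_translocation(genome, 0, 1, 2, 2, constrained)
rejected = try_translocation(genome, 0, 1, 1, 2, constrained)
```

    In the toy genome, swapping tails after the second gene of each chromosome keeps genes 1 and 2 together and genes 5 and 6 together, so the move is accepted; cutting between genes 1 and 2 separates them and the move is rejected.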

  20. Origin and Evolution of Allopolyploid Wheatgrass Elymus fibrosus (Schrenk) Tzvelev (Poaceae: Triticeae) Reveals the Effect of Its Origination on Genetic Diversity

    PubMed Central

    Gu, Hai-Lan; Wu, Pan-Pan; Yi, Xu; Wang, Wei-Jie; Shi, Han-Feng; Wu, De-Xiang; Sun, Genlou

    2016-01-01

    The origin and evolution of tetraploid Elymus fibrosus (Schrenk) Tzvelev were characterized using the low-copy nuclear gene Rpb2 (the second largest subunit of RNA polymerase II) and the chloroplast region trnL–trnF (spacer between the tRNA-Leu (UAA) gene and the tRNA-Phe (GAA) gene). Ten accessions of E. fibrosus along with 19 Elymus species with StH genomic constitution and diploid species in the tribe Triticeae were analyzed. Chloroplast trnL–trnF sequence data suggested that Pseudoroegneria (St genome) was the maternal donor of E. fibrosus. Rpb2 data confirmed the presence of StH genomes in E. fibrosus, and suggested that the St and H genomes in E. fibrosus each more likely originated from a single gene pool. The single origin of E. fibrosus might be one of the reasons why genetic diversity in E. fibrosus is lower than in E. caninus and E. trachycaulus, which have ecological preferences and breeding systems similar to those of E. fibrosus and each originated from multiple sources. Convergent evolution of St and H copy Rpb2 sequences in some accessions of E. fibrosus might have occurred during the evolutionary history of this allotetraploid. PMID:27936163

  1. Common Variation in the DOPA Decarboxylase (DDC) Gene and Human Striatal DDC Activity In Vivo.

    PubMed

    Eisenberg, Daniel P; Kohn, Philip D; Hegarty, Catherine E; Ianni, Angela M; Kolachana, Bhaskar; Gregory, Michael D; Masdeu, Joseph C; Berman, Karen F

    2016-08-01

    The synthesis of multiple amine neurotransmitters, such as dopamine, norepinephrine, serotonin, and trace amines, relies in part on DOPA decarboxylase (DDC, AADC), an enzyme that is required for normative neural operations. Because rare, loss-of-function mutations in the DDC gene result in severe enzymatic deficiency and devastating autonomic, motor, and cognitive impairment, DDC common genetic polymorphisms have been proposed as a source of more moderate, but clinically important, alterations in DDC function that may contribute to risk, course, or treatment response in complex, heritable neuropsychiatric illnesses. However, a direct link between common genetic variation in DDC and DDC activity in the living human brain has never been established. We therefore tested for this association by conducting extensive genotyping across the DDC gene in a large cohort of 120 healthy individuals, for whom DDC activity was then quantified with [(18)F]-FDOPA positron emission tomography (PET). The specific uptake constant, Ki, a measure of DDC activity, was estimated for striatal regions of interest and found to be predicted by one of five tested haplotypes, particularly in the ventral striatum. These data provide evidence for cis-acting, functional common polymorphisms in the DDC gene and support future work to determine whether such variation might meaningfully contribute to DDC-mediated neural processes relevant to neuropsychiatric illness and treatment.

  2. On the possibility of the multiple inductively coupled plasma and helicon plasma sources for large-area processes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Jin-Won; Lee, Yun-Seong, E-mail: leeeeys@kaist.ac.kr; Chang, Hong-Young

    2014-08-15

    In this study, we attempted to determine the possibility of multiple inductively coupled plasma (ICP) and helicon plasma sources for large-area processes. Experiments were performed with one and two coils to measure plasma and electrical parameters, and a circuit simulation was performed to measure the current at each coil in the 2-coil experiment. Based on the result, we could determine the possibility of multiple ICP sources due to a direct change of impedance due to current and saturation of impedance due to the skin-depth effect. However, a helicon plasma source is difficult to adapt to multiple sources due to the consistent change of real impedance due to mode transition and the low uniformity of the B-field confinement. As a result, it is expected that ICP can be adapted to multiple sources for large-area processes.

  3. ABC transporters and the proteasome complex are implicated in susceptibility to Stevens-Johnson syndrome and toxic epidermal necrolysis across multiple drugs.

    PubMed

    Nicoletti, Paola; Bansal, Mukesh; Lefebvre, Celine; Guarnieri, Paolo; Shen, Yufeng; Pe'er, Itsik; Califano, Andrea; Floratos, Aris

    2015-01-01

    Stevens-Johnson syndrome (SJS) and Toxic Epidermal Necrolysis (TEN) represent rare but serious adverse drug reactions (ADRs). Both are characterized by distinctive blistering lesions and significant mortality rates. While there is evidence for strong drug-specific genetic predisposition related to HLA alleles, recent genome wide association studies (GWAS) on European and Asian populations have failed to identify genetic susceptibility alleles that are common across multiple drugs. We hypothesize that this is a consequence of the low to moderate effect size of individual genetic risk factors. To test this hypothesis we developed Pointer, a new algorithm that assesses the aggregate effect of multiple low risk variants on a pathway using a gene set enrichment approach. A key advantage of our method is the capability to associate SNPs with genes by exploiting physical proximity as well as by using expression quantitative trait loci (eQTLs) that capture information about both cis- and trans-acting regulatory effects. We control for known bias-inducing aspects of enrichment based analyses, such as: 1) gene length, 2) gene set size, 3) presence of biologically related genes within the same linkage disequilibrium (LD) region, and, 4) genes shared among multiple gene sets. We applied this approach to publicly available SJS/TEN genome-wide genotype data and identified the ABC transporter and Proteasome pathways as potentially implicated in the genetic susceptibility of non-drug-specific SJS/TEN. We demonstrated that the innovative SNP-to-gene mapping phase of the method was essential in detecting the significant enrichment for those pathways. Analysis of an independent gene expression dataset provides supportive functional evidence for the involvement of Proteasome pathways in SJS/TEN cutaneous lesions. 
These results suggest that Pointer provides a useful framework for the integrative analysis of pharmacogenetic GWAS data, by increasing the power to detect aggregate effects of multiple low risk variants. The software is available for download at https://sourceforge.net/projects/pointergsa/.
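
    The SNP-to-gene mapping step described in this record (physical proximity plus eQTL links) can be sketched as below. This is an illustration under stated assumptions, not the Pointer implementation: the function name, the 20 kb flanking window, the input layouts, and the toy identifiers are all hypothetical.

```python
def map_snps_to_genes(snps, genes, eqtl_pairs=(), window=20_000):
    """Assign each SNP the genes it may affect.

    snps:       dict snp_id -> (chromosome, position)
    genes:      dict gene_id -> (chromosome, start, end)
    eqtl_pairs: iterable of (snp_id, gene_id) links, cis or trans
    window:     flanking distance in bp counted as "proximal" (assumed value)
    """
    mapping = {snp: set() for snp in snps}
    # Proximity rule: SNP lies within the gene body or its flanking window.
    for snp, (chrom, pos) in snps.items():
        for gene, (g_chrom, start, end) in genes.items():
            if g_chrom == chrom and start - window <= pos <= end + window:
                mapping[snp].add(gene)
    # eQTL rule: add regulatory links regardless of genomic distance.
    for snp, gene in eqtl_pairs:
        if snp in mapping:
            mapping[snp].add(gene)
    return mapping


# Toy inputs with made-up identifiers and coordinates.
snps = {"snpA": ("1", 1_000), "snpB": ("1", 90_000), "snpC": ("2", 500)}
genes = {"G1": ("1", 5_000, 8_000), "G2": ("2", 100_000, 120_000)}
mapping = map_snps_to_genes(snps, genes, eqtl_pairs=[("snpC", "G1")])
```

    In the toy data, snpC sits far from any gene on its own chromosome and is linked to G1 only through the trans-eQTL pair, the kind of association that a proximity-only mapping would miss.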

  4. No evidence for the use of DIR, D–D fusions, chromosome 15 open reading frames or VH replacement in the peripheral repertoire was found on application of an improved algorithm, JointML, to 6329 human immunoglobulin H rearrangements

    PubMed Central

    Ohm-Laursen, Line; Nielsen, Morten; Larsen, Stine R; Barington, Torben

    2006-01-01

    Antibody diversity is created by imprecise joining of the variability (V), diversity (D) and joining (J) gene segments of the heavy and light chain loci. Analysis of rearrangements is complicated by somatic hypermutations and uncertainty concerning the sources of gene segments and the precise way in which they recombine. It has been suggested that D genes with irregular recombination signal sequences (DIR) and chromosome 15 open reading frames (OR15) can replace conventional D genes, that two D genes or inverted D genes may be used and that the repertoire can be further diversified by heavy chain V gene (VH) replacement. Safe conclusions require large, well-defined sequence samples and algorithms minimizing stochastic assignment of segments. Two computer programs were developed for analysis of heavy chain joints. JointHMM is a profile hidden Markov model, while JointML is a maximum-likelihood-based method taking the lengths of the joint and the mutational status of the VH gene into account. The programs were applied to a set of 6329 clonally unrelated rearrangements. A conventional D gene was found in 80% of unmutated sequences and 64% of mutated sequences, while D-gene assignment was kept below 5% in artificial (randomly permutated) rearrangements. No evidence for the use of DIR, OR15, multiple D genes or VH replacements was found, while inverted D genes were used in less than 1‰ of the sequences. JointML was shown to have a higher predictive performance for D-gene assignment in mutated and unmutated sequences than four other publicly available programs. An online version 1·0 of JointML is available at http://www.cbs.dtu.dk/services/VDJsolver. PMID:17005006

  5. Development of 25 near-isogenic lines (NILs) with ten BPH resistance genes in rice (Oryza sativa L.): production, resistance spectrum, and molecular analysis.

    PubMed

    Jena, Kshirod K; Hechanova, Sherry Lou; Verdeprado, Holden; Prahalada, G D; Kim, Sung-Ryul

    2017-11-01

    A first set of 25 NILs carrying ten BPH resistance genes and their pyramids was developed in the background of indica variety IR24 for insect resistance breeding in rice. Brown planthopper (Nilaparvata lugens Stal.) is one of the most destructive insect pests in rice. Development of near-isogenic lines (NILs) is an important strategy for genetic analysis of brown planthopper (BPH) resistance (R) genes and their deployment against diverse BPH populations. A set of 25 NILs with 9 single R genes and 16 multiple R gene combinations consisting of 11 two-gene pyramids and 5 three-gene pyramids in the genetic background of the susceptible indica rice cultivar IR24 was developed through marker-assisted selection. The linked DNA markers for each of the R genes were used for foreground selection and confirming the introgressed regions of the BPH R genes. Modified seed box screening and feeding rate of BPH were used to evaluate the spectrum of resistance. BPH reaction of each of the NILs carrying different single genes was variable at the antibiosis level with the four BPH populations of the Philippines. The NILs with two- to three-pyramided genes showed a stronger level of antibiosis (49.3-99.0%) against BPH populations compared with NILs with a single R gene (42.0-83.5%) and IR24 (10.0%). Background genotyping by high-density SNP markers revealed that most of the chromosome regions of the NILs (BC3F5) had IR24 genome recovery of 82.0-94.2%. Six major agronomic data of the NILs showed a phenotypically comparable agronomic performance with IR24. These newly developed NILs will be useful as new genetic resources for BPH resistance breeding and are valuable sources of genes in monitoring against the emerging BPH biotypes in different rice-growing countries.

  6. Comparison of Sewage and Animal Fecal Microbiomes by Using Oligotyping Reveals Potential Human Fecal Indicators in Multiple Taxonomic Groups

    PubMed Central

    Fisher, Jenny C.; Eren, A. Murat; Green, Hyatt C.; Shanks, Orin C.; Morrison, Hilary G.; Vineis, Joseph H.; Sogin, Mitchell L.

    2015-01-01

    Most DNA-based microbial source tracking (MST) approaches target host-associated organisms within the order Bacteroidales, but the gut microbiota of humans and other animals contain organisms from an array of other taxonomic groups that might provide indicators of fecal pollution sources. To discern between human and nonhuman fecal sources, we compared the V6 regions of the 16S rRNA genes detected in fecal samples from six animal hosts to those found in sewage (as a proxy for humans). We focused on 10 abundant genera and used oligotyping, which can detect subtle differences between rRNA gene sequences from ecologically distinct organisms. Our analysis showed clear patterns of differential oligotype distributions between sewage and animal samples. Over 100 oligotypes of human origin occurred preferentially in sewage samples, and 99 human oligotypes were sewage specific. Sequences represented by the sewage-specific oligotypes can be used individually for development of PCR-based assays or together with the oligotypes preferentially associated with sewage to implement a signature-based approach. Analysis of sewage from Spain and Brazil showed that the sewage-specific oligotypes identified in U.S. sewage have the potential to be used as global alternative indicators of human fecal pollution. Environmental samples with evidence of prior human fecal contamination had consistent ratios of sewage signature oligotypes that corresponded to the trends observed for sewage. Our methodology represents a promising approach to identifying new bacterial taxa for MST applications and further highlights the potential of the family Lachnospiraceae to provide human-specific markers. In addition to source tracking applications, the patterns of the fine-scale population structure within fecal taxa suggest a fundamental relationship between bacteria and their hosts. PMID:26231648

  7. Zoonotic Potential of Escherichia coli Isolates from Retail Chicken Meat Products and Eggs

    PubMed Central

    Mitchell, Natalie M.; Johnson, James R.; Johnston, Brian; Curtiss, Roy

    2014-01-01

    Chicken products are suspected as a source of extraintestinal pathogenic Escherichia coli (ExPEC), which causes diseases in humans. The zoonotic risk to humans from chicken-source E. coli is not fully elucidated. To clarify the zoonotic risk posed by ExPEC in chicken products and to fill existing knowledge gaps regarding ExPEC zoonosis, we evaluated the prevalence of ExPEC on shell eggs and compared virulence-associated phenotypes between ExPEC and non-ExPEC isolates from both chicken meat and eggs. The prevalence of ExPEC among egg-source isolates was low, i.e., 5/108 (4.7%). Based on combined genotypic and phenotypic screening results, multiple human and avian pathotypes were represented among the chicken-source ExPEC isolates, including avian-pathogenic E. coli (APEC), uropathogenic E. coli (UPEC), neonatal meningitis E. coli (NMEC), and sepsis-associated E. coli (SEPEC), as well as an undefined ExPEC group, which included isolates with fewer virulence factors than the APEC, UPEC, and NMEC isolates. These findings document a substantial prevalence of human-pathogenic ExPEC-associated genes and phenotypes among E. coli isolates from retail chicken products and identify key virulence traits that could be used for screening. PMID:25480753

  8. Pediatric Multiple Sclerosis: Genes, Environment, and a Comprehensive Therapeutic Approach.

    PubMed

    Cappa, Ryan; Theroux, Liana; Brenton, J Nicholas

    2017-10-01

    Pediatric multiple sclerosis is an increasingly recognized and studied disorder that accounts for 3% to 10% of all patients with multiple sclerosis. The risk for pediatric multiple sclerosis is thought to reflect a complex interplay between environmental and genetic risk factors. Environmental exposures, including sunlight (ultraviolet radiation, vitamin D levels), infections (Epstein-Barr virus), passive smoking, and obesity, have been identified as potential risk factors in youth. Genetic predisposition contributes to the risk of multiple sclerosis, and the major histocompatibility complex on chromosome 6 makes the single largest contribution to susceptibility to multiple sclerosis. With the use of large-scale genome-wide association studies, other non-major histocompatibility complex alleles have been identified as independent risk factors for the disease. The bridge between environment and genes likely lies in the study of epigenetic processes, which are environmentally-influenced mechanisms through which gene expression may be modified. This article will review these topics to provide a framework for discussion of a comprehensive approach to counseling and ultimately treating the pediatric patient with multiple sclerosis. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Modeling the functional genomics of autism using human neurons.

    PubMed

    Konopka, G; Wexler, E; Rosen, E; Mukamel, Z; Osborn, G E; Chen, L; Lu, D; Gao, F; Gao, K; Lowe, J K; Geschwind, D H

    2012-02-01

    Human neural progenitors from a variety of sources present new opportunities to model aspects of human neuropsychiatric disease in vitro. Such in vitro models provide the advantages of a human genetic background combined with rapid and easy manipulation, making them highly useful adjuncts to animal models. Here, we examined whether a human neuronal culture system could be utilized to assess the transcriptional program involved in human neural differentiation and to model some of the molecular features of a neurodevelopmental disorder, such as autism. Primary normal human neuronal progenitors (NHNPs) were differentiated into a post-mitotic neuronal state through addition of specific growth factors and whole-genome gene expression was examined throughout a time course of neuronal differentiation. After 4 weeks of differentiation, a significant number of genes associated with autism spectrum disorders (ASDs) are either induced or repressed. This includes the ASD susceptibility gene neurexin 1, which showed a distinct pattern from neurexin 3 in vitro, and which we validated in vivo in fetal human brain. Using weighted gene co-expression network analysis, we visualized the network structure of transcriptional regulation, demonstrating via this unbiased analysis that a significant number of ASD candidate genes are coordinately regulated during the differentiation process. As NHNPs are genetically tractable and manipulable, they can be used to study both the effects of mutations in multiple ASD candidate genes on neuronal differentiation and gene expression in combination with the effects of potential therapeutic molecules. These data also provide a step towards better understanding of the signaling pathways disrupted in ASD.

  10. Evaluation of Electroencephalography Source Localization Algorithms with Multiple Cortical Sources.

    PubMed

    Bradley, Allison; Yao, Jun; Dewald, Jules; Richter, Claus-Peter

    2016-01-01

    Source localization algorithms often show multiple active cortical areas as the source of electroencephalography (EEG). Yet, there is little data quantifying the accuracy of these results. In this paper, the performance of current source density source localization algorithms for the detection of multiple cortical sources of EEG data has been characterized. EEG data were generated by simulating multiple cortical sources (2-4) with the same strength or two sources with relative strength ratios of 1:1 to 4:1, and adding noise. These data were used to reconstruct the cortical sources using current source density (CSD) algorithms: sLORETA, MNLS, and LORETA using a p-norm with p equal to 1, 1.5 and 2. Precision (percentage of the reconstructed activity corresponding to simulated activity) and Recall (percentage of the simulated sources reconstructed) of each of the CSD algorithms were calculated. While sLORETA has the best performance when only one source is present, when two or more sources are present LORETA with p equal to 1.5 performs better. When the relative strength of one of the sources is decreased, all algorithms have more difficulty reconstructing that source. However, LORETA 1.5 continues to outperform other algorithms. If only the strongest source is of interest sLORETA is recommended, while LORETA with p equal to 1.5 is recommended if two or more of the cortical sources are of interest. These results provide guidance for choosing a CSD algorithm to locate multiple cortical sources of EEG and for interpreting the results of these algorithms.
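    The Precision and Recall measures defined in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each simulated source and the reconstructed activity are represented as sets of cortical vertex indices and that a source counts as reconstructed when any of its vertices is active; the function name and this representation are hypothetical and are not taken from the paper.

```python
def precision_recall(simulated_sources, reconstructed):
    """Return (precision, recall) as percentages.

    simulated_sources: list of sets of vertex ids, one set per simulated source
                       (hypothetical representation).
    reconstructed:     set of vertex ids the algorithm marks as active.
    """
    simulated_all = set().union(*simulated_sources)
    # Precision: share of reconstructed activity that lies on simulated activity.
    if reconstructed:
        precision = 100.0 * len(reconstructed & simulated_all) / len(reconstructed)
    else:
        precision = 0.0
    # Recall: share of simulated sources hit by at least one reconstructed vertex.
    recovered = sum(1 for source in simulated_sources if source & reconstructed)
    recall = 100.0 * recovered / len(simulated_sources)
    return precision, recall


sources = [{1, 2, 3}, {10, 11}]   # two simulated sources
active = {2, 3, 4, 5}             # reconstructed activity
precision, recall = precision_recall(sources, active)
print(precision, recall)
```

    In this toy case half of the reconstructed vertices fall on simulated activity and one of the two sources is recovered, so both measures are 50%.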

  11. Evaluation of Electroencephalography Source Localization Algorithms with Multiple Cortical Sources

    PubMed Central

    Bradley, Allison; Yao, Jun; Dewald, Jules; Richter, Claus-Peter

    2016-01-01

    Background: Source localization algorithms often show multiple active cortical areas as the source of electroencephalography (EEG). Yet, there is little data quantifying the accuracy of these results. In this paper, the performance of current source density source localization algorithms for the detection of multiple cortical sources of EEG data has been characterized. Methods: EEG data were generated by simulating multiple cortical sources (2–4) with the same strength or two sources with relative strength ratios of 1:1 to 4:1, and adding noise. These data were used to reconstruct the cortical sources using current source density (CSD) algorithms: sLORETA, MNLS, and LORETA using a p-norm with p equal to 1, 1.5 and 2. Precision (percentage of the reconstructed activity corresponding to simulated activity) and Recall (percentage of the simulated sources reconstructed) of each of the CSD algorithms were calculated. Results: While sLORETA has the best performance when only one source is present, when two or more sources are present LORETA with p equal to 1.5 performs better. When the relative strength of one of the sources is decreased, all algorithms have more difficulty reconstructing that source. However, LORETA 1.5 continues to outperform other algorithms. If only the strongest source is of interest sLORETA is recommended, while LORETA with p equal to 1.5 is recommended if two or more of the cortical sources are of interest. These results provide guidance for choosing a CSD algorithm to locate multiple cortical sources of EEG and for interpreting the results of these algorithms. PMID:26809000

  12. Multiple Sources of Prescription Payment and Risky Opioid Therapy Among Veterans.

    PubMed

    Becker, William C; Fenton, Brenda T; Brandt, Cynthia A; Doyle, Erin L; Francis, Joseph; Goulet, Joseph L; Moore, Brent A; Torrise, Virginia; Kerns, Robert D; Kreiner, Peter W

    2017-07-01

    Opioid overdose and other related harms are a major source of morbidity and mortality among US Veterans, in part due to high-risk opioid prescribing. We sought to determine whether having multiple sources of payment for opioids, as a marker for out-of-system access, is associated with risky opioid therapy among veterans. This was a cross-sectional study examining the association between multiple sources of payment and risky opioid therapy among all individuals with Veterans Health Administration (VHA) payment for opioid analgesic prescriptions in Kentucky during fiscal year 2014-2015. The source of payment categories were: (1) VHA as the only source of payment (sole source); (2) VHA and at least 1 cash payment [VHA+cash payment(s)], whether or not there was a third source of payment; and (3) VHA and at least one other noncash source: Medicare, Medicaid, or private insurance [VHA+noncash source(s)]. Our outcomes were 2 risky opioid therapies: combination opioid/benzodiazepine therapy and high-dose opioid therapy, defined as morphine equivalent daily dose ≥90 mg. Of the 14,795 individuals in the analytic sample, 81.9% were in the sole source category, 6.6% in the VHA+cash payment(s) category, and 11.5% in the VHA+noncash source(s) category. In logistic regression, controlling for age and sex, persons with multiple payment sources had significantly higher odds of each risky opioid therapy, with those in the VHA+cash payment(s) group having significantly higher odds than those in the VHA+noncash source(s) group. Prescribers should examine the prescription monitoring program, as multiple payment sources increase the odds of risky opioid therapy.
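    The three payment categories and the high-dose threshold described in this abstract can be sketched as a small classification routine. The payer labels ("VHA", "cash", and so on), the function names, and the input format are assumptions made for illustration; the study's actual coding scheme is not given in the abstract.

```python
HIGH_DOSE_THRESHOLD_MG = 90  # morphine equivalent daily dose cutoff stated in the abstract


def payment_category(payers):
    """Classify a person by the set of payers seen on their opioid prescriptions.

    payers: set of hypothetical labels such as "VHA", "cash", "Medicare",
            "Medicaid", "private". The study sample requires a VHA payment.
    """
    if "VHA" not in payers:
        raise ValueError("the analytic sample only includes people with VHA payment")
    others = payers - {"VHA"}
    if not others:
        return "sole source"
    if "cash" in others:
        # Cash takes priority whether or not a third payer is also present.
        return "VHA+cash payment(s)"
    return "VHA+noncash source(s)"


def is_high_dose(daily_dose_mg):
    """True when the morphine equivalent daily dose is at or above 90 mg."""
    return daily_dose_mg >= HIGH_DOSE_THRESHOLD_MG


print(payment_category({"VHA"}))
print(payment_category({"VHA", "cash", "Medicare"}))
print(payment_category({"VHA", "Medicaid"}))
print(is_high_dose(90))
```

    Checking for cash before the noncash payers mirrors the abstract's wording that category (2) applies "whether or not there was a third source of payment".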

  13. Circadian Enhancers Coordinate Multiple Phases of Rhythmic Gene Transcription In Vivo

    PubMed Central

    Fang, Bin; Everett, Logan J.; Jager, Jennifer; Briggs, Erika; Armour, Sean M.; Feng, Dan; Roy, Ankur; Gerhart-Hines, Zachary; Sun, Zheng; Lazar, Mitchell A.

    2014-01-01

    Summary: Mammalian transcriptomes display complex circadian rhythms with multiple phases of gene expression that cannot be accounted for by current models of the molecular clock. We have determined the underlying mechanisms by measuring nascent RNA transcription around the clock in mouse liver. Unbiased examination of eRNAs that cluster in specific circadian phases identified functional enhancers driven by distinct transcription factors (TFs). We further identify on a global scale the components of the TF cistromes that function to orchestrate circadian gene expression. Integrated genomic analyses also revealed novel mechanisms by which a single circadian factor controls opposing transcriptional phases. These findings shed new light on the diversity and specificity of TF function in the generation of multiple phases of circadian gene transcription in a mammalian organ. PMID:25416951

  14. Association of Common Mitochondrial DNA Variants with Multiple Sclerosis and Systemic Lupus Erythematosus

    PubMed Central

    Vyshkina, Tamara; Sylvester, Andrew; Sadiq, Saud; Bonilla, Eduardo; Canter, Jeff A.; Perl, Andras; Kalman, Bernadette

    2008-01-01

    Mitochondrial dysfunction has been implicated in the pathogenesis of multiple sclerosis (MS) and systemic lupus erythematosus (SLE). This study re-investigates the roles of previously suggested candidate genes of energy metabolism (Complex I genes located in the nucleus and in the mitochondria) in patients with MS relative to ethnically matched SLE patients and healthy controls. After stringent correction for multiple testing, we reproduce the association of the mitochondrial (mt)DNA haplotype K* with MS, but reject the importance of previously suggested borderline associations with nuclear genes of Complex I. In addition, we detect the association of common variants of the mitochondrial ND2 and ATP6 genes with both MS and SLE, which raises the possibility of a shared mitochondrial genetic background of these two autoimmune diseases. PMID:18708297

  15. Circadian enhancers coordinate multiple phases of rhythmic gene transcription in vivo.

    PubMed

    Fang, Bin; Everett, Logan J; Jager, Jennifer; Briggs, Erika; Armour, Sean M; Feng, Dan; Roy, Ankur; Gerhart-Hines, Zachary; Sun, Zheng; Lazar, Mitchell A

    2014-11-20

    Mammalian transcriptomes display complex circadian rhythms with multiple phases of gene expression that cannot be accounted for by current models of the molecular clock. We have determined the underlying mechanisms by measuring nascent RNA transcription around the clock in mouse liver. Unbiased examination of enhancer RNAs (eRNAs) that cluster in specific circadian phases identified functional enhancers driven by distinct transcription factors (TFs). We further identify on a global scale the components of the TF cistromes that function to orchestrate circadian gene expression. Integrated genomic analyses also revealed mechanisms by which a single circadian factor controls opposing transcriptional phases. These findings shed light on the diversity and specificity of TF function in the generation of multiple phases of circadian gene transcription in a mammalian organ.

  16. Gene Ontology annotations at SGD: new data sources and annotation methods

    PubMed Central

    Hong, Eurie L.; Balakrishnan, Rama; Dong, Qing; Christie, Karen R.; Park, Julie; Binkley, Gail; Costanzo, Maria C.; Dwight, Selina S.; Engel, Stacia R.; Fisk, Dianna G.; Hirschman, Jodi E.; Hitz, Benjamin C.; Krieger, Cynthia J.; Livstone, Michael S.; Miyasato, Stuart R.; Nash, Robert S.; Oughtred, Rose; Skrzypek, Marek S.; Weng, Shuai; Wong, Edith D.; Zhu, Kathy K.; Dolinski, Kara; Botstein, David; Cherry, J. Michael

    2008-01-01

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) collects and organizes biological information about the chromosomal features and gene products of the budding yeast Saccharomyces cerevisiae. Although published data from traditional experimental methods are the primary sources of evidence supporting Gene Ontology (GO) annotations for a gene product, high-throughput experiments and computational predictions can also provide valuable insights in the absence of an extensive body of literature. Therefore, GO annotations available at SGD now include high-throughput data as well as computational predictions provided by the GO Annotation Project (GOA UniProt; http://www.ebi.ac.uk/GOA/). Because the annotation method used to assign GO annotations varies by data source, GO resources at SGD have been modified to distinguish data sources and annotation methods. In addition to providing information for genes that have not been experimentally characterized, GO annotations from independent sources can be compared to those made by SGD to help keep the literature-based GO annotations current. PMID:17982175

  17. Multiple-input multiple-output causal strategies for gene selection.

    PubMed

    Bontempi, Gianluca; Haibe-Kains, Benjamin; Desmedt, Christine; Sotiriou, Christos; Quackenbush, John

    2011-11-25

    Traditional strategies for selecting variables in high dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. Although these techniques may be effective in terms of generalization accuracy, they often do not reveal direct causes. This is essentially because high correlation (or relevance) does not imply causation. In this study, we show how to efficiently incorporate causal information into gene selection by moving from a single-input single-output to a multiple-input multiple-output setting. We show in a synthetic case study that a better prioritization of causal variables can be obtained by considering a relevance score which incorporates a causal term. In addition, we show, in a meta-analysis study of six publicly available breast cancer microarray datasets, that the improvement also occurs in terms of accuracy. The biological interpretation of the results confirms the potential of a causal approach to gene selection. Integrating causal information into gene selection algorithms is effective both in terms of prediction accuracy and biological interpretation.

  18. Identifying candidate genes for Type 2 Diabetes Mellitus and obesity through gene expression profiling in multiple tissues or cells.

    PubMed

    Chen, Junhui; Meng, Yuhuan; Zhou, Jinghui; Zhuo, Min; Ling, Fei; Zhang, Yu; Du, Hongli; Wang, Xiaoning

    2013-01-01

    Type 2 Diabetes Mellitus (T2DM) and obesity have become increasingly prevalent in recent years. Recent studies have focused on identifying causal variations or candidate genes for obesity and T2DM via analysis of expression quantitative trait loci (eQTL) within a single tissue. T2DM and obesity are affected by comprehensive sets of genes in multiple tissues. In the current study, gene expression levels in multiple human tissues from GEO datasets were analyzed, and 21 candidate genes displaying high percentages of differential expression were filtered out. Specifically, DENND1B, LYN, MRPL30, POC1B, PRKCB, RP4-655J12.3, HIBADH, and TMBIM4 were identified from the T2DM-control study, and BCAT1, BMP2K, CSRNP2, MYNN, NCKAP5L, SAP30BP, SLC35B4, SP1, BAP1, GRB14, HSP90AB1, ITGA5, and TOMM5 were identified from the obesity-control study. The majority of these genes are known to be involved in T2DM and obesity. Therefore, analysis of gene expression in various tissues using GEO datasets may be an effective and feasible method to determine novel or causal genes associated with T2DM and obesity.

  19. The autographa californica multiple nucleopolyhedrovirus ODV-E56 envelope protein is required for oral infectivity and can be functionally substituted by rachiplusia ou multiple nucleopolyhedrovirus ODV-E56

    USDA-ARS's Scientific Manuscript database

    The Autographa californica multiple nucleopolyhedrovirus (AcMNPV) odv-e56 gene encodes an occlusion-derived virus (ODV)-specific envelope protein, ODV-E56. In a previous analysis, the odv-e56 gene was found to be under positive selection pressure, suggesting that it may be a determinant of viral ho...

  20. Frequency-dependent and correlational selection pressures have conflicting consequences for assortative mating in a color-polymorphic lizard, Uta stansburiana.

    PubMed

    Lancaster, Lesley T; McAdam, Andrew G; Hipsley, Christy A; Sinervo, Barry R

    2014-08-01

    Genetically determined polymorphisms incorporating multiple traits can persist in nature under chronic, fluctuating, and sometimes conflicting selection pressures. Balancing selection among morphs preserves equilibrium frequencies, while correlational selection maintains favorable trait combinations within each morph. Under negative frequency-dependent selection, females should mate (often disassortatively) with rare male morphotypes to produce conditionally fit offspring. Conversely, under correlational selection, females should mate assortatively to preserve coadapted gene complexes and avoid ontogenetic conflict. Using controlled breeding designs, we evaluated consequences of assortative mating patterns in color-polymorphic side-blotched lizards (Uta stansburiana), to identify conflict between these sources of selection. Females who mated disassortatively, and to conditionally high-quality males in the context of frequency-dependent selection, experienced the highest fertility rates. In contrast, assortatively mated females experienced higher fetal viability rates. The trade-off between fertility and egg viability resulted in no overall fitness benefit to either assortative or disassortative mating patterns. These results suggest that ongoing conflict between correlational and frequency-dependent selection in polymorphic populations may generate a trade-off between rare-morph advantage and phenotypic integration and between assortative and disassortative mating decisions. More generally, interactions among multiple sources of diversity-promoting selection can alter adaptations and dynamics predicted to arise under any of these regimes alone.

  1. MPHASYS: a mouse phenotype analysis system

    PubMed Central

    Calder, R Brent; Beems, Rudolf B; van Steeg, Harry; Mian, I Saira; Lohman, Paul HM; Vijg, Jan

    2007-01-01

    Background: Systematic, high-throughput studies of mouse phenotypes have been hampered by the inability to analyze individual animal data from a multitude of sources in an integrated manner. Studies generally make comparisons at the level of genotype or treatment thereby excluding associations that may be subtle or involve compound phenotypes. Additionally, the lack of integrated, standardized ontologies and methodologies for data exchange has inhibited scientific collaboration and discovery. Results: Here we introduce a Mouse Phenotype Analysis System (MPHASYS), a platform for integrating data generated by studies of mouse models of human biology and disease such as aging and cancer. This computational platform is designed to provide a standardized methodology for working with animal data; a framework for data entry, analysis and sharing; and ontologies and methodologies for ensuring accurate data capture. We describe the tools that currently comprise MPHASYS, primarily ones related to mouse pathology, and outline its use in a study of individual animal-specific patterns of multiple pathology in mice harboring a specific germline mutation in the DNA repair and transcription-specific gene Xpd. Conclusion: MPHASYS is a system for analyzing multiple data types from individual animals. It provides a framework for developing data analysis applications, and tools for collecting and distributing high-quality data. The software is platform independent and freely available under an open-source license [1]. PMID:17553167

  2. Environmental and Genetic Determinants of Colony Morphology in Yeast

    PubMed Central

    Granek, Joshua A.; Magwene, Paul M.

    2010-01-01

    Nutrient stresses trigger a variety of developmental switches in the budding yeast Saccharomyces cerevisiae. One of the least understood of such responses is the development of complex colony morphology, characterized by intricate, organized, and strain-specific patterns of colony growth and architecture. The genetic bases of this phenotype and the key environmental signals involved in its induction have heretofore remained poorly understood. By surveying multiple strain backgrounds and a large number of growth conditions, we show that limitation for fermentable carbon sources coupled with a rich nitrogen source is the primary trigger for the colony morphology response in budding yeast. Using knockout mutants and transposon-mediated mutagenesis, we demonstrate that two key signaling networks regulating this response are the filamentous growth MAP kinase cascade and the Ras-cAMP-PKA pathway. We further show synergistic epistasis between Rim15, a kinase involved in integration of nutrient signals, and other genes in these pathways. Ploidy, mating-type, and genotype-by-environment interactions also appear to play a role in controlling colony morphology. Our study highlights the high degree of network reuse in this model eukaryote; yeast use the same core signaling pathways in multiple contexts to integrate information about environmental and physiological states and generate diverse developmental outputs. PMID:20107600

  3. MMSET deregulation affects cell cycle progression and adhesion regulons in t(4;14) myeloma plasma cells

    PubMed Central

    Brito, Jose L.R.; Walker, Brian; Jenner, Matthew; Dickens, Nicholas J.; Brown, Nicola J.M.; Ross, Fiona M.; Avramidou, Athanasia; Irving, Julie A.E.; Gonzalez, David; Davies, Faith E.; Morgan, Gareth J.

    2009-01-01

    Background: The recurrent immunoglobulin translocation, t(4;14)(p16;q32), occurs in 15% of multiple myeloma patients and is associated with poor prognosis, through an unknown mechanism. The t(4;14) up-regulates fibroblast growth factor receptor 3 (FGFR3) and multiple myeloma SET domain (MMSET) genes. The involvement of MMSET in the pathogenesis of t(4;14) multiple myeloma and the mechanism or genes deregulated by MMSET upregulation are still unclear. Design and Methods: The expression of MMSET was analyzed using a novel antibody. The involvement of MMSET in t(4;14) myelomagenesis was assessed by small interfering RNA mediated knockdown combined with several biological assays. In addition, the differential gene expression induced by MMSET knockdown was analyzed with expression microarrays. MMSET gene targets in primary patient material were analyzed by expression microarrays. Results: We found that MMSET isoforms are expressed in multiple myeloma cell lines, being exclusively up-regulated in t(4;14)-positive cells. Suppression of MMSET expression affected cell proliferation by decreasing both cell viability and cell cycle progression of cells with the t(4;14) translocation. These findings were associated with reduced expression of genes involved in the regulation of cell cycle progression (e.g. CCND2, CCNG1, BRCA1, AURKA and CHEK1), apoptosis (CASP1, CASP4 and FOXO3A) and cell adhesion (ADAM9 and DSG2). Furthermore, we identified genes involved in the latter processes that were differentially expressed in t(4;14) multiple myeloma patient samples. Conclusions: Dysregulation of MMSET affects the expression of several genes involved in the regulation of cell cycle progression, cell adhesion and survival. PMID:19059936

  4. An evaluation of talker localization based on direction of arrival estimation and statistical sound source identification

    NASA Astrophysics Data System (ADS)

    Nishiura, Takanobu; Nakamura, Satoshi

    2002-11-01

    It is very important to capture distant-talking speech with high quality for a hands-free speech interface. A microphone array is an ideal candidate for this purpose. However, this approach requires localizing the target talker. Conventional talker localization algorithms in multiple sound source environments not only have difficulty localizing the multiple sound sources accurately, but also have difficulty localizing the target talker among known multiple sound source positions. To cope with these problems, we propose a new talker localization algorithm consisting of two algorithms. One is a DOA (direction of arrival) estimation algorithm for multiple sound source localization based on the CSP (cross-power spectrum phase) coefficient addition method. The other is a statistical sound source identification algorithm based on a GMM (Gaussian mixture model) for localizing the target talker position among localized multiple sound sources. In this paper, we particularly focus on the talker localization performance based on the combination of these two algorithms with a microphone array. We conducted evaluation experiments in real noisy reverberant environments. As a result, we confirmed that multiple sound signals can be classified accurately as "speech" or "non-speech" by the proposed algorithm. [Work supported by ATR, and MEXT of Japan.]
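    As background for the CSP step mentioned in this abstract, the following is a minimal sketch of cross-power spectrum phase analysis for a single microphone pair: the cross-spectrum is normalized to unit magnitude so that only phase remains, and the peak of its inverse transform gives the inter-microphone delay. It does not implement the authors' coefficient addition method (which combines coefficients across pairs) or the GMM stage. The function names, the naive DFT, the circularly shifted test signal, and the far-field geometry with a 343 m/s sound speed are all assumptions made for illustration.

```python
import cmath
import math
import random


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def csp_delay(x1, x2):
    """Estimate by how many samples x2 lags x1 from the CSP peak."""
    n = len(x1)
    spec1, spec2 = dft(x1), dft(x2)
    whitened = []
    for a, b in zip(spec1, spec2):
        cross = b * a.conjugate()
        mag = abs(cross)
        # Keep only the phase of the cross-power spectrum.
        whitened.append(cross / mag if mag > 1e-12 else 0j)
    csp = [c.real for c in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda i: csp[i])
    # Map circular lags above n/2 to negative delays.
    return peak if peak <= n // 2 else peak - n


def delay_to_angle_deg(delay_samples, sample_rate_hz, mic_spacing_m,
                       speed_of_sound=343.0):
    """Far-field arrival angle for one microphone pair (assumed geometry)."""
    ratio = delay_samples * speed_of_sound / (sample_rate_hz * mic_spacing_m)
    return math.degrees(math.asin(max(-1.0, min(1.0, ratio))))


rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(32)]
delayed = reference[-3:] + reference[:-3]  # circular delay of 3 samples
estimated_delay = csp_delay(reference, delayed)
angle = delay_to_angle_deg(estimated_delay, 16000, 0.1)
print(estimated_delay, angle)
```

    Because the magnitude is discarded, the correlation peak stays sharp for broadband signals, which is the usual motivation for phase-only weighting in delay estimation.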

  5. A BAC-bacterial recombination method to generate physically linked multiple gene reporter DNA constructs.

    PubMed

    Maye, Peter; Stover, Mary Louise; Liu, Yaling; Rowe, David W; Gong, Shiaochin; Lichtler, Alexander C

    2009-03-13

    Reporter gene mice are valuable animal models for biological research providing a gene expression readout that can contribute to cellular characterization within the context of a developmental process. With the advancement of bacterial recombination techniques to engineer reporter gene constructs from BAC genomic clones and the generation of optically distinguishable fluorescent protein reporter genes, there is an unprecedented capability to engineer more informative transgenic reporter mouse models relative to what has been traditionally available. We demonstrate here our first effort to develop a three-stage bacterial recombination strategy to physically link multiple genes together with their respective fluorescent protein (FP) reporters in one DNA fragment. This strategy uses bacterial recombination techniques to: (1) subclone genes of interest into BAC linking vectors, (2) insert desired reporter genes into respective genes and (3) link different gene-reporters together. As proof of concept, we have generated a single DNA fragment containing the genes Trap, Dmp1, and Ibsp driving the expression of ECFP, mCherry, and Topaz FP reporter genes, respectively. Using this DNA construct, we have successfully generated transgenic reporter mice that retain two to three gene readouts. The three-stage methodology to link multiple genes with their respective fluorescent protein reporter works with reasonable efficiency. Moreover, gene linkage allows for their common chromosomal integration into a single locus. However, the testing of this multi-reporter DNA construct by transgenesis does suggest that the linkage of two different genes together, despite their large size, can still create a positional effect. We believe that gene choice, genomic DNA fragment size and the presence of endogenous insulator elements are critical variables.

  6. Non-syndromic odontogenic keratocysts: A rare case report

    PubMed Central

    Kurdekar, Raghavendra S.; Prakash, Jeevan; Rana, A. S.; Kalra, Puneet

    2013-01-01

    Odontogenic keratocysts are very well documented in the literature. Multiple odontogenic keratocysts (OKCs) are one of the most frequent features of nevoid basal cell carcinoma syndrome (NBCCS). It is linked with mutation in the PTCH gene (human homolog of the Drosophila segment polarity gene “patched”). Partial expression of the gene may result in the occurrence of only multiple recurring OKCs without any associated systemic findings. A rare case of multiple odontogenic keratocysts unassociated with any syndrome is reported, so as to add to the growing number of such cases in the literature. The possibility of this case being a partial expression of the Gorlin-Goltz syndrome is discussed. PMID:24163561

  7. Impact of recombination on polymorphism of genes encoding Kunitz-type protease inhibitors in the genus Solanum.

    PubMed

    Speranskaya, Anna S; Krinitsina, Anastasia A; Kudryavtseva, Anna V; Poltronieri, Palmiro; Santino, Angelo; Oparina, Nina Y; Dmitriev, Alexey A; Belenikin, Maxim S; Guseva, Marina A; Shevelev, Alexei B

    2012-08-01

    The group of Kunitz-type protease inhibitors (KPI) from potato is encoded by a polymorphic family of multiple allelic and non-allelic genes. The previous explanations of the KPI variability were based on the hypothesis of random mutagenesis as a key factor of KPI polymorphism. KPI-A genes from the genomes of Solanum tuberosum cv. Istrinskii and the wild species Solanum palustre were amplified by PCR with subsequent cloning in plasmids. True KPI sequences were derived from comparison of the cloned copies. "Hot spots" of recombination in KPI genes were independently identified by DnaSP 4.0 and TOPALi v2.5 software. The KPI-A sequence from potato cv. Istrinskii was found to be 100% identical to the gene from Solanum nigrum. This fact illustrates a high degree of similarity of KPI genes in the genus Solanum. Pairwise comparison of KPI A and B genes unambiguously showed a non-uniform extent of polymorphism at different nt positions. Moreover, the occurrence of substitutions was not random along the strand. Taken together, these facts contradict the traditional hypothesis of random mutagenesis as a principal source of KPI gene polymorphism. The experimentally found mosaic structure of KPI genes in both plants studied is consistent with the hypothesis suggesting recombination of ancestral genes. The same mechanism was proposed earlier for other resistance-conferring genes in the nightshade family (Solanaceae). Based on the data obtained, we searched for potential motifs of site-specific binding with plant DNA recombinases. During this work, we analyzed the sequencing data reported by the Potato Genome Sequencing Consortium (PGSC), 2011, and found considerable inconsistency in their data concerning the number, location, and orientation of KPI genes of groups A and B. The key role of recombination rather than random point mutagenesis in KPI polymorphism was demonstrated for the first time. Copyright © 2012 Elsevier Masson SAS. All rights reserved.

  8. Horizontal acquisition of multiple mitochondrial genes from a parasitic plant followed by gene conversion with host mitochondrial genes

    PubMed Central

    2010-01-01

    Background Horizontal gene transfer (HGT) is relatively common in plant mitochondrial genomes but the mechanisms, extent and consequences of transfer remain largely unknown. Previous results indicate that parasitic plants are often involved as either transfer donors or recipients, suggesting that direct contact between parasite and host facilitates genetic transfer among plants. Results In order to uncover the mechanistic details of plant-to-plant HGT, the extent and evolutionary fate of transfer was investigated between two groups: the parasitic genus Cuscuta and a small clade of Plantago species. A broad polymerase chain reaction (PCR) survey of mitochondrial genes revealed that at least three genes (atp1, atp6 and matR) were recently transferred from Cuscuta to Plantago. Quantitative PCR assays show that these three genes have a mitochondrial location in the one species line of Plantago examined. Patterns of sequence evolution suggest that these foreign genes degraded into pseudogenes shortly after transfer and reverse transcription (RT)-PCR analyses demonstrate that none are detectably transcribed. Three cases of gene conversion were detected between native and foreign copies of the atp1 gene. The identical phylogenetic distribution of the three foreign genes within Plantago and the retention of cytidines at ancestral positions of RNA editing indicate that these genes were probably acquired via a single, DNA-mediated transfer event. However, samplings of multiple individuals from two of the three species in the recipient Plantago clade revealed complex and perplexing phylogenetic discrepancies and patterns of sequence divergence for all three of the foreign genes. Conclusions This study reports the best evidence to date that multiple mitochondrial genes can be transferred via a single HGT event and that transfer occurred via a strictly DNA-level intermediate. The discovery of gene conversion between co-resident foreign and native mitochondrial copies suggests that transferred genes may be evolutionarily important in generating mitochondrial genetic diversity. Finally, the complex relationships within each lineage of transferred genes imply a surprisingly complicated history of these genes in Plantago subsequent to their acquisition via HGT and this history probably involves some combination of additional transfers (including intracellular transfer), gene duplication, differential loss and mutation-rate variation. Unravelling this history will probably require sequencing multiple mitochondrial and nuclear genomes from Plantago. See Commentary: http://www.biomedcentral.com/1741-7007/8/147. PMID:21176201

  9. Human Health Risk Implications of Multiple Sources of Faecal Indicator Bacteria in a Recreational Waterbody

    EPA Science Inventory

    We evaluate the influence of multiple sources of faecal indicator bacteria in recreational water bodies on potential human health risk by considering waters impacted by human and animal sources, human and non-pathogenic sources, and animal and non-pathogenic sources. We illustrat...

  10. Incomplete Multisource Transfer Learning.

    PubMed

    Ding, Zhengming; Shao, Ming; Fu, Yun

    2018-02-01

    Transfer learning is generally exploited to adapt well-established source knowledge for learning tasks in weakly labeled or unlabeled target domain. Nowadays, it is common to see multiple sources available for knowledge transfer, each of which, however, may not include complete classes information of the target domain. Naively merging multiple sources together would lead to inferior results due to the large divergence among multiple sources. In this paper, we attempt to utilize incomplete multiple sources for effective knowledge transfer to facilitate the learning task in target domain. To this end, we propose an incomplete multisource transfer learning through two directional knowledge transfer, i.e., cross-domain transfer from each source to target, and cross-source transfer. In particular, in cross-domain direction, we deploy latent low-rank transfer learning guided by iterative structure learning to transfer knowledge from each single source to target domain. This practice reinforces to compensate for any missing data in each source by the complete target data. While in cross-source direction, unsupervised manifold regularizer and effective multisource alignment are explored to jointly compensate for missing data from one portion of source to another. In this way, both marginal and conditional distribution discrepancy in two directions would be mitigated. Experimental results on standard cross-domain benchmarks and synthetic data sets demonstrate the effectiveness of our proposed model in knowledge transfer from incomplete multiple sources.

  11. Small brown planthopper resistance loci in wild rice (Oryza officinalis).

    PubMed

    Zhang, Weilin; Dong, Yan; Yang, Ling; Ma, Bojun; Ma, Rongrong; Huang, Fudeng; Wang, Changchun; Hu, Haitao; Li, Chunshou; Yan, Chengqi; Chen, Jianping

    2014-06-01

    Host-plant resistance is the most practical and economical approach to control the rice planthoppers. However, to date, few rice germplasm accessions resistant to all three kinds of planthoppers, (1) the brown planthopper (BPH; Nilaparvata lugens Stål), (2) the small brown planthopper (SBPH; Laodelphax striatellus Fallen), and (3) the whitebacked planthopper (WBPH; Sogatella furcifera Horvath), have been identified; consequently, the genetic basis for host-plant broad spectrum resistance to rice planthoppers in a single variety has seldom been studied. Here, one wild species, Oryza officinalis (Acc. HY018, 2n = 24, CC), was found to show resistance to all three kinds of planthoppers. Because resistance to WBPH and BPH in O. officinalis has previously been reported, the study mainly focused on its SBPH resistance. The SBPH resistance gene(s) was (were) introduced into cultivated rice via asymmetric somatic hybridization. Three QTLs for SBPH resistance detected by the SSST method were mapped and confirmed on chromosomes 3, 7, and 12, respectively. The allelic/non-allelic relationship and relative map positions of the three kinds of planthopper resistance genes in O. officinalis show that the SBPH, WBPH, and BPH resistance genes in O. officinalis were governed by multiple genes, but not by any major gene. The data on the genetics of host-plant broad spectrum resistance to planthoppers in a single accession suggested that the most ideally practical and economical approach for rice breeders is to screen the sources of broad spectrum resistance to planthoppers, but not to employ broad spectrum resistance gene for the management of planthoppers. Pyramiding these genes in a variety can be an effective way for the management of planthoppers.

  12. Alternatives to vitamin B1 uptake revealed with discovery of riboswitches in multiple marine eukaryotic lineages

    DOE PAGES

    McRose, Darcy; Guo, Jian; Monier, Adam; ...

    2014-08-29

    Here, vitamin B1 (thiamine pyrophosphate, TPP) is essential to all life but scarce in ocean surface waters. In many bacteria and a few eukaryotic groups thiamine biosynthesis genes are controlled by metabolite-sensing mRNA-based gene regulators known as riboswitches. Using available genome sequences and transcriptomes generated from ecologically important marine phytoplankton, we identified 31 new eukaryotic riboswitches. These were found in alveolate, cryptophyte, haptophyte and rhizarian phytoplankton as well as taxa from two lineages previously known to have riboswitches (green algae and stramenopiles). The predicted secondary structures bear hallmarks of TPP-sensing riboswitches. Surprisingly, most of the identified riboswitches are affiliated with genes of unknown function, rather than characterized thiamine biosynthesis genes. Using qPCR and growth experiments involving two prasinophyte algae, we show that expression of these genes increases significantly under vitamin B1-deplete conditions relative to controls. Pathway analyses show that several algae harboring the uncharacterized genes lack one or more enzymes in the known TPP biosynthesis pathway. We demonstrate that one such alga, the major primary producer Emiliania huxleyi, grows on 4-amino-5-hydroxymethyl-2-methylpyrimidine (a thiamine precursor moiety) alone, although long thought dependent on exogenous sources of thiamine. Thus, overall, we have identified riboswitches in major eukaryotic lineages not known to undergo this form of gene regulation. In these phytoplankton groups, riboswitches are often affiliated with widespread thiamine-responsive genes with as yet uncertain roles in TPP pathways. Further, taxa with ‘incomplete’ TPP biosynthesis pathways do not necessarily require exogenous vitamin B1, making vitamin control of phytoplankton blooms more complex than the current paradigm suggests.

  13. Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining

    PubMed Central

    Kreula, Sanna M.; Kaewphan, Suwisa; Ginter, Filip

    2018-01-01

    The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from ‘reading the literature’. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already ‘known’, and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource. PMID:29844966

  14. Alternatives to vitamin B1 uptake revealed with discovery of riboswitches in multiple marine eukaryotic lineages

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McRose, Darcy; Guo, Jian; Monier, Adam

    Here, vitamin B1 (thiamine pyrophosphate, TPP) is essential to all life but scarce in ocean surface waters. In many bacteria and a few eukaryotic groups thiamine biosynthesis genes are controlled by metabolite-sensing mRNA-based gene regulators known as riboswitches. Using available genome sequences and transcriptomes generated from ecologically important marine phytoplankton, we identified 31 new eukaryotic riboswitches. These were found in alveolate, cryptophyte, haptophyte and rhizarian phytoplankton as well as taxa from two lineages previously known to have riboswitches (green algae and stramenopiles). The predicted secondary structures bear hallmarks of TPP-sensing riboswitches. Surprisingly, most of the identified riboswitches are affiliated with genes of unknown function, rather than characterized thiamine biosynthesis genes. Using qPCR and growth experiments involving two prasinophyte algae, we show that expression of these genes increases significantly under vitamin B1-deplete conditions relative to controls. Pathway analyses show that several algae harboring the uncharacterized genes lack one or more enzymes in the known TPP biosynthesis pathway. We demonstrate that one such alga, the major primary producer Emiliania huxleyi, grows on 4-amino-5-hydroxymethyl-2-methylpyrimidine (a thiamine precursor moiety) alone, although long thought dependent on exogenous sources of thiamine. Thus, overall, we have identified riboswitches in major eukaryotic lineages not known to undergo this form of gene regulation. In these phytoplankton groups, riboswitches are often affiliated with widespread thiamine-responsive genes with as yet uncertain roles in TPP pathways. Further, taxa with ‘incomplete’ TPP biosynthesis pathways do not necessarily require exogenous vitamin B1, making vitamin control of phytoplankton blooms more complex than the current paradigm suggests.

  15. Multi-step formation, evolution, and functionalization of new cytoplasmic male sterility genes in the plant mitochondrial genomes

    PubMed Central

    Tang, Huiwu; Zheng, Xingmei; Li, Chuliang; Xie, Xianrong; Chen, Yuanling; Chen, Letian; Zhao, Xiucai; Zheng, Huiqi; Zhou, Jiajian; Ye, Shan; Guo, Jingxin; Liu, Yao-Guang

    2017-01-01

    New gene origination is a major source of genomic innovations that confer phenotypic changes and biological diversity. Generation of new mitochondrial genes in plants may cause cytoplasmic male sterility (CMS), which can promote outcrossing and increase fitness. However, how mitochondrial genes originate and evolve in structure and function remains unclear. The rice Wild Abortive type of CMS is conferred by the mitochondrial gene WA352c (previously named WA352) and has been widely exploited in hybrid rice breeding. Here, we reconstruct the evolutionary trajectory of WA352c by the identification and analyses of 11 mitochondrial genomic recombinant structures related to WA352c in wild and cultivated rice. We deduce that these structures arose through multiple rearrangements among conserved mitochondrial sequences in the mitochondrial genome of the wild rice Oryza rufipogon, coupled with substoichiometric shifting and sequence variation. We identify two expressed but nonfunctional protogenes among these structures, and show that they could evolve into functional CMS genes via sequence variations that could relieve the self-inhibitory potential of the proteins. These sequence changes would endow the proteins the ability to interact with the nucleus-encoded mitochondrial protein COX11, resulting in premature programmed cell death in the anther tapetum and male sterility. Furthermore, we show that the sequences that encode the COX11-interaction domains in these WA352c-related genes have experienced purifying selection during evolution. We propose a model for the formation and evolution of new CMS genes via a “multi-recombination/protogene formation/functionalization” mechanism involving gradual variations in the structure, sequence, copy number, and function. PMID:27725674

  16. Genomic and Epigenomic Insights into Nutrition and Brain Disorders

    PubMed Central

    Dauncey, Margaret Joy

    2013-01-01

    Considerable evidence links many neuropsychiatric, neurodevelopmental and neurodegenerative disorders with multiple complex interactions between genetics and environmental factors such as nutrition. Mental health problems, autism, eating disorders, Alzheimer’s disease, schizophrenia, Parkinson’s disease and brain tumours are related to individual variability in numerous protein-coding and non-coding regions of the genome. However, genotype does not necessarily determine neurological phenotype because the epigenome modulates gene expression in response to endogenous and exogenous regulators, throughout the life-cycle. Studies using both genome-wide analysis of multiple genes and comprehensive analysis of specific genes are providing new insights into genetic and epigenetic mechanisms underlying nutrition and neuroscience. This review provides a critical evaluation of the following related areas: (1) recent advances in genomic and epigenomic technologies, and their relevance to brain disorders; (2) the emerging role of non-coding RNAs as key regulators of transcription, epigenetic processes and gene silencing; (3) novel approaches to nutrition, epigenetics and neuroscience; (4) gene-environment interactions, especially in the serotonergic system, as a paradigm of the multiple signalling pathways affected in neuropsychiatric and neurological disorders. Current and future advances in these four areas should contribute significantly to the prevention, amelioration and treatment of multiple devastating brain disorders. PMID:23503168

  17. Integrative Analysis of Prognosis Data on Multiple Cancer Subtypes

    PubMed Central

    Liu, Jin; Huang, Jian; Zhang, Yawei; Lan, Qing; Rothman, Nathaniel; Zheng, Tongzhang; Ma, Shuangge

    2014-01-01

    Summary In cancer research, profiling studies have been extensively conducted, searching for genes/SNPs associated with prognosis. Cancer is diverse. Examining the similarity and difference in the genetic basis of multiple subtypes of the same cancer can lead to a better understanding of their connections and distinctions. Classic meta-analysis methods analyze each subtype separately and then compare analysis results across subtypes. Integrative analysis methods, in contrast, analyze the raw data on multiple subtypes simultaneously and can outperform meta-analysis methods. In this study, prognosis data on multiple subtypes of the same cancer are analyzed. An AFT (accelerated failure time) model is adopted to describe survival. The genetic basis of multiple subtypes is described using the heterogeneity model, which allows a gene/SNP to be associated with prognosis of some subtypes but not others. A compound penalization method is developed to identify genes that contain important SNPs associated with prognosis. The proposed method has an intuitive formulation and is realized using an iterative algorithm. Asymptotic properties are rigorously established. Simulation shows that the proposed method has satisfactory performance and outperforms a penalization-based meta-analysis method and a regularized thresholding method. An NHL (non-Hodgkin lymphoma) prognosis study with SNP measurements is analyzed. Genes associated with the three major subtypes, namely DLBCL, FL, and CLL/SLL, are identified. The proposed method identifies genes that are different from alternatives and have important implications and satisfactory prediction performance. PMID:24766212

  18. Genome-Wide Association Study of Anthracnose Resistance in Andean Beans (Phaseolus vulgaris).

    PubMed

    Zuiderveen, Grady H; Padder, Bilal A; Kamfwa, Kelvin; Song, Qijian; Kelly, James D

    2016-01-01

    Anthracnose is a seed-borne disease of common bean (Phaseolus vulgaris L.) caused by the fungus Colletotrichum lindemuthianum, and the pathogen is cosmopolitan in distribution. The objectives of this study were to identify new sources of anthracnose resistance in a diverse panel of 230 Andean beans comprised of multiple seed types and market classes from the Americas, Africa, and Europe, and explore the genetic basis of this resistance using genome-wide association mapping analysis (GWAS). Twenty-eight of the 230 lines tested were resistant to six out of the eight races screened, but only one cultivar Uyole98 was resistant to all eight races (7, 39, 55, 65, 73, 109, 2047, and 3481) included in the study. Outputs from the GWAS indicated major quantitative trait loci (QTL) for resistance on chromosomes, Pv01, Pv02, and Pv04 and two minor QTL on Pv10 and Pv11. Candidate genes associated with the significant SNPs were detected on all five chromosomes. An independent QTL study was conducted to confirm the physical location of the Co-1 locus identified on Pv01 in an F4:6 recombinant inbred line (RIL) population. Resistance was determined to be conditioned by the single dominant gene Co-1 that mapped between 50.16 and 50.30 Mb on Pv01, and an InDel marker (NDSU_IND_1_50.2219) tightly linked to the gene was developed. The information reported will provide breeders with new and diverse sources of resistance and genomic regions to target in the development of anthracnose resistance in Andean beans.

  19. Genome-Wide Association Study of Anthracnose Resistance in Andean Beans (Phaseolus vulgaris)

    PubMed Central

    Zuiderveen, Grady H.; Padder, Bilal A.; Kamfwa, Kelvin; Song, Qijian; Kelly, James D.

    2016-01-01

    Anthracnose is a seed-borne disease of common bean (Phaseolus vulgaris L.) caused by the fungus Colletotrichum lindemuthianum, and the pathogen is cosmopolitan in distribution. The objectives of this study were to identify new sources of anthracnose resistance in a diverse panel of 230 Andean beans comprised of multiple seed types and market classes from the Americas, Africa, and Europe, and explore the genetic basis of this resistance using genome-wide association mapping analysis (GWAS). Twenty-eight of the 230 lines tested were resistant to six out of the eight races screened, but only one cultivar Uyole98 was resistant to all eight races (7, 39, 55, 65, 73, 109, 2047, and 3481) included in the study. Outputs from the GWAS indicated major quantitative trait loci (QTL) for resistance on chromosomes, Pv01, Pv02, and Pv04 and two minor QTL on Pv10 and Pv11. Candidate genes associated with the significant SNPs were detected on all five chromosomes. An independent QTL study was conducted to confirm the physical location of the Co-1 locus identified on Pv01 in an F4:6 recombinant inbred line (RIL) population. Resistance was determined to be conditioned by the single dominant gene Co-1 that mapped between 50.16 and 50.30 Mb on Pv01, and an InDel marker (NDSU_IND_1_50.2219) tightly linked to the gene was developed. The information reported will provide breeders with new and diverse sources of resistance and genomic regions to target in the development of anthracnose resistance in Andean beans. PMID:27270627

  20. Genetic instrumental variable regression: Explaining socioeconomic and health outcomes in nonexperimental data

    PubMed Central

    DiPrete, Thomas A.; Burik, Casper A. P.; Koellinger, Philipp D.

    2018-01-01

    Identifying causal effects in nonexperimental data is an enduring challenge. One proposed solution that recently gained popularity is the idea to use genes as instrumental variables [i.e., Mendelian randomization (MR)]. However, this approach is problematic because many variables of interest are genetically correlated, which implies the possibility that many genes could affect both the exposure and the outcome directly or via unobserved confounding factors. Thus, pleiotropic effects of genes are themselves a source of bias in nonexperimental data that would also undermine the ability of MR to correct for endogeneity bias from nongenetic sources. Here, we propose an alternative approach, genetic instrumental variable (GIV) regression, that provides estimates for the effect of an exposure on an outcome in the presence of pleiotropy. As a valuable byproduct, GIV regression also provides accurate estimates of the chip heritability of the outcome variable. GIV regression uses polygenic scores (PGSs) for the outcome of interest which can be constructed from genome-wide association study (GWAS) results. By splitting the GWAS sample for the outcome into nonoverlapping subsamples, we obtain multiple indicators of the outcome PGSs that can be used as instruments for each other and, in combination with other methods such as sibling fixed effects, can address endogeneity bias from both pleiotropy and the environment. In two empirical applications, we demonstrate that our approach produces reasonable estimates of the chip heritability of educational attainment (EA) and show that standard regression and MR provide upwardly biased estimates of the effect of body height on EA. PMID:29686100

  1. Genetic instrumental variable regression: Explaining socioeconomic and health outcomes in nonexperimental data.

    PubMed

    DiPrete, Thomas A; Burik, Casper A P; Koellinger, Philipp D

    2018-05-29

    Identifying causal effects in nonexperimental data is an enduring challenge. One proposed solution that recently gained popularity is the idea to use genes as instrumental variables [i.e., Mendelian randomization (MR)]. However, this approach is problematic because many variables of interest are genetically correlated, which implies the possibility that many genes could affect both the exposure and the outcome directly or via unobserved confounding factors. Thus, pleiotropic effects of genes are themselves a source of bias in nonexperimental data that would also undermine the ability of MR to correct for endogeneity bias from nongenetic sources. Here, we propose an alternative approach, genetic instrumental variable (GIV) regression, that provides estimates for the effect of an exposure on an outcome in the presence of pleiotropy. As a valuable byproduct, GIV regression also provides accurate estimates of the chip heritability of the outcome variable. GIV regression uses polygenic scores (PGSs) for the outcome of interest which can be constructed from genome-wide association study (GWAS) results. By splitting the GWAS sample for the outcome into nonoverlapping subsamples, we obtain multiple indicators of the outcome PGSs that can be used as instruments for each other and, in combination with other methods such as sibling fixed effects, can address endogeneity bias from both pleiotropy and the environment. In two empirical applications, we demonstrate that our approach produces reasonable estimates of the chip heritability of educational attainment (EA) and show that standard regression and MR provide upwardly biased estimates of the effect of body height on EA. Copyright © 2018 the Author(s). Published by PNAS.

  2. Ecophysiology of Freshwater Verrucomicrobia Inferred from Metagenome-Assembled Genomes

    PubMed Central

    He, Shaomei; Stevens, Sarah L. R.; Chan, Leong-Keat; Bertilsson, Stefan; Glavina del Rio, Tijana; Tringe, Susannah G.; Malmstrom, Rex R.

    2017-01-01

    ABSTRACT Microbes are critical in carbon and nutrient cycling in freshwater ecosystems. Members of the Verrucomicrobia are ubiquitous in such systems, and yet their roles and ecophysiology are not well understood. In this study, we recovered 19 Verrucomicrobia draft genomes by sequencing 184 time-series metagenomes from a eutrophic lake and a humic bog that differ in carbon source and nutrient availabilities. These genomes span four of the seven previously defined Verrucomicrobia subdivisions and greatly expand knowledge of the genomic diversity of freshwater Verrucomicrobia. Genome analysis revealed their potential role as (poly)saccharide degraders in freshwater, uncovered interesting genomic features for this lifestyle, and suggested their adaptation to nutrient availabilities in their environments. Verrucomicrobia populations differ significantly between the two lakes in glycoside hydrolase gene abundance and functional profiles, reflecting the autochthonous and terrestrially derived allochthonous carbon sources of the two ecosystems, respectively. Interestingly, a number of genomes recovered from the bog contained gene clusters that potentially encode a novel porin-multiheme cytochrome c complex and might be involved in extracellular electron transfer in the anoxic humus-rich environment. Notably, most epilimnion genomes have large numbers of so-called “Planctomycete-specific” cytochrome c-encoding genes, which exhibited distribution patterns nearly opposite to those seen with glycoside hydrolase genes, probably associated with the different levels of environmental oxygen availability and carbohydrate complexity between lakes/layers. Overall, the recovered genomes represent a major step toward understanding the role, ecophysiology, and distribution of Verrucomicrobia in freshwater. IMPORTANCE Freshwater Verrucomicrobia spp. are cosmopolitan in lakes and rivers, and yet their roles and ecophysiology are not well understood, as cultured freshwater Verrucomicrobia spp. are restricted to one subdivision of this phylum. Here, we greatly expanded the known genomic diversity of this freshwater lineage by recovering 19 Verrucomicrobia draft genomes from 184 metagenomes collected from a eutrophic lake and a humic bog across multiple years. Most of these genomes represent the first freshwater representatives of several Verrucomicrobia subdivisions. Genomic analysis revealed Verrucomicrobia to be potential (poly)saccharide degraders and suggested their adaptation to carbon sources of different origins in the two contrasting ecosystems. We identified putative extracellular electron transfer genes and so-called “Planctomycete-specific” cytochrome c-encoding genes and identified their distinct distribution patterns between the lakes/layers. Overall, our analysis greatly advances the understanding of the function, ecophysiology, and distribution of freshwater Verrucomicrobia, while highlighting their potential role in freshwater carbon cycling. PMID:28959738

  3. Shared regulatory sites are abundant in the human genome and shed light on genome evolution and disease pleiotropy.

    PubMed

    Tong, Pin; Monahan, Jack; Prendergast, James G D

    2017-03-01

    Large-scale gene expression datasets are providing an increasing understanding of the location of cis-eQTLs in the human genome and their role in disease. However, little is currently known regarding the extent of regulatory site-sharing between genes. This is despite it having potentially wide-ranging implications, from the determination of the way in which genetic variants may shape multiple phenotypes to the understanding of the evolution of human gene order. By first identifying the location of non-redundant cis-eQTLs, we show that regulatory site-sharing is a relatively common phenomenon in the human genome, with over 10% of non-redundant regulatory variants linked to the expression of multiple nearby genes. We show that these shared, local regulatory sites are linked to high levels of chromatin looping between the regulatory sites and their associated genes. In addition, these co-regulated gene modules are found to be strongly conserved across mammalian species, suggesting that shared regulatory sites have played an important role in shaping human gene order. The association of these shared cis-eQTLs with multiple genes means they also appear to be unusually important in understanding the genetics of human phenotypes and pleiotropy, with shared regulatory sites more often linked to multiple human phenotypes than other regulatory variants. This study shows that regulatory site-sharing is likely an underappreciated aspect of gene regulation and has important implications for the understanding of various biological phenomena, including how the two and three dimensional structures of the genome have been shaped and the potential causes of disease pleiotropy outside coding regions.

  4. Analysis of Gene Expression Profiles of Soft Tissue Sarcoma Using a Combination of Knowledge-Based Filtering with Integration of Multiple Statistics

    PubMed Central

    Doi, Ayano; Ichinohe, Risa; Ikuyo, Yoriko; Takahashi, Teruyoshi; Marui, Shigetaka; Yasuhara, Koji; Nakamura, Tetsuro; Sugita, Shintaro; Sakamoto, Hiromi; Yoshida, Teruhiko; Hasegawa, Tadashi

    2014-01-01

    The diagnosis and treatment of soft tissue sarcomas (STS) have been difficult. Of the diverse histological subtypes, undifferentiated pleomorphic sarcoma (UPS) is particularly difficult to diagnose accurately, and its classification per se is still controversial. Recent advances in genomic technologies provide an excellent way to address such problems. However, it is often difficult, if not impossible, to identify definitive disease-associated genes using genome-wide analysis alone, primarily because of multiple testing problems. In the present study, we analyzed microarray data from 88 STS patients using a combination method that used knowledge-based filtering and a simulation based on the integration of multiple statistics to reduce multiple testing problems. We identified 25 genes, including hypoxia-related genes (e.g., MIF, SCD1, P4HA1, ENO1, and STAT1) and cell cycle- and DNA repair-related genes (e.g., TACC3, PRDX1, PRKDC, and H2AFY). These genes showed significant differential expression among histological subtypes, including UPS, and showed associations with overall survival. STAT1 showed a strong association with overall survival in UPS patients (logrank p = 1.84×10⁻⁶ and adjusted p value 2.99×10⁻³ after the permutation test). According to the literature, the 25 genes selected are useful not only as markers of differential diagnosis but also as prognostic/predictive markers and/or therapeutic targets for STS. Our combination method can identify genes that are potential prognostic/predictive factors and/or therapeutic targets in STS and possibly in other cancers. These disease-associated genes deserve further preclinical and clinical validation. PMID:25188299

  5. Network-Assisted Investigation of Combined Causal Signals from Genome-Wide Association Studies in Schizophrenia

    PubMed Central

    Jia, Peilin; Wang, Lily; Fanous, Ayman H.; Pato, Carlos N.; Edwards, Todd L.; Zhao, Zhongming

    2012-01-01

    With the recent success of genome-wide association studies (GWAS), a wealth of association data has accumulated for more than 200 complex diseases/traits, creating a strong demand for data integration and interpretation. A combinatory analysis of multiple GWAS datasets, or an integrative analysis of GWAS data and other high-throughput data, has been particularly promising. In this study, we proposed an integrative analysis framework of multiple GWAS datasets by overlaying association signals onto the protein-protein interaction network, and demonstrated it using schizophrenia datasets. Building on a dense module search algorithm, we first searched for significantly enriched subnetworks for schizophrenia in each single GWAS dataset and then implemented a discovery-evaluation strategy to identify module genes with consistent association signals. We validated the module genes in an independent dataset, and also examined them through meta-analysis of the related SNPs using multiple GWAS datasets. As a result, we identified 205 module genes with a joint effect significantly associated with schizophrenia; these module genes included a number of well-studied candidate genes such as DISC1, GNA12, GNA13, GNAI1, GPR17, and GRIN2B. Further functional analysis suggested these genes are involved in neuronal related processes. Additionally, meta-analysis found that 18 SNPs in 9 module genes had P_meta < 1×10^−4, including the gene HLA-DQA1 located in the MHC region on chromosome 6, which was reported in previous studies using the largest cohort of schizophrenia patients to date. These results demonstrated our bi-directional network-based strategy is efficient for identifying disease-associated genes with modest signals in GWAS datasets. This approach can be applied to any other complex diseases/traits where multiple GWAS datasets are available. PMID:22792057
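
    The abstract above mentions a meta-analysis of SNP signals across several GWAS datasets without naming the method. Purely as an illustration of how per-dataset p values for one SNP can be combined, the sketch below uses Fisher's combined probability test, which assumes the datasets are independent. The function name and the example p values are hypothetical, and GWAS meta-analyses often use effect-size-based methods instead.

```python
import math


def fisher_combined_p(pvalues):
    """Combine independent p values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    upper tail has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0 or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("need at least one p value, each in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    term = 1.0   # (X/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i  # builds (X/2)^i / i! incrementally
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p values for one SNP in three independent datasets.
combined = fisher_combined_p([0.004, 0.02, 0.01])
print(f"combined p = {combined:.3g}")
```

    With a single input the function returns that p value unchanged, which is a quick sanity check on the closed form.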

  6. Chemical effects in biological systems (CEBS) object model for toxicology data, SysTox-OM: design and application.

    PubMed

    Xirasagar, Sandhya; Gustafson, Scott F; Huang, Cheng-Cheng; Pan, Qinyan; Fostel, Jennifer; Boyer, Paul; Merrick, B Alex; Tomer, Kenneth B; Chan, Denny D; Yost, Kenneth J; Choi, Danielle; Xiao, Nianqing; Stasiewicz, Stanley; Bushel, Pierre; Waters, Michael D

    2006-04-01

    The CEBS data repository is being developed to promote a systems biology approach to understand the biological effects of environmental stressors. CEBS will house data from multiple gene expression platforms (transcriptomics), protein expression and protein-protein interaction (proteomics), and changes in low molecular weight metabolite levels (metabolomics) aligned by their detailed toxicological context. The system will accommodate extensive complex querying in a user-friendly manner. CEBS will store toxicological contexts including the study design details, treatment protocols, animal characteristics and conventional toxicological endpoints such as histopathology findings and clinical chemistry measures. All of these data types can be integrated in a seamless fashion to enable data query and analysis in a biologically meaningful manner. An object model, the SysBio-OM (Xirasagar et al., 2004) has been designed to facilitate the integration of microarray gene expression, proteomics and metabolomics data in the CEBS database system. We now report SysTox-OM as an open source systems toxicology model designed to integrate toxicological context into gene expression experiments. The SysTox-OM model is comprehensive and leverages other open source efforts, namely, the Standard for Exchange of Nonclinical Data (http://www.cdisc.org/models/send/v2/index.html) which is a data standard for capturing toxicological information for animal studies and Clinical Data Interchange Standards Consortium (http://www.cdisc.org/models/sdtm/index.html) that serves as a standard for the exchange of clinical data. Such standardization increases the accuracy of data mining, interpretation and exchange. The open source SysTox-OM model, which can be implemented on various software platforms, is presented here. 
A Unified Modeling Language (UML) depiction of the entire SysTox-OM is available at http://cebs.niehs.nih.gov and the Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http://cebs.niehs.nih.gov/cebsdownloads. Currently, the public toxicological data in CEBS can be queried via a web application based on the SysTox-OM at http://cebs.niehs.nih.gov. Contact: xirasagars@saic.com. Supplementary data are available at Bioinformatics online.

  7. svdPPCS: an effective singular value decomposition-based method for conserved and divergent co-expression gene module identification.

    PubMed

    Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Zhu, Dongxiao; Zhang, Kun

    2010-06-22

    Comparative analysis of gene expression profiling of multiple biological categories, such as different species of organisms or different kinds of tissue, promises to enhance the fundamental understanding of the universality as well as the specialization of mechanisms and related biological themes. Grouping genes with a similar expression pattern or exhibiting co-expression together is a starting point in understanding and analyzing gene expression data. In recent literature, gene module level analysis is advocated in order to understand biological network design and system behaviors in disease and life processes; however, practical difficulties often lie in the implementation of existing methods. Using the singular value decomposition (SVD) technique, we developed a new computational tool, named svdPPCS (SVD-based Pattern Pairing and Chart Splitting), to identify conserved and divergent co-expression modules of two sets of microarray experiments. In the proposed methods, gene modules are identified by splitting the two-way chart coordinated with a pair of left singular vectors factorized from the gene expression matrices of the two biological categories. Importantly, the cutoffs are determined by a data-driven algorithm using the well-defined statistic, SVD-p. The implementation was illustrated on two time series microarray data sets generated from the samples of accessory gland (ACG) and malpighian tubule (MT) tissues of the line W118 of Drosophila melanogaster. Two conserved modules and six divergent modules, each of which has a unique characteristic profile across tissue kinds and aging processes, were identified. The number of genes contained in these modules ranged from five to a few hundred. Three to over a hundred GO terms were over-represented in individual modules with FDR < 0.1. One divergent module suggested the tissue-specific relationship between the expressions of mitochondrion-related genes and the aging process. 
This finding, together with others, may be of biological significance. The validity of the proposed SVD-based method was further verified by a simulation study, as well as the comparisons with regression analysis and cubic spline regression analysis plus PAM based clustering. svdPPCS is a novel computational tool for the comparative analysis of transcriptional profiling. It especially fits the comparison of time series data of related organisms or different tissues of the same organism under equivalent or similar experimental conditions. The general scheme can be directly extended to the comparisons of multiple data sets. It also can be applied to the integration of data sets from different platforms and of different sources.

  8. cMapper: gene-centric connectivity mapper for EBI-RDF platform.

    PubMed

    Shoaib, Muhammad; Ansari, Adnan Ahmad; Ahn, Sung-Min

    2017-01-15

    In this era of biological big data, data integration has become a common task and a challenge for biologists. The Resource Description Framework (RDF) was developed to enable interoperability of heterogeneous datasets. The EBI-RDF platform enables an efficient data integration of six independent biological databases using RDF technologies and shared ontologies. However, to take advantage of this platform, biologists need to be familiar with RDF technologies and SPARQL query language. To overcome this practical limitation of the EBI-RDF platform, we developed cMapper, a web-based tool that enables biologists to search the EBI-RDF databases in a gene-centric manner without a thorough knowledge of RDF and SPARQL. cMapper allows biologists to search data entities in the EBI-RDF platform that are connected to genes or small molecules of interest in multiple biological contexts. The input to cMapper consists of a set of genes or small molecules, and the output are data entities in six independent EBI-RDF databases connected with the given genes or small molecules in the user's query. cMapper provides output to users in the form of a graph in which nodes represent data entities and the edges represent connections between data entities and inputted set of genes or small molecules. Furthermore, users can apply filters based on database, taxonomy, organ and pathways in order to focus on a core connectivity graph of their interest. Data entities from multiple databases are differentiated based on background colors. cMapper also enables users to investigate shared connections between genes or small molecules of interest. Users can view the output graph on a web browser or download it in either GraphML or JSON formats. cMapper is available as a web application with an integrated MySQL database. The web application was developed using Java and deployed on Tomcat server. We developed the user interface using HTML5, JQuery and the Cytoscape Graph API. 
cMapper can be accessed at http://cmapper.ewostech.net. Readers can download the development manual from the website http://cmapper.ewostech.net/docs/cMapperDocumentation.pdf. Source code is available at https://github.com/muhammadshoaib/cmapper. Contact: smahn@gachon.ac.kr. Supplementary information: Supplementary data are available at Bioinformatics online.

  9. Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.

    PubMed

    Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li

    2015-10-16

    The high-throughput sequencing technology, RNA-Seq, has been widely used to quantify gene and isoform expression in the study of transcriptome in recent years. Accurate expression measurement from the millions or billions of short generated reads is obstructed by difficulties. One is ambiguous mapping of reads to reference transcriptome caused by alternative splicing. This increases the uncertainty in estimating isoform expression. The other is non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of biases. This violates the uniform assumption of read distribution for many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods make bias correction based on surrounding sequence contents and share the bias models by all genes. They therefore cannot estimate gene- and isoform-specific biases as revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms, from which reads originate, and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives. 
Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in the downstream analysis. The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well in both simulations and real datasets, and has competitive performance compared to popular methods. The method has been implemented as a freely available software which can be found at https://github.com/PUGEA/NLDMseq.
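
    The abstract above contrasts NLDMseq with the direct RPKM calculation, which treats reads as uniformly distributed along a transcript. For reference, a minimal sketch of that baseline (reads per kilobase of transcript per million mapped reads) follows. The function and parameter names are chosen for this example and the numbers are made up; this is not the NLDMseq model.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count / (length_in_kb * total_reads_in_millions)
         = read_count * 1e9 / (length_in_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical transcript: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

    The formula normalizes only for transcript length and library size, so any position- or sequence-dependent bias in read coverage carries straight through to the estimate, which is the limitation the abstract describes.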

  10. The impact of the metabotropic glutamate receptor and other gene family interaction networks on autism

    PubMed Central

    Hadley, Dexter; Wu, Zhi-liang; Kao, Charlly; Kini, Akshata; Mohamed-Hadley, Alisha; Thomas, Kelly; Vazquez, Lyam; Qiu, Haijun; Mentch, Frank; Pellegrino, Renata; Kim, Cecilia; Connolly, John; Pinto, Dalila; Merikangas, Alison; Klei, Lambertus; Vorstman, Jacob A.S.; Thompson, Ann; Regan, Regina; Pagnamenta, Alistair T.; Oliveira, Bárbara; Magalhaes, Tiago R.; Gilbert, John; Duketis, Eftichia; De Jonge, Maretha V.; Cuccaro, Michael; Correia, Catarina T.; Conroy, Judith; Conceição, Inês C.; Chiocchetti, Andreas G.; Casey, Jillian P.; Bolshakova, Nadia; Bacchelli, Elena; Anney, Richard; Zwaigenbaum, Lonnie; Wittemeyer, Kerstin; Wallace, Simon; Engeland, Herman van; Soorya, Latha; Rogé, Bernadette; Roberts, Wendy; Poustka, Fritz; Mouga, Susana; Minshew, Nancy; McGrew, Susan G.; Lord, Catherine; Leboyer, Marion; Le Couteur, Ann S.; Kolevzon, Alexander; Jacob, Suma; Guter, Stephen; Green, Jonathan; Green, Andrew; Gillberg, Christopher; Fernandez, Bridget A.; Duque, Frederico; Delorme, Richard; Dawson, Geraldine; Café, Cátia; Brennan, Sean; Bourgeron, Thomas; Bolton, Patrick F.; Bölte, Sven; Bernier, Raphael; Baird, Gillian; Bailey, Anthony J.; Anagnostou, Evdokia; Almeida, Joana; Wijsman, Ellen M.; Vieland, Veronica J.; Vicente, Astrid M.; Schellenberg, Gerard D.; Pericak-Vance, Margaret; Paterson, Andrew D.; Parr, Jeremy R.; Oliveira, Guiomar; Almeida, Joana; Café, Cátia; Mouga, Susana; Correia, Catarina; Nurnberger, John I.; Monaco, Anthony P.; Maestrini, Elena; Klauck, Sabine M.; Hakonarson, Hakon; Haines, Jonathan L.; Geschwind, Daniel H.; Freitag, Christine M.; Folstein, Susan E.; Ennis, Sean; Coon, Hilary; Battaglia, Agatino; Szatmari, Peter; Sutcliffe, James S.; Hallmayer, Joachim; Gill, Michael; Cook, Edwin H.; Buxbaum, Joseph D.; Devlin, Bernie; Gallagher, Louise; Betancur, Catalina; Scherer, Stephen W.; Glessner, Joseph; Hakonarson, Hakon

    2014-01-01

    Although multiple reports show that defective genetic networks underlie the aetiology of autism, few have translated into pharmacotherapeutic opportunities. Since drugs compete with endogenous small molecules for protein binding, many successful drugs target large gene families with multiple drug binding sites. Here we search for defective gene family interaction networks (GFINs) in 6,742 patients with the ASDs relative to 12,544 neurologically normal controls, to find potentially druggable genetic targets. We find significant enrichment of structural defects (P≤2.40E−09, 1.8-fold enrichment) in the metabotropic glutamate receptor (GRM) GFIN, previously observed to impact attention deficit hyperactivity disorder (ADHD) and schizophrenia. Also, the MXD-MYC-MAX network of genes, previously implicated in cancer, is significantly enriched (P≤3.83E−23, 2.5-fold enrichment), as is the calmodulin 1 (CALM1) gene interaction network (P≤4.16E−04, 14.4-fold enrichment), which regulates voltage-independent calcium-activated action potentials at the neuronal synapse. We find that multiple defective gene family interactions underlie autism, presenting new translational opportunities to explore for therapeutic interventions. PMID:24927284

  11. CEBS object model for systems biology data, SysBio-OM.

    PubMed

    Xirasagar, Sandhya; Gustafson, Scott; Merrick, B Alex; Tomer, Kenneth B; Stasiewicz, Stanley; Chan, Denny D; Yost, Kenneth J; Yates, John R; Sumner, Susan; Xiao, Nianqing; Waters, Michael D

    2004-09-01

    To promote a systems biology approach to understanding the biological effects of environmental stressors, the Chemical Effects in Biological Systems (CEBS) knowledge base is being developed to house data from multiple complex data streams in a systems friendly manner that will accommodate extensive querying from users. Unified data representation via a single object model will greatly aid in integrating data storage and management, and facilitate reuse of software to analyze and display data resulting from diverse differential expression or differential profile technologies. Data streams include, but are not limited to, gene expression analysis (transcriptomics), protein expression and protein-protein interaction analysis (proteomics) and changes in low molecular weight metabolite levels (metabolomics). To enable the integration of microarray gene expression, proteomics and metabolomics data in the CEBS system, we designed an object model, Systems Biology Object Model (SysBio-OM). The model is comprehensive and leverages other open source efforts, namely the MicroArray Gene Expression Object Model (MAGE-OM) and the Proteomics Experiment Data Repository (PEDRo) object model. SysBio-OM is designed by extending MAGE-OM to represent protein expression data elements (including those from PEDRo), protein-protein interaction and metabolomics data. SysBio-OM promotes the standardization of data representation and data quality by facilitating the capture of the minimum annotation required for an experiment. Such standardization refines the accuracy of data mining and interpretation. The open source SysBio-OM model, which can be implemented on varied computing platforms, is presented here. A Unified Modeling Language (UML) depiction of the entire SysBio-OM is available at http://cebs.niehs.nih.gov/SysBioOM/. 
The Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http://cebs.niehs.nih.gov/cebsdownloads. The database and interface are being built to implement the model and will be available for public use at http://cebs.niehs.nih.gov.

  12. Evaluation of Escherichia coli isolates from healthy chickens to determine their potential risk to poultry and human health.

    PubMed

    Stromberg, Zachary R; Johnson, James R; Fairbrother, John M; Kilbourne, Jacquelyn; Van Goor, Angelica; Curtiss, Roy; Mellata, Melha

    2017-01-01

    Extraintestinal pathogenic Escherichia coli (ExPEC) strains are important pathogens that cause diverse diseases in humans and poultry. Some E. coli isolates from chicken feces contain ExPEC-associated virulence genes, so appear potentially pathogenic; they conceivably could be transmitted to humans through handling and/or consumption of contaminated meat. However, the actual extraintestinal virulence potential of chicken-source fecal E. coli is poorly understood. Here, we assessed whether fecal E. coli isolates from healthy production chickens could cause diseases in a chicken model of avian colibacillosis and three rodent models of ExPEC-associated human infections. From 304 E. coli isolates from chicken fecal samples, 175 E. coli isolates were screened by PCR for virulence genes associated with human-source ExPEC or avian pathogenic E. coli (APEC), an ExPEC subset that causes extraintestinal infections in poultry. Selected isolates genetically identified as ExPEC and non-ExPEC isolates were assessed in vitro for virulence-associated phenotypes, and in vivo for disease-causing ability in animal models of colibacillosis, sepsis, meningitis, and urinary tract infection. Among the study isolates, 13% (40/304) were identified as ExPEC; the majority of these were classified as APEC and uropathogenic E. coli, but none as neonatal meningitis E. coli. Multiple chicken-source fecal ExPEC isolates resembled avian and human clinical ExPEC isolates in causing one or more ExPEC-associated illnesses in experimental animal infection models. Additionally, some isolates that were classified as non-ExPEC were able to cause ExPEC-associated illnesses in animal models, and thus future studies are needed to elucidate their mechanisms of virulence. 
These findings show that E. coli isolates from chicken feces contain ExPEC-associated genes, exhibit ExPEC-associated in vitro phenotypes, and can cause ExPEC-associated infections in animal models, and thus may pose a health threat to poultry and consumers.

  13. Evaluation of Escherichia coli isolates from healthy chickens to determine their potential risk to poultry and human health

    PubMed Central

    Johnson, James R.; Fairbrother, John M.; Kilbourne, Jacquelyn; Van Goor, Angelica; Curtiss, Roy; Mellata, Melha

    2017-01-01

    Extraintestinal pathogenic Escherichia coli (ExPEC) strains are important pathogens that cause diverse diseases in humans and poultry. Some E. coli isolates from chicken feces contain ExPEC-associated virulence genes, so appear potentially pathogenic; they conceivably could be transmitted to humans through handling and/or consumption of contaminated meat. However, the actual extraintestinal virulence potential of chicken-source fecal E. coli is poorly understood. Here, we assessed whether fecal E. coli isolates from healthy production chickens could cause diseases in a chicken model of avian colibacillosis and three rodent models of ExPEC-associated human infections. From 304 E. coli isolates from chicken fecal samples, 175 E. coli isolates were screened by PCR for virulence genes associated with human-source ExPEC or avian pathogenic E. coli (APEC), an ExPEC subset that causes extraintestinal infections in poultry. Selected isolates genetically identified as ExPEC and non-ExPEC isolates were assessed in vitro for virulence-associated phenotypes, and in vivo for disease-causing ability in animal models of colibacillosis, sepsis, meningitis, and urinary tract infection. Among the study isolates, 13% (40/304) were identified as ExPEC; the majority of these were classified as APEC and uropathogenic E. coli, but none as neonatal meningitis E. coli. Multiple chicken-source fecal ExPEC isolates resembled avian and human clinical ExPEC isolates in causing one or more ExPEC-associated illnesses in experimental animal infection models. Additionally, some isolates that were classified as non-ExPEC were able to cause ExPEC-associated illnesses in animal models, and thus future studies are needed to elucidate their mechanisms of virulence. 
These findings show that E. coli isolates from chicken feces contain ExPEC-associated genes, exhibit ExPEC-associated in vitro phenotypes, and can cause ExPEC-associated infections in animal models, and thus may pose a health threat to poultry and consumers. PMID:28671990

  14. hTERT gene immortalized human adipose-derived stem cells and its multiple differentiations: a preliminary investigation.

    PubMed

    Wang, L; Song, K; Qu, X; Wang, H; Zhu, H; Xu, X; Zhang, M; Tang, Y; Yang, X

    2013-03-01

    Human adipose-derived adult stem cells (hADSCs) can express human telomerase reverse transcriptase phenotypes under an appropriate culture condition. Because adipose tissue is abundant and easily accessible, hADSCs offer a promising source of stem cells for tissue engineering applications and other cell-based therapies. However, limited cell numbers and restricted proliferation in vitro, known as the "Hayflick limit", constrain their further clinical application. Here, hADSCs were transfected with the human telomerase reverse transcriptase (hTERT) gene by a lentiviral vector to prolong the lifespan of the stem cells and even immortalize them. Following this, the cellular properties and functionalities of the transfected cell lines were assayed. The results demonstrated that hADSCs had been successfully transfected with the hTERT gene (hTERT-ADSCs). Then, hTERT-ADSCs were initially selected by G418 and subsequently expanded over 20 passages in vitro. Moreover, the qualitative and quantitative differentiation criteria for 20 passages of hTERT-ADSCs also demonstrated that hTERT-ADSCs could differentiate into osteogenic, chondrogenic, and adipogenic phenotypes in lineage-specific differentiation media. These findings confirmed that this transfection could prolong the lifespan of hADSCs.

  15. Data-Driven Discovery of Extravasation Pathway in Circulating Tumor Cells

    PubMed Central

    Yadavalli, S.; Jayaram, S.; Manda, S. S.; Madugundu, A. K.; Nayakanti, D. S.; Tan, T. Z.; Bhat, R.; Rangarajan, A.; Chatterjee, A.; Gowda, H.; Thiery, J. P.; Kumar, P.

    2017-01-01

    Circulating tumor cells (CTCs) play a crucial role in cancer dissemination and provide a promising source of blood-based markers. Understanding the spectrum of transcriptional profiles of CTCs and their corresponding regulatory mechanisms will allow for a more robust analysis of CTC phenotypes. The current challenge in CTC research is the acquisition of useful clinical information from the multitude of high-throughput studies. To gain a deeper understanding of CTC heterogeneity and identify genes, pathways and processes that are consistently affected across tumors, we mined the literature for gene expression profiles in CTCs. Through in silico analysis and the integration of CTC-specific genes, we found highly significant biological mechanisms and regulatory processes acting in CTCs across various cancers, with a particular enrichment of the leukocyte extravasation pathway. This pathway appears to play a pivotal role in the migration of CTCs to distant metastatic sites. We find that CTCs from multiple cancers express both epithelial and mesenchymal markers in varying amounts, which is suggestive of dynamic and hybrid states along the epithelial-mesenchymal transition (EMT) spectrum. Targeting the specific molecular nodes to monitor disease and therapeutic control of CTCs in real time will likely improve the clinical management of cancer progression and metastases. PMID:28262832

  16. The Escherichia coli Serogroup O1 and O2 Lipopolysaccharides Are Encoded by Multiple O-antigen Gene Clusters.

    PubMed

    Delannoy, Sabine; Beutin, Lothar; Mariani-Kurkdjian, Patricia; Fleiss, Aubin; Bonacorsi, Stéphane; Fach, Patrick

    2017-01-01

    Escherichia coli strains belonging to serogroups O1 and O2 are frequently associated with human infections, especially extra-intestinal infections such as bloodstream infections or urinary tract infections. These strains can be associated with a large array of flagellar antigens. Because of their frequency and clinical importance, a reliable detection of E. coli O1 and O2 strains and also the frequently associated K1 capsule is important for diagnosis and source attribution of E. coli infections in humans and animals. By sequencing the O-antigen clusters of various O1 and O2 strains we showed that the serogroups O1 and O2 are encoded by different sets of O-antigen encoding genes and identified potentially new O-groups. We developed qPCR-assays to detect the various O1 and O2 variants and the K1-encoding gene. These qPCR assays proved to be 100% sensitive and 100% specific and could be valuable tools for the investigations of zoonotic and food-borne infection of humans with O1 and O2 extra-intestinal (ExPEC) or Shiga toxin-producing E. coli (STEC) strains.

  17. The Escherichia coli Serogroup O1 and O2 Lipopolysaccharides Are Encoded by Multiple O-antigen Gene Clusters

    PubMed Central

    Delannoy, Sabine; Beutin, Lothar; Mariani-Kurkdjian, Patricia; Fleiss, Aubin; Bonacorsi, Stéphane; Fach, Patrick

    2017-01-01

    Escherichia coli strains belonging to serogroups O1 and O2 are frequently associated with human infections, especially extra-intestinal infections such as bloodstream infections or urinary tract infections. These strains can be associated with a large array of flagellar antigens. Because of their frequency and clinical importance, a reliable detection of E. coli O1 and O2 strains and also the frequently associated K1 capsule is important for diagnosis and source attribution of E. coli infections in humans and animals. By sequencing the O-antigen clusters of various O1 and O2 strains we showed that the serogroups O1 and O2 are encoded by different sets of O-antigen encoding genes and identified potentially new O-groups. We developed qPCR-assays to detect the various O1 and O2 variants and the K1-encoding gene. These qPCR assays proved to be 100% sensitive and 100% specific and could be valuable tools for the investigations of zoonotic and food-borne infection of humans with O1 and O2 extra-intestinal (ExPEC) or Shiga toxin-producing E. coli (STEC) strains. PMID:28224115

  18. Seven gene deletions in seven days: Fast generation of Escherichia coli strains tolerant to acetate and osmotic stress

    PubMed Central

    Jensen, Sheila I.; Lennen, Rebecca M.; Herrgård, Markus J.; Nielsen, Alex T.

    2015-01-01

    Generation of multiple genomic alterations is currently a time consuming process. Here, a method was established that enables highly efficient and simultaneous deletion of multiple genes in Escherichia coli. A temperature sensitive plasmid containing arabinose inducible lambda Red recombineering genes and a rhamnose inducible flippase recombinase was constructed to facilitate fast marker-free deletions. To further speed up the procedure, we integrated the arabinose inducible lambda Red recombineering genes and the rhamnose inducible FLP into the genome of E. coli K-12 MG1655. This system enables growth at 37 °C, thereby facilitating removal of integrated antibiotic cassettes and deletion of additional genes in the same day. Phosphorothioated primers were demonstrated to enable simultaneous deletions during one round of electroporation. Utilizing these methods, we constructed strains in which four to seven genes were deleted in E. coli W and E. coli K-12. The growth rate of an E. coli K-12 quintuple deletion strain was significantly improved in the presence of high concentrations of acetate and NaCl. In conclusion, we have generated a method that enables efficient and simultaneous deletion of multiple genes in several E. coli variants. The method enables deletion of up to seven genes in as little as seven days. PMID:26643270

  19. Discovery of cancer common and specific driver gene sets

    PubMed Central

    2017-01-01

    Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found. PMID:28168295

  20. The antiSMASH database, a comprehensive database of microbial secondary metabolite biosynthetic gene clusters.

    PubMed

    Blin, Kai; Medema, Marnix H; Kottmann, Renzo; Lee, Sang Yup; Weber, Tilmann

    2017-01-04

    Secondary metabolites produced by microorganisms are the main source of bioactive compounds that are in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far, antiSMASH is limited to de novo computing results for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource to browse antiSMASH-annotated BGCs in the currently 3907 bacterial genomes in the database and perform advanced search queries combining multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Isolation and proteomic analysis of Chlamydomonas centrioles.

    PubMed

    Keller, Lani C; Marshall, Wallace F

    2008-01-01

    Centrioles are barrel-shaped cytoskeletal organelles composed of nine triplet microtubule blades arranged in a pinwheel-shaped array. Centrioles are required for recruitment of pericentriolar material (PCM) during centrosome formation, and they act as basal bodies, which are necessary for the outgrowth of cilia and flagella. Despite being described over a hundred years ago, centrioles are still among the most enigmatic organelles in all of cell biology. To gain molecular insights into the function and assembly of centrioles, we sought to determine the composition of the centriole proteome. Here, we describe a method that allows for the isolation of virtually "naked" centrioles, with little to no obscuring PCM, from the green alga Chlamydomonas. Proteomic analysis of this material provided evidence that multiple human disease gene products are protein components of the centriole, including products of genes involved in Meckel syndrome and Oral-Facial-Digital syndrome. Isolated centrioles can be used in combination with a wide variety of biochemical assays in addition to being utilized as a source for proteomic analysis.

  2. Control of lens development by Lhx2-regulated neuroretinal FGFs

    PubMed Central

    Thein, Thuzar; de Melo, Jimmy; Zibetti, Cristina; Clark, Brian S.; Juarez, Felicia

    2016-01-01

    Fibroblast growth factor (FGF) signaling is an essential regulator of lens epithelial cell proliferation and survival, as well as lens fiber cell differentiation. However, the identities of these FGF factors, their source tissue and the genes that regulate their synthesis are unknown. We have found that Chx10-Cre;Lhx2lox/lox mice, which selectively lack Lhx2 expression in neuroretina from E10.5, showed an early arrest in lens fiber development along with severe microphthalmia. These mutant animals showed reduced expression of multiple neuroretina-expressed FGFs and canonical FGF-regulated genes in neuroretina. When FGF expression was genetically restored in Lhx2-deficient neuroretina of Chx10-Cre;Lhx2lox/lox mice, we observed a partial but nonetheless substantial rescue of the defects in lens cell proliferation, survival and fiber differentiation. These data demonstrate that neuroretinal expression of Lhx2 and neuroretina-derived FGF factors are crucial for lens fiber development in vivo. PMID:27633990

  3. Antimicrobial Use and Antimicrobial Resistance: A Population Perspective

    PubMed Central

    Samore, Matthew H.

    2002-01-01

    The need to stem the growing problem of antimicrobial resistance has prompted multiple, sometimes conflicting, calls for changes in the use of antimicrobial agents. One source of disagreement concerns the major mechanisms by which antibiotics select resistant strains. For infections like tuberculosis, in which resistance can emerge in treated hosts through mutation, prevention of antimicrobial resistance in individual hosts is a primary method of preventing the spread of resistant organisms in the community. By contrast, for many other important resistant pathogens, such as penicillin-resistant Streptococcus pneumoniae, methicillin-resistant Staphylococcus aureus, and vancomycin-resistant Enterococcus faecium, resistance is mediated by the acquisition of genes or gene fragments by horizontal transfer; resistance in the treated host is a relatively rare event. For these organisms, indirect, population-level mechanisms of selection account for the increase in the prevalence of resistance. These mechanisms can operate even when treatment has a modest, or even negative, effect on an individual host’s colonization with resistant organisms. PMID:11971765

  4. Metabolic engineering approaches for production of biochemicals in food and medicinal plants.

    PubMed

    Wilson, Sarah A; Roberts, Susan C

    2014-04-01

    Historically, plants are a vital source of nutrients and pharmaceuticals. Recent advances in metabolic engineering have made it possible to not only increase the concentration of desired compounds, but also introduce novel biosynthetic pathways to a variety of species, allowing for enhanced nutritional or commercial value. To improve metabolic engineering capabilities, new transformation techniques have been developed to allow for gene specific silencing strategies or stacking of multiple genes within the same region of the chromosome. The 'omics' era has provided a new resource for elucidation of uncharacterized biosynthetic pathways, enabling novel metabolic engineering approaches. These resources are now allowing for advanced metabolic engineering of plant production systems, as well as the synthesis of increasingly complex products in engineered microbial hosts. The status of current metabolic engineering efforts is highlighted for the in vitro production of paclitaxel and the in vivo production of β-carotene in Golden Rice and other food crops. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Reconstructing genome evolution in historic samples of the Irish potato famine pathogen

    PubMed Central

    Martin, Michael D.; Cappellini, Enrico; Samaniego, Jose A.; Zepeda, M. Lisandra; Campos, Paula F.; Seguin-Orlando, Andaine; Wales, Nathan; Orlando, Ludovic; Ho, Simon Y. W.; Dietrich, Fred S.; Mieczkowski, Piotr A.; Heitman, Joseph; Willerslev, Eske; Krogh, Anders; Ristaino, Jean B.; Gilbert, M. Thomas P.

    2013-01-01

    Responsible for the Irish potato famine of 1845–49, the oomycete pathogen Phytophthora infestans caused persistent, devastating outbreaks of potato late blight across Europe in the 19th century. Despite continued interest in the history and spread of the pathogen, the genome of the famine-era strain remains entirely unknown. Here we characterize temporal genomic changes in introduced P. infestans. We shotgun sequence five 19th-century European strains from archival herbarium samples—including the oldest known European specimen, collected in 1845 from the first reported source of introduction. We then compare their genomes to those of extant isolates. We report multiple distinct genotypes in historical Europe and a suite of infection-related genes different from modern strains. At virulence-related loci, several now-ubiquitous genotypes were absent from the historical gene pool. At least one of these genotypes encodes a virulent phenotype in modern strains, which helps explain the 20th century’s episodic replacements of European P. infestans lineages. PMID:23863894

  6. Multiple abiotic stimuli are integrated in the regulation of rice gene expression under field conditions

    PubMed Central

    Plessis, Anne; Hafemeister, Christoph; Wilkins, Olivia; Gonzaga, Zennia Jean; Meyer, Rachel Sarah; Pires, Inês; Müller, Christian; Septiningsih, Endang M; Bonneau, Richard; Purugganan, Michael

    2015-01-01

    Plants rely on transcriptional dynamics to respond to multiple climatic fluctuations and contexts in nature. We analyzed the genome-wide gene expression patterns of rice (Oryza sativa) growing in rainfed and irrigated fields during two distinct tropical seasons and determined simple linear models that relate transcriptomic variation to climatic fluctuations. These models combine multiple environmental parameters to account for patterns of expression in the field of co-expressed gene clusters. We examined the similarities of our environmental models between tropical and temperate field conditions, using previously published data. We found that field type and macroclimate had broad impacts on transcriptional responses to environmental fluctuations, especially for genes involved in photosynthesis and development. Nevertheless, variation in solar radiation and temperature at the timescale of hours had reproducible effects across environmental contexts. These results provide a basis for broad-based predictive modeling of plant gene expression in the field. DOI: http://dx.doi.org/10.7554/eLife.08411.001 PMID:26609814
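
    As a rough illustration of the kind of simple linear model the abstract above describes, the following sketch regresses the mean expression of one co-expressed gene cluster on two environmental parameters by ordinary least squares. The function name, the variable names, and the toy numbers are all hypothetical, and the abstract does not give the model form beyond "simple linear models", so this is an assumption rather than the authors' method.

```python
import numpy as np

def fit_environment_model(expression, temperature, radiation):
    """Ordinary least squares fit of
    expression ~ b0 + b1 * temperature + b2 * radiation.
    Returns the coefficient vector [b0, b1, b2]."""
    # Design matrix: intercept column plus one column per environmental parameter.
    X = np.column_stack([np.ones_like(temperature), temperature, radiation])
    coef, *_ = np.linalg.lstsq(X, expression, rcond=None)
    return coef

# Hypothetical toy data: five time points, noise-free so the fit is exact.
temperature = np.array([24.0, 26.0, 28.0, 30.0, 32.0])    # degrees C
radiation = np.array([100.0, 400.0, 250.0, 800.0, 600.0])  # arbitrary units
expression = 1.0 + 0.5 * temperature + 0.01 * radiation

coef = fit_environment_model(expression, temperature, radiation)
```

    With real field data the fit would not be exact, and the choice of which environmental parameters to include per cluster would itself need to be selected and validated.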

  7. DISCO-SCA and Properly Applied GSVD as Swinging Methods to Find Common and Distinctive Processes

    PubMed Central

    Van Deun, Katrijn; Van Mechelen, Iven; Thorrez, Lieven; Schouteden, Martijn; De Moor, Bart; van der Werf, Mariët J.; De Lathauwer, Lieven; Smilde, Age K.; Kiers, Henk A. L.

    2012-01-01

    Background: In systems biology it is common to obtain for the same set of biological entities information from multiple sources. Examples include expression data for the same set of orthologous genes screened in different organisms and data on the same set of culture samples obtained with different high-throughput techniques. A major challenge is to find the important biological processes underlying the data and to disentangle therein processes common to all data sources and processes distinctive for a specific source. Recently, two promising simultaneous data integration methods have been proposed to attain this goal, namely generalized singular value decomposition (GSVD) and simultaneous component analysis with rotation to common and distinctive components (DISCO-SCA). Results: Both theoretical analyses and applications to biologically relevant data show that: (1) straightforward applications of GSVD yield unsatisfactory results, (2) DISCO-SCA performs well, (3) provided proper pre-processing and algorithmic adaptations, GSVD reaches a performance level similar to that of DISCO-SCA, and (4) DISCO-SCA is directly generalizable to more than two data sources. The biological relevance of DISCO-SCA is illustrated with two applications. First, in a setting of comparative genomics, it is shown that DISCO-SCA recovers a common theme of cell cycle progression and a yeast-specific response to pheromones. The biological annotation was obtained by applying Gene Set Enrichment Analysis in an appropriate way. Second, in an application of DISCO-SCA to metabolomics data for Escherichia coli obtained with two different chemical analysis platforms, it is illustrated that the metabolites involved in some of the biological processes underlying the data are detected by one of the two platforms only; therefore, platforms for microbial metabolomics should be tailored to the biological question. Conclusions: Both DISCO-SCA and properly applied GSVD are promising integrative methods for finding common and distinctive processes in multisource data. Open source code for both methods is provided. PMID:22693578
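
    To make the idea of common versus distinctive components concrete, the sketch below performs only the first step of a simultaneous component analysis: it concatenates two data blocks that share their rows, takes a singular value decomposition, and reports how much of each block's sum of squares each component explains. A component that explains variation in every block looks common; one that explains variation in a single block looks distinctive. This is a simplified illustration under stated assumptions, not the published DISCO-SCA algorithm: the rotation toward a common/distinctive target is omitted, the blocks are assumed to be pre-processed already, and the function name and toy data are hypothetical.

```python
import numpy as np

def sca_block_variance(blocks, n_components):
    """Concatenate blocks column-wise, run an SVD, and return an array of
    shape (n_blocks, n_components) holding the proportion of each block's
    sum of squares accounted for by each component."""
    X = np.hstack(blocks)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # shared component scores
    loadings = Vt[:n_components].T                    # one row per variable

    explained = []
    start = 0
    for block in blocks:
        stop = start + block.shape[1]
        block_loadings = loadings[start:stop]
        # Component scores are orthogonal, so the sum of squares of
        # component k within this block is ||t_k||^2 * ||p_k||^2.
        ss = (scores ** 2).sum(axis=0) * (block_loadings ** 2).sum(axis=0)
        explained.append(ss / (block ** 2).sum())
        start = stop
    return np.array(explained)

# Hypothetical toy data: one process shared by both blocks,
# one process present in the second block only.
t_common = np.array([1.0, -1.0, 1.0, -1.0])
t_distinct = np.array([1.0, 1.0, -1.0, -1.0])
block1 = np.outer(t_common, [1.0, 1.0])
block2 = np.outer(t_common, [1.0, 1.0]) + np.outer(t_distinct, [1.0, -1.0])

explained = sca_block_variance([block1, block2], n_components=2)
```

    In this toy case the first component accounts for variation in both blocks and the second only for variation in the second block, which is the pattern a common and a distinctive component would show.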

  8. Busulfan, Fludarabine, Donor Stem Cell Transplant, and Cyclophosphamide in Treating Patients With Multiple Myeloma or Myelofibrosis

    ClinicalTrials.gov

    2018-01-31

    Anemia; ASXL1 Gene Mutation; EZH2 Gene Mutation; IDH1 Gene Mutation; IDH2 Gene Mutation; Plasma Cell Myeloma; Primary Myelofibrosis; Recurrent Plasma Cell Myeloma; Secondary Myelofibrosis; Thrombocytopenia

  9. Allen Brain Atlas-Driven Visualizations: a web-based gene expression energy visualization tool.

    PubMed

    Zaldivar, Andrew; Krichmar, Jeffrey L

    2014-01-01

    The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers its own search engine and software for researchers to view its growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of its tools limit the number of genes and brain structures researchers can view at once. To complement this work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data set. With this web application, researchers can immediately obtain and survey large amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.

  10. EUGENE'HOM: A generic similarity-based gene finder using multiple homologous sequences.

    PubMed

    Foissac, Sylvain; Bardou, Philippe; Moisan, Annick; Cros, Marie-Josée; Schiex, Thomas

    2003-07-01

    EUGENE'HOM is a gene prediction software for eukaryotic organisms based on comparative analysis. EUGENE'HOM is able to take into account multiple homologous sequences from more or less closely related organisms. It integrates the results of TBLASTX analysis, splice site and start codon prediction and a robust coding/non-coding probabilistic model which allows EUGENE'HOM to handle sequences from a variety of organisms. The current target of EUGENE'HOM is plant sequences. The EUGENE'HOM web site is available at http://genopole.toulouse.inra.fr/bioinfo/eugene/EuGeneHom/cgi-bin/EuGeneHom.pl.

  11. Novel microbiological and spatial statistical methods to improve strength of epidemiological evidence in a community-wide waterborne outbreak.

    PubMed

    Jalava, Katri; Rintala, Hanna; Ollgren, Jukka; Maunula, Leena; Gomez-Alvarez, Vicente; Revez, Joana; Palander, Marja; Antikainen, Jenni; Kauppinen, Ari; Räsänen, Pia; Siponen, Sallamaari; Nyholm, Outi; Kyyhkynen, Aino; Hakkarainen, Sirpa; Merentie, Juhani; Pärnänen, Martti; Loginov, Raisa; Ryu, Hodon; Kuusi, Markku; Siitonen, Anja; Miettinen, Ilkka; Santo Domingo, Jorge W; Hänninen, Marja-Liisa; Pitkänen, Tarja

    2014-01-01

    Failures in the drinking water distribution system cause gastrointestinal outbreaks with multiple pathogens. A water distribution pipe breakage caused a community-wide waterborne outbreak in Vuorela, Finland, in July 2012. We investigated this outbreak with advanced epidemiological and microbiological methods. A total of 473/2931 inhabitants (16%) responded to a web-based questionnaire. Water and patient samples were subjected to analysis of multiple microbial targets, molecular typing and microbial community analysis. Spatial analysis of the water distribution network was done and we applied a spatial logistic regression model. The course of the illness was mild. Drinking untreated tap water from the defined outbreak area was significantly associated with illness (RR 5.6, 95% CI 1.9-16.4), increasing in a dose-response manner. The closer a person lived to the water distribution breakage point, the higher the risk of becoming ill. Sapovirus, enterovirus, single Campylobacter jejuni and EHEC O157:H7 findings as well as virulence genes for EPEC, EAEC and EHEC pathogroups were detected by molecular or culture methods from the faecal samples of the patients. EPEC, EAEC and EHEC virulence genes and faecal indicator bacteria were also detected in water samples. Microbial community sequencing of contaminated tap water revealed abundance of Arcobacter species. The polyphasic approach improved the understanding of the source of the infections, and helped to define the extent and magnitude of this outbreak.
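
    The abstract above reports a risk ratio with a 95% confidence interval. The sketch below shows one common way such a figure can be computed from a 2x2 exposure table, using a Wald interval on the log scale. The counts are hypothetical, since the abstract does not give the underlying table, and the published estimate may well have come from a regression model rather than this unadjusted calculation.

```python
import math

def risk_ratio(exposed_ill, exposed_total, unexposed_ill, unexposed_total, z=1.96):
    """Unadjusted risk ratio with a Wald confidence interval on the log scale.
    Returns (rr, lower, upper)."""
    risk_exposed = exposed_ill / exposed_total
    risk_unexposed = unexposed_ill / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / exposed_ill - 1 / exposed_total
        + 1 / unexposed_ill - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts: 20 of 100 exposed people ill, 10 of 100 unexposed people ill.
rr, lower, upper = risk_ratio(20, 100, 10, 100)
```

    An interval that excludes 1, as the one reported in the abstract does, indicates an association that is statistically significant at the corresponding level.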

  12. Ontological function annotation of long non-coding RNAs through hierarchical multi-label classification.

    PubMed

    Zhang, Jingpu; Zhang, Zuping; Wang, Zixiang; Liu, Yuting; Deng, Lei

    2018-05-15

    Long non-coding RNAs (lncRNAs) are an enormous collection of functional non-coding RNAs. Over the past decades, a large number of novel lncRNA genes have been identified. However, most lncRNAs remain functionally uncharacterized at present. Computational approaches provide new insight into the potential functional implications of lncRNAs. Considering that each lncRNA may have multiple functions and a function may be further specialized into sub-functions, here we describe NeuraNetL2GO, a computational ontological function prediction approach for lncRNAs using a hierarchical multi-label classification strategy based on multiple neural networks. The neural networks are incrementally trained level by level, each performing the prediction of gene ontology (GO) terms belonging to a given level. In NeuraNetL2GO, we use topological features of the lncRNA similarity network as the input of the neural networks and employ the output results to annotate the lncRNAs. We show that NeuraNetL2GO achieves the best performance and the overall advantage in maximum F-measure and coverage on the manually annotated lncRNA2GO-55 dataset compared to other state-of-the-art methods. The source code and data are available at http://denglab.org/NeuraNetL2GO/. Contact: leideng@csu.edu.cn. Supplementary data are available at Bioinformatics online.

  13. ePlant: Visualizing and Exploring Multiple Levels of Data for Hypothesis Generation in Plant Biology.

    PubMed

    Waese, Jamie; Fan, Jim; Pasha, Asher; Yu, Hans; Fucile, Geoffrey; Shi, Ruian; Cumming, Matthew; Kelley, Lawrence A; Sternberg, Michael J; Krishnakumar, Vivek; Ferlanti, Erik; Miller, Jason; Town, Chris; Stuerzlinger, Wolfgang; Provart, Nicholas J

    2017-08-01

    A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust research platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring multiple levels of Arabidopsis thaliana data through a zoomable user interface. ePlant connects to several publicly available web services to download genome, proteome, interactome, transcriptome, and 3D molecular structure data for one or more genes or gene products of interest. Data are displayed with a set of visualization tools that are presented using a conceptual hierarchy from big to small, and many of the tools combine information from more than one data type. We describe the development of ePlant in this article and present several examples illustrating its integrative features for hypothesis generation. We also describe the process of deploying ePlant as an "app" on Araport. Building on readily available web services, the code for ePlant is freely available for any other biological species research. © 2017 American Society of Plant Biologists. All rights reserved.

  14. Meta-Analysis of Multiple Sclerosis Microarray Data Reveals Dysregulation in RNA Splicing Regulatory Genes.

    PubMed

    Paraboschi, Elvezia Maria; Cardamone, Giulia; Rimoldi, Valeria; Gemmati, Donato; Spreafico, Marta; Duga, Stefano; Soldà, Giulia; Asselta, Rosanna

    2015-09-30

    Abnormalities in RNA metabolism and alternative splicing (AS) are emerging as important players in complex disease phenotypes. In particular, accumulating evidence suggests the existence of pathogenic links between multiple sclerosis (MS) and altered AS, including functional studies showing that an imbalance in alternatively-spliced isoforms may contribute to disease etiology. Here, we tested whether the altered expression of AS-related genes represents a MS-specific signature. A comprehensive comparative analysis of gene expression profiles of publicly-available microarray datasets (190 MS cases, 182 controls), followed by gene-ontology enrichment analysis, highlighted a significant enrichment for differentially-expressed genes involved in RNA metabolism/AS. In detail, a total of 17 genes were found to be differentially expressed in MS in multiple datasets, with CELF1 being dysregulated in five out of seven studies. We confirmed CELF1 downregulation in MS (p=0.0015) by real-time RT-PCRs on RNA extracted from blood cells of 30 cases and 30 controls. As a proof of concept, we experimentally verified the unbalance in alternatively-spliced isoforms in MS of the NFAT5 gene, a putative CELF1 target. In conclusion, for the first time we provide evidence of a consistent dysregulation of splicing-related genes in MS and we discuss its possible implications in modulating specific AS events in MS susceptibility genes.

  15. A Protocol for Multiple Gene Knockout in Mouse Small Intestinal Organoids Using a CRISPR-concatemer.

    PubMed

    Merenda, Alessandra; Andersson-Rolf, Amanda; Mustata, Roxana C; Li, Taibo; Kim, Hyunki; Koo, Bon-Kyoung

    2017-07-12

    CRISPR/Cas9 technology has greatly improved the feasibility and speed of loss-of-function studies that are essential in understanding gene function. In higher eukaryotes, paralogous genes can mask a potential phenotype by compensating the loss of a gene, thus limiting the information that can be obtained from genetic studies relying on single gene knockouts. We have developed a novel, rapid cloning method for guide RNA (gRNA) concatemers in order to create multi-gene knockouts following a single round of transfection in mouse small intestinal organoids. Our strategy allows for the concatemerization of up to four individual gRNAs into a single vector by performing a single Golden Gate shuffling reaction with annealed gRNA oligos and a pre-designed retroviral vector. This allows either the simultaneous knockout of up to four different genes, or increased knockout efficiency following the targeting of one gene by multiple gRNAs. In this protocol, we show in detail how to efficiently clone multiple gRNAs into the retroviral CRISPR-concatemer vector and how to achieve highly efficient electroporation in intestinal organoids. As an example, we show that simultaneous knockout of two pairs of genes encoding negative regulators of the Wnt signaling pathway (Axin1/2 and Rnf43/Znrf3) renders intestinal organoids resistant to the withdrawal of key growth factors.

  16. Hippo-YAP signaling pathway: A new paradigm for cancer therapy.

    PubMed

    Ma, Yanlei; Yang, Yongzhi; Wang, Feng; Wei, Qing; Qin, Huanlong

    2015-11-15

    In the past decades, the Hippo signaling pathway has been delineated and shown to play multiple roles in the control of organ size in both Drosophila and mammals. In mammals, the Hippo pathway is a kinase cascade leading from Mst1/2 to YAP and its paralog TAZ. Several studies have demonstrated that YAP/TAZ is a candidate oncogene and that other members of the Hippo pathway are tumor suppressive genes. The dysregulation of the Hippo pathway has been observed in a variety of cancers. This review chronicles the recent progress in elucidating the function of Hippo signaling in tumorigenesis and provides a rich source of potential targets for cancer therapy. © 2014 UICC.

  17. Micropropagation and in vitro conservation of vanilla (Vanilla planifolia Andrews).

    PubMed

    Divakaran, Minoo; Babu, K Nirmal

    2009-01-01

    Vanilla (Vanilla planifolia Andrews; syn. V. fragrans Salisb.), a source of natural vanillin, plays a major positive role in the economy of several countries. Native to Central America, its primary gene pool is threatened by deforestation and overcollection, which have resulted in the disappearance of natural habitats and wild species. Therefore, multiplication and conservation of vanilla diversity is of paramount importance because of its narrow genetic base. Micropropagation plays an important role in the production of disease-free planting material for commercial cultivation. Simple protocols for micropropagation, in vitro conservation and synthetic seed production are described in this chapter, which could further be applied to other related vanilla species as well.

  18. Ignition probability of polymer-bonded explosives accounting for multiple sources of material stochasticity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, S.; Barua, A.; Zhou, M., E-mail: min.zhou@me.gatech.edu

    2014-05-07

    Accounting for the combined effect of multiple sources of stochasticity in material attributes, we develop an approach that computationally predicts the probability of ignition of polymer-bonded explosives (PBXs) under impact loading. The probabilistic nature of the specific ignition processes is assumed to arise from two sources of stochasticity. The first source involves random variations in material microstructural morphology; the second source involves random fluctuations in grain-binder interfacial bonding strength. The effect of the first source of stochasticity is analyzed with multiple sets of statistically similar microstructures and constant interfacial bonding strength. Subsequently, each of the microstructures in the multiple sets is assigned multiple instantiations of randomly varying grain-binder interfacial strengths to analyze the effect of the second source of stochasticity. Critical hotspot size-temperature states reaching the threshold for ignition are calculated through finite element simulations that explicitly account for microstructure and bulk and interfacial dissipation to quantify the time to criticality (t_c) of individual samples, allowing the probability distribution of the time to criticality that results from each source of stochastic variation for a material to be analyzed. Two probability superposition models are considered to combine the effects of the multiple sources of stochasticity. The first is a parallel and series combination model, and the second is a nested probability function model. Results show that the nested Weibull distribution provides an accurate description of the combined ignition probability. The approach developed here represents a general framework for analyzing the stochasticity in the material behavior that arises out of multiple types of uncertainty associated with the structure, design, synthesis and processing of materials.

  19. Regulation of Nitrogen Metabolism by GATA Zinc Finger Transcription Factors in Yarrowia lipolytica

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pomraning, Kyle R.; Bredeweg, Erin L.; Baker, Scott E.

    ABSTRACT Fungi accumulate lipids in a manner dependent on the quantity and quality of the nitrogen source on which they are growing. In the oleaginous yeast Yarrowia lipolytica, growth on a complex source of nitrogen enables rapid growth and limited accumulation of neutral lipids, while growth on a simple nitrogen source promotes lipid accumulation in large lipid droplets. Here we examined the roles of nitrogen catabolite repression and its regulation by GATA zinc finger transcription factors on lipid metabolism in Y. lipolytica. Deletion of the GATA transcription factor genes gzf3 and gzf2 resulted in nitrogen source-specific growth defects and greater accumulation of lipids when the cells were growing on a simple nitrogen source. Deletion of gzf1, which is most similar to activators of genes repressed by nitrogen catabolite repression in filamentous ascomycetes, did not affect growth on the nitrogen sources tested. We examined gene expression of wild-type and GATA transcription factor mutants on simple and complex nitrogen sources and found that expression of enzymes involved in malate metabolism, beta-oxidation, and ammonia utilization are strongly upregulated on a simple nitrogen source. Deletion of gzf3 results in overexpression of genes with GATAA sites in their promoters, suggesting that it acts as a repressor, while gzf2 is required for expression of ammonia utilization genes but does not grossly affect the transcription level of genes predicted to be controlled by nitrogen catabolite repression. Both GATA transcription factor mutants exhibit decreased expression of genes controlled by carbon catabolite repression via the repressor mig1, including genes for beta-oxidation, highlighting the complex interplay between regulation of carbon, nitrogen, and lipid metabolism. IMPORTANCE Nitrogen source is commonly used to control lipid production in industrial fungi. Here we identified regulators of nitrogen catabolite repression in the oleaginous yeast Y. lipolytica to determine how the nitrogen source regulates lipid metabolism. We show that disruption of both activators and repressors of nitrogen catabolite repression leads to increased lipid accumulation via activation of carbon catabolite repression through an as yet uncharacterized method.

  20. Yeast Interspecies Comparative Proteomics Reveals Divergence in Expression Profiles and Provides Insights into Proteome Resource Allocation and Evolutionary Roles of Gene Duplication

    PubMed Central

    Kito, Keiji; Ito, Haruka; Nohara, Takehiro; Ohnishi, Mihoko; Ishibashi, Yuko; Takeda, Daisuke

    2016-01-01

    Omics analysis is a versatile approach for understanding the conservation and diversity of molecular systems across multiple taxa. In this study, we compared the proteome expression profiles of four yeast species (Saccharomyces cerevisiae, Saccharomyces mikatae, Kluyveromyces waltii, and Kluyveromyces lactis) grown on glucose- or glycerol-containing media. Conserved expression changes across all species were observed only for a small proportion of all proteins differentially expressed between the two growth conditions. Two Kluyveromyces species, both of which exhibited a high growth rate on glycerol, a nonfermentative carbon source, showed distinct species-specific expression profiles. In K. waltii grown on glycerol, proteins involved in the glyoxylate cycle and gluconeogenesis were expressed in high abundance. In K. lactis grown on glycerol, the expression of glycolytic and ethanol metabolic enzymes was unexpectedly low, whereas proteins involved in cytoplasmic translation, including ribosomal proteins and elongation factors, were highly expressed. These marked differences in the types of predominantly expressed proteins suggest that K. lactis optimizes the balance of proteome resource allocation between metabolism and protein synthesis, giving priority to cellular growth. In S. cerevisiae, about 450 duplicate gene pairs were retained after whole-genome duplication. Intriguingly, we found that in the case of duplicates with conserved sequences, the total abundance of proteins encoded by a duplicate pair in S. cerevisiae was similar to that of the protein encoded by the nonduplicated ortholog in Kluyveromyces yeast. Given the frequency of haploinsufficiency, this observation suggests that conserved duplicate genes, even though they are minor cases of retained duplicates, do not exhibit a dosage effect in yeast, except for ribosomal proteins. Thus, comparative proteomic analyses across multiple species may not only reveal species-specific characteristics of metabolic processes under nonoptimal culture conditions but also provide valuable insights into intriguing biological principles, including the balance of proteome resource allocation and the role of gene duplication in evolutionary history. PMID:26560065
