Science.gov

Sample records for global gene mining

  1. Global gene mining and the pharmaceutical industry

    SciTech Connect

    Knudsen, Lisbeth E.

    2005-09-01

    Worldwide efforts are ongoing in optimizing medical treatment by searching for the right medicine at the right dose for the individual. Metabolism is regulated by polymorphisms, which may be tested by relatively simple SNP analysis, however requiring DNA from the test individuals. Target genes for the efficiency of a given medicine or predisposition of a given disease are also subject to population studies, e.g., in Iceland, Estonia, Sweden, etc. For hypothesis testing and generation, several bio-banks with samples from patients and healthy persons within the pharmaceutical industry have been established during the past 10 years. Thus, more than 100,000 samples are stored in the freezers of either the pharmaceutical companies or their contractual partners at universities and test institutions. Ethical issues related to data protection of the individuals providing samples to bio-banks are several: nature and extent of information prior to consent, coverage of the consent given by the study person, labeling and storage of the sample and data (coded or anonymized). In general, genetic test data, once obtained, are permanent and cannot be changed. The test data may imply information that is not beneficial to the patient and his/her family (e.g., employment opportunities, insurance, etc.). Furthermore, there may be a long latency between the analysis of the genetic test and the clinical expression of the disease and wide differences in the disease patterns. Consequently, information about some genetic test data may stigmatize patients leading to poor quality of life. This has raised the issue of 'genetic exceptionalism' justifying specific regulation of use of genetic information. 
    Discussions on how to handle sampling and data are ongoing within the industry and the regulatory sphere, the European Agency for the Evaluation of Medicinal Products (EMEA) having issued a position paper, the Council for International Organizations of Medical Sciences (CIOMS) having a working

  2. Coal mine methane global review

    SciTech Connect

    2008-07-01

    This is the second edition of the Coal Mine Methane Global Overview, updated in the summer of 2008. This document contains individual, comprehensive profiles that characterize the coal and coal mine methane sectors of 33 countries - 22 Methane to Markets partners and an additional 11 coal-producing nations. The executive summary provides summary tables that include statistics on coal reserves, coal production, methane emissions, and CMM project activity. An International Coal Mine Methane Projects Database accompanies this overview. It contains more detailed and comprehensive information on over two hundred CMM recovery and utilization projects around the world. Project information in the database is updated regularly. This document will be updated annually. Suggestions for updates and revisions can be submitted to the Administrative Support Group and will be incorporated into the document as appropriate.

  3. Mining biological databases for candidate disease genes

    NASA Astrophysics Data System (ADS)

    Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.

    2001-07-01

    The publicly funded effort to determine the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has so far produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary 'draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of these data identified approximately 30,000 - 40,000 human 'genes.' However, the bulk of the effort still remains -- to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may contain relevant data pertaining to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequences and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).

  4. Improving mine safety technology and training: establishing US global leadership

    SciTech Connect

    2006-12-15

    In 2006, the USA's record of mine safety was interrupted by fatalities that rocked the industry and caused the National Mining Association and its members to recommit to returning the US underground coal mining industry to a global mine safety leadership role. This report details a comprehensive approach to increase the odds of survival for miners in emergency situations and to create a culture of prevention of accidents. Among its 75 recommendations are a need to improve communications, mine rescue training, and escape and protection of miners. Section headings of the report are: Introduction; Review of mine emergency situations in the past 25 years: identifying and addressing the issues and complexities; Risk-based design and management; Communications technology; Escape and protection strategies; Emergency response and mine rescue procedures; Training for preparedness; Summary of recommendations; and Conclusions. 37 refs., 3 figs., 5 apps.

  5. Text Mining in Cancer Gene and Pathway Prioritization

    PubMed Central

    Luo, Yuan; Riedlinger, Gregory; Szolovits, Peter

    2014-01-01

    Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes. PMID:25392685

  6. Documenting the global impacts of beach sand mining

    NASA Astrophysics Data System (ADS)

    Young, R.; Griffith, A.

    2009-04-01

    For centuries, beach sand has been mined for use as aggregate in concrete, for heavy minerals, and for construction fill. The global extent and impact of this phenomenon has gone relatively unnoticed by academics, NGOs, and major news sources. Most reports of sand mining activities are found at the very local scale (if the mining is ever documented at all). Yet, sand mining in many localities has resulted in the complete destruction of beach (and related) ecosystems along with severe impacts to coastal protection and tourism. The Program for the Study of Developed Shorelines at Western Carolina University and Beachcare.org have initiated the construction of a global database of beach sand mining activities. The database is being built through a combination of site visits and through the data mining of media resources, peer reviewed papers, and reports from private and governmental entities. Currently, we have documented sand mining in 35 countries on 6 continents representing the removal of millions of cubic meters of sand. Problems extend from Asia where critical infrastructure has been disrupted by sand mining to the Caribbean where policy reform has swiftly followed a highly publicized theft of sand. The Program for the Study of Developed Shorelines recently observed extensive sand mining in Morocco at the regional scale. Tens of kilometers of beach have been stripped of sand and the mining continues southward reducing hope of a thriving tourism-based economy. Problems caused by beach sand mining include: destruction of natural beaches and the ecosystems they protect (e.g. dunes, wetlands), habitat loss for globally important species (e.g. turtles, shorebirds), destruction of nearshore marine ecosystems, increased shoreline erosion rates, reduced protection from storms, tsunamis, and wave events, and economic losses through tourist abandonment and loss of coastal aesthetics. The threats posed by sand mining are made even more critical given the prospect of a

  7. Study of global operational needs for mine clearance equipment

    NASA Astrophysics Data System (ADS)

    Blagden, Paddy M.

    2003-09-01

    The Geneva International Centre for Humanitarian Demining studied the needs of landmine clearance groups for equipment to carry out specific functions of mine clearance. This was done on a global level, and useful results were obtained, which will provide the basis for further analysis.

  8. Mining the genome for lipid genes.

    PubMed

    Kuivenhoven, Jan Albert; Hegele, Robert A

    2014-10-01

    Mining of the genome for lipid genes has since the early 1970s helped to shape our understanding of how triglycerides are packaged (in chylomicrons), repackaged (in very low density lipoproteins; VLDL), and hydrolyzed, and also how remnant and low-density lipoproteins (LDL) are cleared from the circulation. Gene discoveries have also provided insights into high-density lipoprotein (HDL) biogenesis and remodeling. Interestingly, at least half of these key molecular genetic studies were initiated with the benefit of prior knowledge of relevant proteins. In addition, multiple important findings originated from studies in mouse, and from other types of non-genetic approaches. Although it appears by now that the main lipid pathways have been uncovered, and that only modulators or adaptor proteins such as those encoded by LDLRAP1, APOA5, ANGPTL3/4, and PCSK9 are currently being discovered, genome wide association studies (GWAS) in particular have implicated many new loci based on statistical analyses; these may prove to have equally large impacts on lipoprotein traits as gene products that are already known. On the other hand, since 2004 - and particularly since 2010 when massively parallel sequencing has become de rigueur - no major new insights into genes governing lipid metabolism have been reported. This is probably because the etiologies of true Mendelian lipid disorders with overt clinical complications have been largely resolved. In the meantime, it has become clear that proving the importance of new candidate genes is challenging. This could be due to very low frequencies of large impact variants in the population. It must further be emphasized that functional genetic studies, while necessary, are often difficult to accomplish, making it hazardous to upgrade a variant that is simply associated to being definitively causative.
    Also, it is clear that applying a monogenic approach to dissect complex lipid traits that are mostly of polygenic origin is the wrong way to

  9. Gene prioritization and clustering by multi-view text mining

    PubMed Central

    2010-01-01

    Background Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. Results We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than other comparing methods. Conclusions In practical research, the relevance of specific vocabulary pertaining to the task is usually unknown. In such case, multi-view text mining is a superior and promising strategy for text-based disease gene identification. PMID:20074336

  10. An improved Pearson's correlation proximity-based hierarchical clustering for mining biological association between genes.

    PubMed

    Booma, P M; Prabhakaran, S; Dhanalakshmi, R

    2014-01-01

    Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for a standard biological process measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process was not efficiently addressed. To monitor a higher rate of expression levels between genes, a hierarchical clustering model was proposed, where the biological association between genes is measured simultaneously using a proximity measure based on an improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern-growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality. PMID:25136661
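
    The abstract above describes hierarchical clustering of genes with a Pearson-correlation proximity measure and average linkage. The following is a minimal sketch of that general idea only: it uses the plain Pearson coefficient (not the paper's "improved" variant), does not implement the Seed Augment algorithm, and the gene names and expression values are hypothetical toy data.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated (an assumption)
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with distance 1 - r and average linkage."""
    names = list(profiles)
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]
    while len(clusters) > max(n_clusters, 1):
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist[p] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical expression profiles (four conditions per gene).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.2, 7.9],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}
result = average_linkage(expression, 2)
```

    Using 1 - r as the distance groups genes whose profiles rise and fall together, regardless of absolute expression level; anti-correlated genes end up far apart.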

  11. Integrative Data Mining Highlights Candidate Genes for Monogenic Myopathies

    PubMed Central

    Neto, Osorio Abath; Tassy, Olivier; Biancalana, Valérie; Zanoteli, Edmar; Pourquié, Olivier; Laporte, Jocelyn

    2014-01-01

    Inherited myopathies are a heterogeneous group of disabling disorders with still barely understood pathological mechanisms. Around 40% of afflicted patients remain without a molecular diagnosis after exclusion of known genes. The advent of high-throughput sequencing has opened avenues to the discovery of new implicated genes, but a working list of prioritized candidate genes is necessary to deal with the complexity of analyzing large-scale sequencing data. Here we used an integrative data mining strategy to analyze the genetic network linked to myopathies, derive specific signatures for inherited myopathy and related disorders, and identify and rank candidate genes for these groups. Training sets of genes were selected after literature review and used in Manteia, a public web-based data mining system, to extract disease group signatures in the form of enriched descriptor terms, which include functional annotation, human and mouse phenotypes, as well as biological pathways and protein interactions. These specific signatures were then used as an input to mine and rank candidate genes, followed by filtration against skeletal muscle expression and association with known diseases. Signatures and identified candidate genes highlight both potential common pathological mechanisms and allelic disease groups. Recent discoveries of gene associations to diseases, like B3GALNT2, GMPPB and B3GNT1 to congenital muscular dystrophies, were prioritized in the ranked lists, suggesting a posteriori validation of our approach and predictions. We show an example of how the ranked lists can be used to help analyze high-throughput sequencing data to identify candidate genes, and highlight the best candidate genes matching genomic regions linked to myopathies without known causative genes. This strategy can be automatized to generate fresh candidate gene lists, which help cope with database annotation updates as new knowledge is incorporated. PMID:25353622

  12. High precision global positioning system for mining applications

    SciTech Connect

    O'Grady, M.

    1997-12-01

    The author discusses today's satellite technology, which has led to the development of a system that will increase safety and production in surface mining. The Department of Defense maintains a satellite system made up of 24 NavStar satellites that allow the use of their frequencies to position equipment anywhere on Earth. The previous satellite system was called the Transit system or Sat-Nav. It consisted of a small number of low-orbit satellites, and ground-based receivers needed three days of logged data to compute sub-meter accuracy positions. With the NavStar network of satellites, centimeter accuracy can be achieved within just a few minutes. Traditional surveying methods in the mining industry are being replaced by the Global Positioning System. It has proven to be more accurate and, after the typical learning curve required by any new system, will lead to higher productivity; hence, financial rewards are in the immediate future.

  13. Gene Prioritization of Resistant Rice Gene against Xanthomas oryzae pv. oryzae by Using Text Mining Technologies

    PubMed Central

    Xia, Jingbo; Zhang, Xing; Yuan, Daojun; Chen, Lingling; Webster, Jonathan; Fang, Alex Chengyu

    2013-01-01

    To effectively assess the possibility of the unknown rice protein resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed to enhance gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency inverse document frequency is used to measure the importance of distinguished terms which reflect biomedical activity in rice before candidate genes are screened and vital terms are produced. Afterwards, a built-in classifier under the chaos games representation algorithm is used to sieve the best possible candidate gene. Our experiment results show that the combination of these two methods achieves enhanced gene prioritization. PMID:24371834
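
    The term frequency-inverse document frequency weighting mentioned above can be sketched as follows. This is an illustrative, assumption-laden example: it uses the common tf * log(N / df) form with whitespace tokenization, which may differ from the variant used by the authors, and the three short "documents" are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus standing in for rice-related abstracts.
docs = [
    "rice gene confers resistance to blight",
    "rice gene regulates flowering time",
    "rice yield depends on flowering time",
]
scores = tf_idf(docs)
```

    A term that occurs in every document (here "rice") receives weight zero, while a term confined to one document (here "resistance") is weighted most heavily, which is what makes the measure useful for picking out distinguishing terms.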

  14. Systematic Association of Genes to Phenotypes by Genome and Literature Mining

    PubMed Central

    2005-01-01

    One of the major challenges of functional genomics is to unravel the connection between genotype and phenotype. So far no global analysis has attempted to explore those connections in the light of the large phenotypic variability seen in nature. Here, we use an unsupervised, systematic approach for associating genes and phenotypic characteristics that combines literature mining with comparative genome analysis. We first mine the MEDLINE literature database for terms that reflect phenotypic similarities of species. Subsequently we predict the likely genomic determinants: genes specifically present in the respective genomes. In a global analysis involving 92 prokaryotic genomes we retrieve 323 clusters containing a total of 2,700 significant gene–phenotype associations. Some clusters contain mostly known relationships, such as genes involved in motility or plant degradation, often with additional hypothetical proteins associated with those phenotypes. Other clusters comprise unexpected associations; for example, a group of terms related to food and spoilage is linked to genes predicted to be involved in bacterial food poisoning. Among the clusters, we observe an enrichment of pathogenicity-related associations, suggesting that the approach reveals many novel genes likely to play a role in infectious diseases. PMID:15799710

  15. OntoGene web services for biomedical text mining.

    PubMed

    Rinaldi, Fabio; Clematide, Simon; Marques, Hernani; Ellendorff, Tilia; Romacker, Martin; Rodriguez-Esteban, Raul

    2014-01-01

    Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an approach towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them. PMID:25472638

  16. OntoGene web services for biomedical text mining

    PubMed Central

    2014-01-01

    Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an approach towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them. PMID:25472638

  17. Computer aided gene mining for gingerol biosynthesis.

    PubMed

    James, Priyanka; Baby, Bincy; Charles, SonaSona; Nair, Lekshmysree Saraschandran; Nazeem, Puthiyaveetil Abdulla

    2015-01-01

    In spite of the large body of genomic data obtained from the transcriptome of Zingiber officinale, very few studies have focused on the identification and characterization of miRNAs in gingerol biosynthesis. The Zingiber officinale transcriptome was analyzed using an EST dataset (38169 total) deposited in the public domain at NCBI. In this paper, computational functional annotation of the available ESTs and identification of genes which play a significant role in gingerol biosynthesis are described. ESTs were clustered and assembled, resulting in 8624 contigs and 8821 singletons. The assembled dataset was then submitted to the EST functional annotation workflow, including BLAST, Gene Ontology (GO) analysis, and pathway enrichment by Kyoto Encyclopedia of Genes and Genomes (KEGG) and InterProScan. The unigene datasets were further exploited to identify simple sequence repeats that enable linkage mapping. A total of 409 simple sequence repeats were identified from the contigs. Furthermore, we examined the existence of novel miRNAs from the ESTs in rhizome, root and leaf tissues. EST analysis revealed the presence of a single hypothetical miRNA in rhizome tissue. The hypothetical miRNA is expected to play an important role in controlling genes involved in gingerol biosynthesis and hence demands experimental validation. The assembly and associated information of transcriptome data provide a comprehensive functional and evolutionary characterization of the genomics of Zingiber officinale. As an effort to make the genomic and transcriptomic data widely available to the public domain, the results were integrated into a web-based Ginger EST database which is freely accessible at http://www.kaubic.in/gingerest/. PMID:26229293
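
    The identification of simple sequence repeats (SSRs) in assembled contigs, mentioned above, can be illustrated with a small regular-expression scan. This is a sketch under stated assumptions, not the tool used in the study: the motif lengths and minimum repeat counts are hypothetical defaults, and the example contig is invented.

```python
import re

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )

def find_ssrs(sequence, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    min_repeats maps motif length to the minimum number of copies; the
    default thresholds are assumptions chosen for illustration.
    """
    min_repeats = min_repeats or {2: 6, 3: 5, 4: 4}
    sequence = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_count - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. "AA" reported as a dinucleotide
                count = len(match.group(0)) // unit_len
                hits.append((match.start(), motif, count))
    return sorted(hits)

# Invented contig containing an (AG)7 and a (CTT)5 repeat.
contig = "GGCT" + "AG" * 7 + "TTCA" + "CTT" * 5 + "GA"
ssrs = find_ssrs(contig)
```

    Filtering out non-primitive motifs keeps a homopolymer run from being reported as a di-, tri-, or tetranucleotide repeat.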

  18. Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

    PubMed Central

    Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417

  19. ESTIMATE OF GLOBAL METHANE EMISSIONS FROM COAL MINES

    EPA Science Inventory

    Country-specific emissions of methane (CH4) from underground coal mines, surface coal mines, and coal crushing and transport operations are estimated for 1989. Emissions for individual countries are estimated by using two sets of regression equations (R2 values range from 0.56 to...

  20. Global coordination in adaptation to gene rewiring

    PubMed Central

    Murakami, Yoshie; Matsumoto, Yuki; Tsuru, Saburo; Ying, Bei-Wen; Yomo, Tetsuya

    2015-01-01

    Gene rewiring is a common evolutionary phenomenon in nature that may lead to extinction for living organisms. Recent studies on synthetic biology demonstrate that cells can survive genetic rewiring. This survival (adaptation) is often linked to the stochastic expression of rewired genes with random transcriptional changes. However, the probability of adaptation and the underlying common principles are not clear. We performed a systematic survey of an assortment of gene-rewired Escherichia coli strains to address these questions. Three different cell fates, designated good survivors, poor survivors and failures, were observed when the strains starved. Large fluctuations in the expression of the rewired gene were commonly observed with increasing cell size, but these changes were insufficient for adaptation. Cooperative reorganizations in the corresponding operon and genome-wide gene expression largely contributed to the final success. Transcriptome reorganizations that generally showed high-dimensional dynamic changes were restricted within a one-dimensional trajectory for adaptation to gene rewiring, indicating a general path directed toward cellular plasticity for a successful cell fate. This finding of global coordination supports a mechanism of stochastic adaptation and provides novel insights into the design and application of complex genetic or metabolic networks. PMID:25564530

  1. Beegle: from literature mining to disease-gene discovery.

    PubMed

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/. PMID:26384564
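
    The recall figure quoted above (the fraction of known disease genes that appear among the top-ranked results) can be computed as in the following sketch. The function names and the gene lists are hypothetical and are not taken from Beegle's evaluation setup.

```python
def recall_at_k(ranked_genes, known_genes, k):
    """Fraction of known genes that appear among the first k ranked genes."""
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    found = known.intersection(ranked_genes[:k])
    return len(found) / len(known)

def mean_recall_at_k(queries, k):
    """Average recall@k over (ranked_genes, known_genes) pairs."""
    return sum(recall_at_k(r, g, k) for r, g in queries) / len(queries)

# Hypothetical ranking returned for one query and its known gene set.
ranked = ["BRCA1", "TP53", "EGFR", "MYC", "KRAS"]
known = {"TP53", "KRAS", "PTEN"}
top3 = recall_at_k(ranked, known, 3)   # only TP53 is in the top 3
top5 = recall_at_k(ranked, known, 5)   # TP53 and KRAS are in the top 5
```

    Averaging this value over many queries, as `mean_recall_at_k` does, gives a single summary number of the kind reported in the abstract.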

  2. The Determination of Children's Knowledge of Global Lunar Patterns from Online Essays Using Text Mining Analysis

    ERIC Educational Resources Information Center

    Cheon, Jongpil; Lee, Sangno; Smith, Walter; Song, Jaeki; Kim, Yongjin

    2013-01-01

    The purpose of this study was to use text mining analysis of early adolescents' online essays to determine their knowledge of global lunar patterns. Australian and American students in grades five to seven wrote about global lunar patterns they had discovered by sharing observations with each other via the Internet. These essays were analyzed for…

  3. Mining the human gut microbiome for novel stress resistance genes

    PubMed Central

    Culligan, Eamonn P.; Marchesi, Julian R.; Hill, Colin; Sleator, Roy D.

    2012-01-01

    With the rapid advances in sequencing technologies in recent years, the human genome is now considered incomplete without the complementing microbiome, which outnumbers human genes by a factor of one hundred. The human microbiome, and more specifically the gut microbiome, has received considerable attention and research efforts over the past decade. Many studies have identified and quantified “who is there?,” while others have determined some of their functional capacity, or “what are they doing?” In a recent study, we identified novel salt-tolerance loci from the human gut microbiome using combined functional metagenomic and bioinformatics based approaches. Herein, we discuss the identified loci, their role in salt-tolerance and their importance in the context of the gut environment. We also consider the utility and power of functional metagenomics for mining such environments for novel genes and proteins, as well as the implications and possible applications for future research. PMID:22688726

  4. Carbon Nanomaterials Alter Global Gene Expression Profiles.

    PubMed

    Woodman, Sara; Short, John C W; McDermott, Hyoeun; Linan, Alexander; Bartlett, Katelyn; Gadila, Shiva Kumar Goud; Schmelzle, Katie; Wanekaya, Adam; Kim, Kyoungtae

    2016-05-01

    Carbon nanomaterials (CNMs), which include carbon nanotubes (CNTs) and their derivatives, have diverse technological and biomedical applications. The potential toxicity of CNMs to cells and tissues has become an important emerging question in nanotechnology. To assess the toxicity of CNTs and fullerenol C60(OH)24, we used the budding yeast Saccharomyces cerevisiae, one of the simplest eukaryotic organisms that share fundamental aspects of eukaryotic cell biology. We found that treatment with CNMs, regardless of their physical shape, negatively affected the growth rates, end-point cell densities and doubling times of CNM-exposed yeast cells when compared to unexposed cells. To investigate potential mechanisms behind the CNM-induced growth defects, we performed RNA-Seq-based transcriptional analysis and constructed global gene expression profiles of fullerenol C60(OH)24- and CNT-treated cells. When compared to non-treated control cells, CNM-treated cells displayed differential expression of genes whose functions are implicated in membrane transport and stress response, although the differentially expressed genes were not consistent between the CNT- and fullerenol C60(OH)24-treated groups. We conclude that CNMs could serve as environmental toxic factors to eukaryotic cells. PMID:27483901

  5. Exploring the diversity of arsenic resistance genes from acid mine drainage microorganisms.

    PubMed

    Morgante, Verónica; Mirete, Salvador; de Figueras, Carolina G; Postigo Cacho, Marina; González-Pastor, José E

    2015-06-01

    The microbial communities from the Tinto River, a natural acid mine drainage environment, were explored to search for novel genes involved in arsenic resistance using a functional metagenomic approach. Seven pentavalent arsenate resistance clones were selected and analysed to find the genes responsible for this phenotype. Insights about their possible mechanisms of resistance were obtained from sequence similarities and cellular arsenic concentration. A total of 19 individual open reading frames were analysed, and each one was individually cloned and assayed for its ability to confer arsenic resistance in Escherichia coli cells. A total of 13 functionally active genes involved in arsenic resistance were identified, and they could be classified into different global processes: transport, stress response, DNA damage repair, phospholipids biosynthesis, amino acid biosynthesis and RNA-modifying enzymes. Most genes (11) encode proteins not previously related to heavy metal resistance or hypothetical or unknown proteins. On the other hand, two genes were previously related to heavy metal resistance in microorganisms. In addition, the ClpB chaperone and the RNA-modifying enzymes retrieved in this work were shown to increase the cell survival under different stress conditions (heat shock, acid pH and UV radiation). Thus, these results reveal novel insights about unidentified mechanisms of arsenic resistance. PMID:24801164

  6. Shift in Global Tantalum Mine Production, 2000–2014

    USGS Publications Warehouse

    Bleiwas, Donald I.; Papp, John F.; Yager, Thomas R.

    2015-01-01

    One of the activities of the U.S. Geological Survey National Minerals Information Center (USGS-NMIC) is to analyze global supply chains and characterize major components of mineral and material flows from ore extraction through processing to first tier products. These analyses support the core mission of the USGS-NMIC as the Federal entity responsible for the collection, analysis, and dissemination of objective, unbiased, factual information on minerals essential to the U.S. economy and national security.

  7. A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks

    PubMed Central

    2013-01-01

    Background The large amount of literature in the post-genomics era enables the study of gene interactions and networks using all available articles published for a specific organism. MeSH is a controlled vocabulary of medical and scientific terms that is used by biomedical scientists to manually index articles in the PubMed literature database. We hypothesized that genome-wide gene-MeSH term associations from the PubMed literature database could be used to predict implicit gene-to-gene relationships and networks. While the gene-MeSH associations have been used to detect gene-gene interactions in some studies, different methods have not been well compared, and such a strategy has not been evaluated for a genome-wide literature analysis. Genome-wide literature mining of gene-to-gene interactions allows ranking of the best gene interactions and investigation of comprehensive biological networks at a genome level. Results The genome-wide GenoMesh literature mining algorithm was developed by sequentially generating a gene-article matrix, a normalized gene-MeSH term matrix, and a gene-gene matrix. The gene-gene matrix relies on the calculation of pairwise gene dissimilarities based on gene-MeSH relationships. An optimized dissimilarity score was identified from six well-studied functions based on a receiver operating characteristic (ROC) analysis. Based on the studies with well-studied Escherichia coli and less-studied Brucella spp., GenoMesh was found to accurately identify gene functions using weighted MeSH terms, predict gene-gene interactions not reported in the literature, and cluster all the genes studied from an organism using the MeSH-based gene-gene matrix. A web-based GenoMesh literature mining program is also available at: http://genomesh.hegroup.org. GenoMesh also predicts gene interactions and networks among genes associated with specific MeSH terms or user-selected gene lists. Conclusions The GenoMesh algorithm and web program provide the first genome
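
    The three-stage pipeline described above (gene-article matrix, normalized gene-MeSH matrix, pairwise gene-gene dissimilarity) can be sketched on toy data. This is a hedged illustration only: the abstract does not name the dissimilarity function that was selected, so cosine dissimilarity is swapped in here as an assumption, and all gene names, article identifiers and MeSH assignments are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical inputs: which articles mention each gene, and which MeSH
# terms index each article.
gene_articles = {
    "geneA": {"pmid1", "pmid2"},
    "geneB": {"pmid2", "pmid3"},
    "geneC": {"pmid4"},
}
article_mesh = {
    "pmid1": {"Iron", "Virulence"},
    "pmid2": {"Virulence"},
    "pmid3": {"Virulence", "Biofilms"},
    "pmid4": {"DNA Repair"},
}


def gene_mesh_profile(articles):
    """Count MeSH terms over a gene's articles and scale to unit length."""
    counts = defaultdict(float)
    for pmid in articles:
        for term in article_mesh.get(pmid, ()):
            counts[term] += 1.0
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {t: v / norm for t, v in counts.items()} if norm else {}


def cosine_dissimilarity(p, q):
    """1 - cosine similarity of two unit-length sparse profiles (assumed metric)."""
    return 1.0 - sum(w * q.get(t, 0.0) for t, w in p.items())


profiles = {g: gene_mesh_profile(a) for g, a in gene_articles.items()}
genes = sorted(profiles)
# Gene-gene matrix as a nested dict keyed by gene name.
gene_gene = {
    g: {h: cosine_dissimilarity(profiles[g], profiles[h]) for h in genes}
    for g in genes
}
```

    In this toy example geneA and geneB share the "Virulence" term and end up close to each other, while geneC shares no terms with either and sits at the maximum dissimilarity of 1.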

  8. Global direct pressures on biodiversity by large-scale metal mining: Spatial distribution and implications for conservation.

    PubMed

    Murguía, Diego I; Bringezu, Stefan; Schaldach, Rüdiger

    2016-09-15

    Biodiversity loss is widely recognized as a serious global environmental change process. While large-scale metal mining activities do not belong to the top drivers of such change, these operations exert or may intensify pressures on biodiversity by adversely changing habitats, directly and indirectly, at local and regional scales. So far, analyses of global spatial dynamics of mining and its burden on biodiversity focused on the overlap between mines and protected areas or areas of high value for conservation. However, it is less clear how operating metal mines are globally exerting pressure on zones of different biodiversity richness; a similar gap exists for unmined but known mineral deposits. By using vascular plants' diversity as a proxy to quantify overall biodiversity, this study provides a first examination of the global spatial distribution of mines and deposits for five key metals across different biodiversity zones. The results indicate that mines and deposits are not randomly distributed, but concentrated within intermediate and high diversity zones, especially bauxite and silver. In contrast, iron, gold, and copper mines and deposits are closer to a more proportional distribution while showing a high concentration in the intermediate biodiversity zone. Considering the five metals together, 63% and 61% of available mines and deposits, respectively, are located in intermediate diversity zones, comprising 52% of the global land terrestrial surface. 23% of mines and 20% of ore deposits are located in areas of high plant diversity, covering 17% of the land. 13% of mines and 19% of deposits are in areas of low plant diversity, comprising 31% of the land surface. Thus, there seems to be potential for opening new mines in areas of low biodiversity in the future. PMID:27262340
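
    The claim that mines and deposits are not distributed in proportion to land area can be checked by dividing each zone's share of mines or deposits by its share of land. The short sketch below uses only the percentages quoted in the abstract; the name "representation ratio" is introduced here for illustration and is not a term from the study.

```python
# Shares (percent) quoted in the abstract, by plant-diversity zone.
zones = {
    "intermediate": {"mines": 63, "deposits": 61, "land": 52},
    "high": {"mines": 23, "deposits": 20, "land": 17},
    "low": {"mines": 13, "deposits": 19, "land": 31},
}


def representation_ratio(share, land_share):
    """Values above 1 mean over-representation relative to land area."""
    return share / land_share


ratios = {
    zone: {
        kind: round(representation_ratio(v[kind], v["land"]), 2)
        for kind in ("mines", "deposits")
    }
    for zone, v in zones.items()
}

for zone, r in ratios.items():
    print(zone, r)
```

    On these figures, mines are over-represented in high-diversity zones (ratio about 1.35) and under-represented in low-diversity zones (ratio about 0.42), which is consistent with the abstract's conclusion.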

  9. Global Gene Expression Analysis for the Assessment of Nanobiomaterials.

    PubMed

    Hanagata, Nobutaka

    2015-01-01

    Using global gene expression analysis, the effects of biomaterials and nanomaterials can be analyzed at the genetic level. Even though information obtained from global gene expression analysis can be useful for the evaluation and design of biomaterials and nanomaterials, its use for these purposes is not widespread. This is due to the difficulties involved in data analysis. Because the expression data of about 20,000 genes can be obtained at once with global gene expression analysis, the data must be analyzed using bioinformatics. A method of bioinformatic analysis called gene ontology can estimate the kinds of changes on cell functions caused by genes whose expression level is changed by biomaterials and nanomaterials. Also, by applying a statistical analysis technique called hierarchical clustering to global gene expression data between a variety of biomaterials, the effects of the properties of materials on cell functions can be estimated. In this chapter, these theories of analysis and examples of applications to nanomaterials and biomaterials are described. Furthermore, global microRNA analysis, a method that has gained attention in recent years, and its application to nanomaterials are introduced. PMID:26201278

  10. Global Analysis of Horizontal Gene Transfer in Fusarium verticillioides

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The co-occurrence of microbes within plants and other specialized niches may facilitate horizontal gene transfer (HGT) affecting host-pathogen interactions. We recently identified fungal-to-fungal HGTs involving metabolic gene clusters. For a global analysis of HGTs in the maize pathogen Fusarium ve...

  11. Automatic extraction of reference gene from literature in plants based on texting mining.

    PubMed

    He, Lin; Shen, Gengyu; Li, Fei; Huang, Shuiqing

    2015-01-01

    Real-time quantitative polymerase chain reaction (qRT-PCR) is widely used in biological research. Selecting a stable reference gene is key to a successful qRT-PCR experiment. However, selecting an appropriate reference gene usually requires rigorous and costly experimental verification. The scientific literature has accumulated many results on reference gene selection. Mining reference genes for specific experimental environments from the literature can therefore provide reliable reference genes for similar qRT-PCR experiments economically and efficiently. This paper proposes an auxiliary method for discovering reference genes from the literature that integrates machine learning, natural language processing and text mining approaches. Validity tests showed that the new method achieves better precision and recall in extracting reference genes and their environments. PMID:26510294

  12. Network-based prediction and knowledge mining of disease genes

    PubMed Central

    2015-01-01

    Background In recent years, high-throughput protein interaction identification methods have generated a large amount of data. When combined with the results from other in vivo and in vitro experiments, a complex set of relationships between biological molecules emerges. The growing popularity of network analysis and data mining has allowed researchers to recognize indirect connections between these molecules. Due to the interdependent nature of network entities, evaluating proteins in this context can reveal relationships that may not otherwise be evident. Methods We examined the human protein interaction network as it relates to human illness using the Disease Ontology. After calculating several topological metrics, we trained an alternating decision tree (ADTree) classifier to identify disease-associated proteins. Using a bootstrapping method, we created a tree to highlight conserved characteristics shared by many of these proteins. Subsequently, we reviewed a set of non-disease-associated proteins that were misclassified by the algorithm with high confidence and searched for evidence of a disease relationship. Results Our classifier was able to predict disease-related genes with 79% area under the receiver operating characteristic (ROC) curve (AUC), which indicates the tradeoff between sensitivity and specificity and is a good predictor of how a classifier will perform on future data sets. We found that a combination of several network characteristics including degree centrality, disease neighbor ratio, eccentricity, and neighborhood connectivity help to distinguish between disease- and non-disease-related proteins. Furthermore, the ADTree allowed us to understand which combinations of strongly predictive attributes contributed most to protein-disease classification. In our post-processing evaluation, we found several examples of potential novel disease-related proteins and corresponding literature evidence. In addition, we showed that first- and second
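
    Two of the network features named above, degree centrality and disease neighbor ratio, can be computed on a toy interaction graph as follows. This is a sketch under stated assumptions: the abstract does not define "disease neighbor ratio", so it is taken here to mean the fraction of a protein's direct neighbors that are disease-associated, and the graph and labels are hypothetical.

```python
# Hypothetical undirected protein interaction network as adjacency sets.
graph = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3", "P4"},
    "P3": {"P1", "P2"},
    "P4": {"P2"},
}
# Hypothetical set of disease-associated proteins.
disease_proteins = {"P1", "P4"}


def degree_centrality(g, node):
    """Number of neighbors divided by the maximum possible (n - 1)."""
    return len(g[node]) / (len(g) - 1)


def disease_neighbor_ratio(g, node, disease):
    """Assumed definition: share of direct neighbors that are disease-associated."""
    neighbors = g[node]
    if not neighbors:
        return 0.0
    return sum(1 for n in neighbors if n in disease) / len(neighbors)


# Feature vectors of this kind could then be passed to a classifier.
features = {
    node: (
        degree_centrality(graph, node),
        disease_neighbor_ratio(graph, node, disease_proteins),
    )
    for node in graph
}
```

    The study trained an alternating decision tree on several such metrics; the classifier step is omitted here because its configuration is not given in this record.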

  13. Redox signaling: globalization of gene expression

    PubMed Central

    Oh, Jeong-Il; Kaplan, Samuel

    2000-01-01

    Here we show that the extent of electron flow through the cbb3 oxidase of Rhodobacter sphaeroides is inversely related to the expression levels of those photosynthesis genes that are under control of the PrrBA two-component activation system: the greater the electron flow, the stronger the inhibitory signal generated by the cbb3 oxidase to repress photosynthesis gene expression. Using site-directed mutagenesis, we show that intramolecular electron transfer within the cbb3 oxidase is involved in signal generation and transduction and this signal does not directly involve the intervention of molecular oxygen. In addition to the cbb3 oxidase, the redox state of the quinone pool controls the transcription rate of the puc operon via the AppA–PpsR antirepressor–repressor system. Together, these interacting regulatory circuits are depicted in a model that permits us to understand the regulation by oxygen and light of photosynthesis gene expression in R. sphaeroides. PMID:10944106

  14. Intrinsic limits to gene regulation by global crosstalk

    PubMed Central

    Friedlander, Tamar; Prizak, Roshan; Guet, Călin C.; Barton, Nicholas H.; Tkačik, Gašper

    2016-01-01

    Gene regulation relies on the specificity of transcription factor (TF)–DNA interactions. Limited specificity may lead to crosstalk: a regulatory state in which a gene is either incorrectly activated due to noncognate TF–DNA interactions or remains erroneously inactive. As each TF can have numerous interactions with noncognate cis-regulatory elements, crosstalk is inherently a global problem, yet has previously not been studied as such. We construct a theoretical framework to analyse the effects of global crosstalk on gene regulation. We find that crosstalk presents a significant challenge for organisms with low-specificity TFs, such as metazoans. Crosstalk is not easily mitigated by known regulatory schemes acting at equilibrium, including variants of cooperativity and combinatorial regulation. Our results suggest that crosstalk imposes a previously unexplored global constraint on the functioning and evolution of regulatory networks, which is qualitatively distinct from the known constraints that act at the level of individual gene regulatory elements. PMID:27489144

  15. Intrinsic limits to gene regulation by global crosstalk.

    PubMed

    Friedlander, Tamar; Prizak, Roshan; Guet, Călin C; Barton, Nicholas H; Tkačik, Gašper

    2016-01-01

    Gene regulation relies on the specificity of transcription factor (TF)-DNA interactions. Limited specificity may lead to crosstalk: a regulatory state in which a gene is either incorrectly activated due to noncognate TF-DNA interactions or remains erroneously inactive. As each TF can have numerous interactions with noncognate cis-regulatory elements, crosstalk is inherently a global problem, yet has previously not been studied as such. We construct a theoretical framework to analyse the effects of global crosstalk on gene regulation. We find that crosstalk presents a significant challenge for organisms with low-specificity TFs, such as metazoans. Crosstalk is not easily mitigated by known regulatory schemes acting at equilibrium, including variants of cooperativity and combinatorial regulation. Our results suggest that crosstalk imposes a previously unexplored global constraint on the functioning and evolution of regulatory networks, which is qualitatively distinct from the known constraints that act at the level of individual gene regulatory elements. PMID:27489144

  16. RANWAR: rank-based weighted association rule mining from gene expression and methylation data.

    PubMed

    Mallik, Saurav; Mukhopadhyay, Anirban; Maulik, Ujjwal

    2015-01-01

    Ranking of association rules is currently an interesting topic in data mining and bioinformatics. The huge number of rules over items (or genes) generated by association rule mining (ARM) algorithms can confuse the decision maker. In this article, we propose a weighted rule-mining technique, RANWAR (rank-based weighted association rule mining), that ranks the rules using two novel rule-interestingness measures, viz., rank-based weighted condensed support (wcs) and weighted condensed confidence (wcc), to address this problem. These measures depend on the rank of the items (genes). Using the rank, we assign a weight to each item. RANWAR generates far fewer frequent itemsets than state-of-the-art association rule mining algorithms and thus reduces execution time. We run RANWAR on gene expression and methylation datasets. The genes of the top rules are biologically validated by Gene Ontology (GO) and KEGG pathway analyses. Many top-ranked rules extracted by RANWAR that hold poor ranks in traditional Apriori are highly biologically significant to the related diseases. Finally, the top rules generated by RANWAR that are not found by Apriori are reported. PMID:25265613

  17. The Determination of Children's Knowledge of Global Lunar Patterns from Online Essays Using Text Mining Analysis

    NASA Astrophysics Data System (ADS)

    Cheon, Jongpil; Lee, Sangno; Smith, Walter; Song, Jaeki; Kim, Yongjin

    2013-04-01

    The purpose of this study was to use text mining analysis of early adolescents' online essays to determine their knowledge of global lunar patterns. Australian and American students in grades five to seven wrote about global lunar patterns they had discovered by sharing observations with each other via the Internet. These essays were analyzed for the students' inclusion of words associated with the shape (i.e., phase), orientation and location of the Moon along with words about similarities and differences. Almost all students wrote about shape but fewer wrote about orientation or location. Students infrequently included words about similarities or differences in the same sentence with shape, orientation or location. Similar to studies about children's and adults' lunar misconceptions, it was found that male and female early adolescents also lacked a robust understanding of global lunar patterns.

  18. The Ocean as a Global Reservoir of Antibiotic Resistance Genes

    PubMed Central

    Hatosy, Stephen M.

    2015-01-01

    Recent studies of natural environments have revealed vast genetic reservoirs of antibiotic resistance (AR) genes. Soil bacteria and human pathogens share AR genes, and AR genes have been discovered in a variety of habitats. However, there is little knowledge about the presence and diversity of AR genes in marine environments and which organisms host AR genes. To address this, we identified the diversity of genes conferring resistance to ampicillin, tetracycline, nitrofurantoin, and sulfadimethoxine in diverse marine environments using functional metagenomics (the cloning and screening of random DNA fragments). Marine environments were host to a diversity of AR-conferring genes. Antibiotic-resistant clones were found at all sites, with 28% of the genes identified as known AR genes (encoding beta-lactamases, bicyclomycin resistance pumps, etc.). However, the majority of AR genes were not previously classified as such but had products similar to proteins such as transport pumps, oxidoreductases, and hydrolases. Furthermore, 44% of the genes conferring antibiotic resistance were found in abundant marine taxa (e.g., Pelagibacter, Prochlorococcus, and Vibrio). Therefore, we uncovered a previously unknown diversity of genes that conferred an AR phenotype among marine environments, which makes the ocean a global reservoir of both clinically relevant and potentially novel AR genes. PMID:26296734

  19. Global Gene Expression in Staphylococcus aureus Biofilms

    PubMed Central

    Beenken, Karen E.; Dunman, Paul M.; McAleese, Fionnuala; Macapagal, Daphne; Murphy, Ellen; Projan, Steven J.; Blevins, Jon S.; Smeltzer, Mark S.

    2004-01-01

    We previously demonstrated that mutation of the staphylococcal accessory regulator (sarA) in a clinical isolate of Staphylococcus aureus (UAMS-1) results in an impaired capacity to form a biofilm in vitro (K. E. Beenken, J. S. Blevins, and M. S. Smeltzer, Infect. Immun. 71:4206-4211, 2003). In this report, we used a murine model of catheter-based biofilm formation to demonstrate that a UAMS-1 sarA mutant also has a reduced capacity to form a biofilm in vivo. Surprisingly, mutation of the UAMS-1 ica locus had little impact on biofilm formation in vitro or in vivo. In an effort to identify additional loci that might be relevant to biofilm formation and/or the adaptive response required for persistence of S. aureus within a biofilm, we isolated total cellular RNA from UAMS-1 harvested from a biofilm grown in a flow cell and compared the transcriptional profile of this RNA to RNA isolated from both exponential- and stationary-phase planktonic cultures. Comparisons were done using a custom-made Affymetrix GeneChip representing the genomic complement of six strains of S. aureus (COL, N315, Mu50, NCTC 8325, EMRSA-16 [strain 252], and MSSA-476). The results confirm that the sessile lifestyle associated with persistence within a biofilm is distinct by comparison to the lifestyles of both the exponential and postexponential phases of planktonic culture. Indeed, we identified 48 genes in which expression was induced at least twofold in biofilms over expression under both planktonic conditions. Similarly, we identified 84 genes in which expression was repressed by a factor of at least 2 compared to expression under both planktonic conditions. A primary theme that emerged from the analysis of these genes is that persistence within a biofilm requires an adaptive response that limits the deleterious effects of the reduced pH associated with anaerobic growth conditions. PMID:15231800
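
    The selection rule described above (expression changed at least twofold in biofilm relative to both planktonic conditions) can be written as a small filter. The sketch below assumes strictly positive, already normalized expression values; the gene labels and numbers are hypothetical and are not data from the study.

```python
def classify_by_fold_change(profiles, fold=2.0):
    """Split genes into induced/repressed in biofilm versus BOTH planktonic phases.

    profiles maps gene -> (biofilm, exponential, stationary) expression values,
    assumed to be strictly positive.
    """
    induced, repressed = [], []
    for gene, (biofilm, exponential, stationary) in sorted(profiles.items()):
        planktonic = (exponential, stationary)
        if all(biofilm >= fold * p for p in planktonic):
            induced.append(gene)
        elif all(biofilm * fold <= p for p in planktonic):
            repressed.append(gene)
    return induced, repressed


# Hypothetical expression values: (biofilm, exponential, stationary).
profiles = {
    "geneA": (8.0, 2.0, 3.0),  # at least 2x higher than both planktonic values
    "geneB": (8.0, 2.0, 5.0),  # 2x over exponential only, so not selected
    "geneC": (1.0, 4.0, 2.0),  # at least 2x lower than both planktonic values
    "geneD": (3.0, 3.0, 3.0),  # unchanged
}

induced, repressed = classify_by_fold_change(profiles)
```

    Requiring the threshold against both planktonic phases, rather than either one, is what keeps growth-phase effects from being counted as biofilm-specific changes.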

  20. eGIFT: Mining Gene Information from the Literature

    PubMed Central

    2010-01-01

    Background With the biomedical literature continually expanding, searching PubMed for information about specific genes becomes increasingly difficult. Not only can thousands of results be returned, but gene name ambiguity leads to many irrelevant hits. As a result, it is difficult for life scientists and gene curators to rapidly get an overall picture about a specific gene from documents that mention its names and synonyms. Results In this paper, we present eGIFT (http://biotm.cis.udel.edu/eGIFT), a web-based tool that associates informative terms, called iTerms, and sentences containing them, with genes. To associate iTerms with a gene, eGIFT ranks iTerms about the gene, based on a score which compares the frequency of occurrence of a term in the gene's literature to its frequency of occurrence in documents about genes in general. To retrieve a gene's documents (Medline abstracts), eGIFT considers all gene names, aliases, and synonyms. Since many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to this gene. An additional filtering process is applied to retain those abstracts that focus on the gene rather than mention it in passing. eGIFT's information for a gene is pre-computed and users of eGIFT can search for genes by using a name or an EntrezGene identifier. iTerms are grouped into different categories to facilitate a quick inspection. eGIFT also links an iTerm to sentences mentioning the term to allow users to see the relation between the iTerm and the gene. We evaluated the precision and recall of eGIFT's iTerms for 40 genes; between 88% and 94% of the iTerms were marked as salient by our evaluators, and 94% of the UniProtKB keywords for these genes were also identified by eGIFT as iTerms. Conclusions Our evaluations suggest that iTerms capture highly relevant aspects of genes. Furthermore, by showing sentences containing these terms, eGIFT can provide a quick description of a specific gene
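
    The ranking idea described above, scoring a term by how much more often it occurs in a gene's literature than in gene literature in general, can be sketched as follows. The exact eGIFT score is not given in this record, so the smoothed ratio of document frequencies used here is an assumption, as are the function name and the toy documents.

```python
from collections import Counter


def iterm_scores(gene_docs, background_docs):
    """Score each term by gene-specific vs. background document frequency.

    Assumed score: (share of gene docs containing the term) divided by an
    add-one-smoothed share of background docs containing the term.
    """
    def doc_freq(docs):
        counts = Counter()
        for doc in docs:
            counts.update(set(doc.lower().split()))
        return counts

    gene_df = doc_freq(gene_docs)
    bg_df = doc_freq(background_docs)
    scores = {}
    for term, n in gene_df.items():
        gene_rate = n / len(gene_docs)
        bg_rate = (bg_df[term] + 1) / (len(background_docs) + 1)
        scores[term] = gene_rate / bg_rate
    return scores


# Hypothetical miniature "abstracts".
gene_docs = ["kinase phosphorylates substrate", "kinase binds membrane"]
background_docs = [
    "gene binds dna",
    "protein binds membrane",
    "gene encodes protein",
    "enzyme cleaves substrate",
]

scores = iterm_scores(gene_docs, background_docs)
top_term = max(scores, key=scores.get)
```

    Terms that are common everywhere, such as "binds" in this toy corpus, receive low scores, while terms concentrated in the gene's own documents rise to the top.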

  1. RiceGeneThresher: a web-based application for mining genes underlying QTL in rice genome.

    PubMed

    Thongjuea, Supat; Ruanjaichon, Vinitchan; Bruskiewich, Richard; Vanavichit, Apichart

    2009-01-01

    RiceGeneThresher is a public online resource for mining genes underlying genome regions of interest or quantitative trait loci (QTL) in the rice genome. It is a compendium of rice genomic resources consisting of genetic markers, genome annotation, expressed sequence tags (ESTs), protein domains, gene ontology, plant stress-responsive genes, metabolic pathways and predictions of protein-protein interactions. The RiceGeneThresher system integrates these diverse data sources and provides powerful web-based applications and flexible tools for delivering customized sets of biological data on rice. The system supports whole-genome gene mining for QTL through queries by DNA marker intervals or genomic loci. RiceGeneThresher provides biologically supported evidence that is essential for targeting groups or networks of genes involved in controlling traits underlying QTL. Users can use it to discover and assign the most promising candidate genes in preparation for further gene function validation analysis. The web-based application is freely available at http://rice.kps.ku.ac.th. PMID:18820292

  2. Resistance Gene Mining in Wild and Cultivated Potato Germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A key long-term management strategy for combating potato diseases is to develop cultivars with high levels of resistance through identification and integration of major resistance (R) genes. This talk will summarize our results of cloning major R genes from potato germplasm using a candidate gene a...

  3. Biomedical Information Extraction: Mining Disease Associated Genes from Literature

    ERIC Educational Resources Information Center

    Huang, Zhong

    2014-01-01

    Disease-associated gene discovery is a critical step toward realizing the future of personalized medicine. However, empirical and clinical validation of disease-associated genes is time-consuming and expensive. In silico discovery of disease-associated genes from the literature is therefore becoming the first essential step for biomarker discovery to…

  4. Global demand for rare earth resources and strategies for green mining.

    PubMed

    Dutta, Tanushree; Kim, Ki-Hyun; Uchimiya, Minori; Kwon, Eilhann E; Jeon, Byong-Hun; Deep, Akash; Yun, Seong-Taek

    2016-10-01

    Rare earth elements (REEs) are essential raw materials for emerging renewable energy resources and 'smart' electronic devices. Global REE demand is slated to grow at an annual rate of 5% by 2020. This high growth rate will require a steady supply base of REEs in the long run. At present, China is responsible for 85% of global rare earth oxide (REO) production. To overcome this monopolistic supply situation, new strategies and investments are necessary to satisfy domestic supply demands. Concurrently, environmental, economic, and social problems arising from REE mining must be addressed. There is an urgent need to develop efficient REE recycling techniques from end-of-life products, technologies to minimize the amount of REEs required per unit device, and methods to recover them from fly ash or fossil fuel-burning wastes. PMID:27295408

  5. Gold Mining in the Peruvian Amazon: Global Prices, Deforestation, and Mercury Imports

    PubMed Central

    Swenson, Jennifer J.; Carter, Catherine E.; Domec, Jean-Christophe; Delgado, Cesar I.

    2011-01-01

    Many factors such as poverty, ineffective institutions and environmental regulations may prevent developing countries from managing how natural resources are extracted to meet a strong market demand. Extraction for some resources has reached such proportions that evidence is measurable from space. We present recent evidence of the global demand for a single commodity and the ecosystem destruction resulting from commodity extraction, recorded by satellites for one of the most biodiverse areas of the world. We find that since 2003, recent mining deforestation in Madre de Dios, Peru is increasing nonlinearly alongside a constant annual rate of increase in international gold price (∼18%/yr). We detect that the new pattern of mining deforestation (1915 ha/year, 2006–2009) is outpacing that of nearby settlement deforestation. We show that gold price is linked with exponential increases in Peruvian national mercury imports over time (R2 = 0.93, p = 0.04, 2003–2009). Given the past rates of increase we predict that mercury imports may more than double for 2011 (∼500 t/year). Virtually all of Peru's mercury imports are used in artisanal gold mining. Much of the mining increase is unregulated/artisanal in nature, lacking environmental impact analysis or miner education. As a result, large quantities of mercury are being released into the atmosphere, sediments and waterways. Other developing countries endowed with gold deposits are likely experiencing similar environmental destruction in response to recent record high gold prices. The increasing availability of satellite imagery ought to evoke further studies linking economic variables with land use and cover changes on the ground. PMID:21526143

  6. Global and gene specific DNA methylation changes during zebrafish development

    Technology Transfer Automated Retrieval System (TEKTRAN)

    DNA methylation is dynamic through the life of an organism. In this study, we measured the global and gene specific DNA methylation changes in zebrafish at different developmental stages. We found that the methylation percentage of cytosines was 11.75 ± 0.96% in 3.3 hour post fertilization (hpf) zeb...

  7. miRTex: A Text Mining System for miRNA-Gene Relation Extraction

    PubMed Central

    Li, Gang; Ross, Karen E.; Arighi, Cecilia N.; Peng, Yifan; Wu, Cathy H.; Vijay-Shanker, K.

    2015-01-01

    MicroRNAs (miRNAs) regulate a wide range of cellular and developmental processes through gene expression suppression or mRNA degradation. Experimentally validated miRNA gene targets are often reported in the literature. In this paper, we describe miRTex, a text mining system that extracts miRNA-target relations, as well as miRNA-gene and gene-miRNA regulation relations. The system achieves good precision and recall when evaluated on a literature corpus of 150 abstracts with F-scores close to 0.90 on the three different types of relations. We conducted full-scale text mining using miRTex to process all the Medline abstracts and all the full-length articles in the PubMed Central Open Access Subset. The results for all the Medline abstracts are stored in a database for interactive query and file download via the website at http://proteininformationresource.org/mirtex. Using miRTex, we identified genes potentially regulated by miRNAs in Triple Negative Breast Cancer, as well as miRNA-gene relations that, in conjunction with kinase-substrate relations, regulate the response to abiotic stress in Arabidopsis thaliana. These two use cases demonstrate the usefulness of miRTex text mining in the analysis of miRNA-regulated biological processes. PMID:26407127
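
    The F-scores reported above combine precision and recall into a single number. A minimal sketch of how such a score is computed for extracted relation pairs is given below; the relation pairs are placeholders, not miRTex output, and the function names are chosen here for illustration.

```python
def precision_recall(extracted, gold):
    """Precision and recall of extracted relations against a gold standard."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Placeholder (miRNA, target) pairs for illustration only.
extracted = {("miR-A", "GENE1"), ("miR-B", "GENE2"), ("miR-A", "GENE3")}
gold = {("miR-A", "GENE1"), ("miR-B", "GENE2"), ("miR-C", "GENE4")}

p, r = precision_recall(extracted, gold)
f1 = f_score(p, r)
```

    Because the F1 score is a harmonic mean, a value close to 0.90 implies that precision and recall are both high; a system cannot reach it by maximizing one at the expense of the other.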

  8. A high-resolution network model for global gene regulation in Mycobacterium tuberculosis

    PubMed Central

    Peterson, Eliza J.R.; Reiss, David J.; Turkarslan, Serdar; Minch, Kyle J.; Rustad, Tige; Plaisier, Christopher L.; Longabaugh, William J.R.; Sherman, David R.; Baliga, Nitin S.

    2014-01-01

    The resilience of Mycobacterium tuberculosis (MTB) is largely due to its ability to effectively counteract and even take advantage of the hostile environments of a host. In order to accelerate the discovery and characterization of these adaptive mechanisms, we have mined a compendium of 2325 publicly available transcriptome profiles of MTB to decipher a predictive, systems-scale gene regulatory network model. The resulting modular organization of 98% of all MTB genes within this regulatory network was rigorously tested using two independently generated datasets: a genome-wide map of 7248 DNA-binding locations for 143 transcription factors (TFs) and global transcriptional consequences of overexpressing 206 TFs. This analysis has discovered specific TFs that mediate conditional co-regulation of genes within 240 modules across 14 distinct environmental contexts. In addition to recapitulating previously characterized regulons, we discovered 454 novel mechanisms for gene regulation during stress, cholesterol utilization and dormancy. Significantly, 183 of these mechanisms act uniquely under conditions experienced during the infection cycle to regulate diverse functions including 23 genes that are essential to host-pathogen interactions. These and other insights underscore the power of a rational, model-driven approach to unearth novel MTB biology that operates under some but not all phases of infection. PMID:25232098

  9. A high-resolution network model for global gene regulation in Mycobacterium tuberculosis.

    PubMed

    Peterson, Eliza J R; Reiss, David J; Turkarslan, Serdar; Minch, Kyle J; Rustad, Tige; Plaisier, Christopher L; Longabaugh, William J R; Sherman, David R; Baliga, Nitin S

    2014-10-01

    The resilience of Mycobacterium tuberculosis (MTB) is largely due to its ability to effectively counteract and even take advantage of the hostile environments of a host. In order to accelerate the discovery and characterization of these adaptive mechanisms, we have mined a compendium of 2325 publicly available transcriptome profiles of MTB to decipher a predictive, systems-scale gene regulatory network model. The resulting modular organization of 98% of all MTB genes within this regulatory network was rigorously tested using two independently generated datasets: a genome-wide map of 7248 DNA-binding locations for 143 transcription factors (TFs) and global transcriptional consequences of overexpressing 206 TFs. This analysis has discovered specific TFs that mediate conditional co-regulation of genes within 240 modules across 14 distinct environmental contexts. In addition to recapitulating previously characterized regulons, we discovered 454 novel mechanisms for gene regulation during stress, cholesterol utilization and dormancy. Significantly, 183 of these mechanisms act uniquely under conditions experienced during the infection cycle to regulate diverse functions including 23 genes that are essential to host-pathogen interactions. These and other insights underscore the power of a rational, model-driven approach to unearth novel MTB biology that operates under some but not all phases of infection. PMID:25232098

  10. Data mining of VDJ genes reveals interesting clues.

    PubMed

    Joshi, Rajani R; Gupta, Vinay K

    2006-01-01

    Hypervariability of the complementarity-determining regions in the characteristic structure of Immunoglobulins and the distinct, cell-specific expressions of the genes coding for this important class of proteins pose intriguing problems in experimental and computational/informatics research requiring a special approach different from those for the other proteins. We present here an Average Linkage Hierarchical Clustering of the Homo sapiens VDJ genes and the Immunoglobulin polypeptides generated by them using a special kind of data structures and correlation matrices in place of the microarray data. The results reveal interesting clues on the heterogeneity of exon-intron locations in these gene families and its possible role in hypervariability of the Immunoglobulins. PMID:16842114

  11. Global regulation of Staphylococcus aureus genes by Rot.

    PubMed

    Saïd-Salim, B; Dunman, P M; McAleese, F M; Macapagal, D; Murphy, E; McNamara, P J; Arvidson, S; Foster, T J; Projan, S J; Kreiswirth, B N

    2003-01-01

    Staphylococcus aureus produces a wide array of cell surface and extracellular proteins involved in virulence. Expression of these virulence factors is tightly controlled by numerous regulatory loci, including agr, sar, sigB, sae, and arl, as well as by a number of proteins with homology to SarA. Rot (repressor of toxins), a SarA homologue, was previously identified in a library of transposon-induced mutants created in an agr-negative strain by screening for restored protease and alpha-toxin. To date, all of the SarA homologues have been shown to act as global regulators of virulence genes. Therefore, we investigated the extent of transcriptional regulation of staphylococcal genes by Rot. We compared the transcriptional profile of a rot agr double mutant to that of its agr parental strain by using custom-made Affymetrix GeneChips. Our findings indicate that Rot is not only a repressor but a global regulator with both positive and negative effects on the expression of S. aureus genes. Our data also indicate that Rot and agr have opposing effects on select target genes. These results provide further insight into the role of Rot in the regulatory cascade of S. aureus virulence gene expression. PMID:12511508

  12. DISEASES: text mining and data integration of disease-gene associations.

    PubMed

    Pletscher-Frankild, Sune; Pallejà, Albert; Tsafou, Kalliopi; Binder, Janos X; Jensen, Lars Juhl

    2015-03-01

    Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease-gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource, which integrates the results from text mining with manually curated disease-gene associations, cancer mutation data, and genome-wide association studies from existing databases. The DISEASES resource is accessible through a web interface at http://diseases.jensenlab.org/, where the text-mining software and all associations are also freely available for download. PMID:25484339
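
    The idea of dictionary-based tagging combined with co-occurrence scoring within and between sentences can be sketched as follows. This is not the DISEASES scoring scheme: the two dictionaries, the weights W_SENTENCE and W_ABSTRACT, the naive substring matching and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical dictionaries; a real tagger would use curated name lists.
GENES = {"BRCA1", "TP53"}
DISEASES = {"breast cancer", "glioma"}

# Assumed weights: a pair mentioned in one sentence counts more than a
# pair that only shares the same abstract.
W_SENTENCE = 1.0
W_ABSTRACT = 0.2


def tag(text, dictionary):
    """Return the dictionary terms found in text (case-insensitive substring match)."""
    lowered = text.lower()
    return {term for term in dictionary if term.lower() in lowered}


def score_abstracts(abstracts):
    """Accumulate a weighted co-occurrence score for each (gene, disease) pair."""
    scores = defaultdict(float)
    for abstract in abstracts:
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", abstract) if s]
        same_sentence = set()
        for sentence in sentences:
            for gene in tag(sentence, GENES):
                for disease in tag(sentence, DISEASES):
                    same_sentence.add((gene, disease))
        for gene in tag(abstract, GENES):
            for disease in tag(abstract, DISEASES):
                pair = (gene, disease)
                scores[pair] += W_SENTENCE if pair in same_sentence else W_ABSTRACT
    return dict(scores)


abstracts = [
    "BRCA1 mutations are common in breast cancer. TP53 was also sequenced.",
    "TP53 is altered in glioma.",
]
scores = score_abstracts(abstracts)
for pair, score in sorted(scores.items()):
    print(pair, score)
```

    In this toy corpus BRCA1 and breast cancer share a sentence and receive the full weight, while TP53 and breast cancer only share an abstract and receive the smaller one.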

  13. YAGM: a web tool for mining associated genes in yeast based on diverse biological associations

    PubMed Central

    2015-01-01

    Background: Investigating associations between genes can help in understanding the relations of genes in biological processes. STRING and GeneMANIA are two well-known web tools which can provide a list of associated genes of a query gene based on diverse biological associations such as co-expression, co-localization, co-citation and so on. However, the transcriptional regulation association and mutant phenotype association have not been used in these two web tools. Since the comprehensive transcription factor (TF)-gene binding data, TF-gene regulation data and mutant phenotype data are available in yeast, we developed a web tool called YAGM (Yeast Associated Genes Miner) which constructed the transcriptional regulation association, mutant phenotype association and five commonly used biological associations to mine a list of associated genes of a query yeast gene. Description: In YAGM, we collected seven kinds of datasets including TF-gene binding (TFB) data, TF-gene regulation (TFR) data, mutant phenotype (MP) data, functional annotation (FA) data, physical interaction (PI) data, genetic interaction (GI) data, and literature evidence (LE) data. Then by using the hypergeometric test to calculate the association scores of all gene pairs in yeast, we constructed seven biological associations including two transcriptional regulation associations (TFB association and TFR association), MP association, FA association, PI association, GI association, and LE association. Moreover, the expression profile association from the SPELL database was also included in YAGM. When using YAGM, users can input a query gene and choose any possible subset of the eight biological associations; a list of associated genes of the query gene will then be returned based on the chosen biological associations. Conclusions: In this study, we presented YAGM, which provides eight biological associations for mining associated genes of a query gene in yeast. Among the eight biological associations
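
    A hypergeometric test for one gene pair can be sketched as below. The framing is an assumption rather than YAGM's documented procedure: each gene is represented by the set of TFs that bind it, and the p-value is the probability of seeing at least the observed overlap by chance. The function name, the toy TF sets and the use of -log10(p) as an association score are all hypothetical.

```python
from math import comb, log10


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn from N, of which K are marked."""
    upper = min(K, n)
    total = comb(N, n)
    # comb() returns 0 for impossible terms, so no extra bounds checks are needed.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


# Toy data: 10 TFs in total; the sets below are invented for illustration.
all_tfs = {f"TF{i}" for i in range(1, 11)}
tfs_binding_gene_a = {"TF1", "TF2", "TF3", "TF4"}
tfs_binding_gene_b = {"TF3", "TF4", "TF5"}

overlap = len(tfs_binding_gene_a & tfs_binding_gene_b)
p_value = hypergeom_sf(overlap, len(all_tfs), len(tfs_binding_gene_a), len(tfs_binding_gene_b))
association_score = -log10(p_value)  # assumed scoring convention
print(f"overlap={overlap} p={p_value:.4f} score={association_score:.3f}")
```

    Here the overlap of 2 TFs gives p = (6·6 + 4·1)/120 = 1/3, which would not indicate a notable association; a genome-scale analysis would also need a multiple-testing correction across all gene pairs.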

  14. Integrated pathway-based transcription regulation network mining and visualization based on gene expression profiles.

    PubMed

    Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko

    2016-06-01

    Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a singular framework to enhance interpretation and hypotheses generation. We propose a workflow that merges network construction with gene expression data mining focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application software (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in analysis of actual expression datasets related to lung, breast and prostate cancer. PMID:27064123

  15. Soil metatranscriptomics for mining eukaryotic heavy metal resistance genes.

    PubMed

    Lehembre, Frédéric; Doillon, Didier; David, Elise; Perrotto, Sandrine; Baude, Jessica; Foulon, Julie; Harfouche, Lamia; Vallon, Laurent; Poulain, Julie; Da Silva, Corinne; Wincker, Patrick; Oger-Desfeux, Christine; Richaud, Pierre; Colpaert, Jan V; Chalot, Michel; Fraissinet-Tachet, Laurence; Blaudez, Damien; Marmeisse, Roland

    2013-10-01

    Heavy metals are pollutants which affect all organisms. Since a small number of eukaryotes have been investigated with respect to metal resistance, we hypothesize that many genes that control this phenomenon remain to be identified. This was tested by screening soil eukaryotic metatranscriptomes which encompass RNA from organisms belonging to the main eukaryotic phyla. Soil-extracted polyadenylated mRNAs were converted into cDNAs and 35 of them were selected for their ability to rescue the metal (Cd or Zn) sensitive phenotype of yeast mutants. Few of the genes belonged to families known to confer metal resistance when overexpressed in yeast. Several of them were homologous to genes that had not been studied in the context of metal resistance. For instance, the BOLA ones, which conferred cross metal (Zn, Co, Cd, Mn) resistance may act by interfering with Fe homeostasis. Other genes, such as those encoding 110- to 130-amino-acid-long, cysteine-rich polypeptides, had no homologues in databases. This study confirms that functional metatranscriptomics represents a powerful approach to address basic biological processes in eukaryotes. The selected genes can be used to probe new pathways involved in metal homeostasis and to manipulate the resistance level of selected organisms. PMID:23663419

  16. Study of Lateral Gene Transfer in an Acid Mine Drainage Community Enabled by Comparative Genomics

    NASA Astrophysics Data System (ADS)

    Hugenholtz, P.; Croft, L.; Tyson, G. W.; Baker, B. J.; Detter, C.; Richardson, P. M.; Banfield, J. F.

    2002-12-01

    Lateral gene transfer (LGT) is thought to play a crucial role in the ecology and evolution of prokaryotes. We are investigating the role of LGT in an acid mine drainage community hosted in a pyrite-dominated metal sulfide deposit at the Richmond mine at Iron Mountain, CA. Due to biologically-mediated pyrite dissolution, the prevailing conditions within the mine are extremely low pH (< 1.0), very high ionic concentrations (molar concentrations of iron sulfate and mM concentrations of arsenic, copper and zinc), and moderate to high temperatures (30 to >50 °C). These conditions are thought to largely isolate the community from potential external gene donors since naked DNA, phage and prokaryotes native to neutral pH habitats do not persist at pH <1.0 precluding an external influx of genes by transformation, transduction and conjugation, respectively. Microbial communities exist in several distinct habitats within Richmond mine including biofilms (subaqueous slime streamers and subaerial slimes) and cells attached directly to pyrite granules. This, however, belies an unusual simplicity in community composition. All communities investigated to date comprise only a handful of phylogenetically distinct organisms, typically dominated by the iron-oxidizing genera Leptospirillum and Ferroplasma. We have undertaken a community genomics analysis of a subaerial biofilm dominated by a Leptospirillum population to facilitate the study of LGT in this type of environment. The genome of Ferroplasma acidarmanus fer1, a minor component of the target community (but a major component of other Richmond mine communities), has been sequenced. Comparative genome analyses indicate that F. acidarmanus and the ancestor of two acidophilic Thermoplasma species belonging to the Euryarchaeota have traded many genes with phylogenetically remote acidophilic Sulfolobus species (Crenarchaeota). The putatively transferred sets of Sulfolobus genes in Ferroplasma and the Thermoplasma ancestor are distinct

  17. The Gene Expression Barcode 3.0: improved data processing and mining tools

    PubMed Central

    McCall, Matthew N.; Jaffee, Harris A.; Zelisko, Susan J.; Sinha, Neeraj; Hooiveld, Guido; Irizarry, Rafael A.; Zilliox, Michael J.

    2014-01-01

    The Gene Expression Barcode project, http://barcode.luhs.org, seeks to determine the genes expressed for every tissue and cell type in humans and mice. Understanding the absolute expression of genes across tissues and cell types has applications in basic cell biology, hypothesis generation for gene function and clinical predictions using gene expression signatures. In its current version, this project uses the abundant publicly available microarray data sets combined with a suite of single-array preprocessing, quality control and analysis methods. In this article, we present the improvements that have been made since the previous version of the Gene Expression Barcode in 2011. These include a variety of new data mining tools and summaries, estimated transcriptomes and curated annotations. PMID:24271388

  18. Implications for global climate change from microbially-produced acid mine drainage

    NASA Astrophysics Data System (ADS)

    Norlund, K. L.; Hitchcock, A. P.; Warren, L. A.

    2009-05-01

    Microbial catalysis of sulphur cycling in acid mine drainage (AMD) environments is well known but the reaction pathways are poorly characterised. These reaction pathways involve both acid-consuming and acid-generating steps, with important consequences for overall AMD production as well as sulphur and carbon global biogeochemical cycles. Mining-associated sulphuric acid has been implicated in climate change through the weathering of carbonate minerals resulting in the release of 29 Tg C/year as carbon dioxide. Understanding of microbial AMD generation is based predominantly on studies of Acidithiobacillus ferrooxidans despite the knowledge that other environmentally common strains of bacteria are also active sulphur oxidizers and that microbial consortia are likely very important in environmental processes. Using an integrated experimental approach including geochemical experimentation, scanning transmission X-ray microscopy (STXM) and fluorescent in situ hybridization (FISH), we document a novel syntrophic sulphur metabolism involving two common mine bacteria: autotrophic sulphur oxidizing Acidithiobacillus ferrooxidans and heterotrophic Acidiphilium spp. The proposed sulphur geochemistry associated with this bacterial consortium produces 40-90% less acid than expected based on abiotic AMD models, with significant implications for both AMD mitigation and AMD carbon flux modelling. The two bacterial strains are specifically spatially segregated within a macrostructure of extracellular polymeric substance (EPS) that provides the necessary microgeochemical conditions for coupled sulphur oxidation and reduction reactions. STXM results identify multiple sulphur oxidation states associated with the pods, indicating that they are the sites of active sulphur disproportionation and recycling. Recent laboratory experimentation using type culture strains of the bacteria involved in pod-formation suggests that this phenomenon is likely to be widespread in environments

  19. DGIdb 2.0: mining clinically relevant drug-gene interactions.

    PubMed

    Wagner, Alex H; Coffman, Adam C; Ainscough, Benjamin J; Spies, Nicholas C; Skidmore, Zachary L; Campbell, Katie M; Krysiak, Kilannin; Pan, Deng; McMichael, Joshua F; Eldred, James M; Walker, Jason R; Wilson, Richard K; Mardis, Elaine R; Griffith, Malachi; Griffith, Obi L

    2016-01-01

    The Drug-Gene Interaction Database (DGIdb, www.dgidb.org) is a web resource that consolidates disparate data sources describing drug-gene interactions and gene druggability. It provides an intuitive graphical user interface and a documented application programming interface (API) for querying these data. DGIdb was assembled through an extensive manual curation effort, reflecting the combined information of twenty-seven sources. For DGIdb 2.0, substantial updates have been made to increase content and improve its usefulness as a resource for mining clinically actionable drug targets. Specifically, nine new sources of drug-gene interactions have been added, including seven resources specifically focused on interactions linked to clinical trials. These additions have more than doubled the overall count of drug-gene interactions. The total number of druggable gene claims has also increased by 30%. Importantly, a majority of the unrestricted, publicly-accessible sources used in DGIdb are now automatically updated on a weekly basis, providing the most current information for these sources. Finally, a new web view and API have been developed to allow searching for interactions by drug identifiers to complement existing gene-based search functionality. With these updates, DGIdb represents a comprehensive and user friendly tool for mining the druggable genome for precision medicine hypothesis generation. PMID:26531824

  20. DGIdb 2.0: mining clinically relevant drug–gene interactions

    PubMed Central

    Wagner, Alex H.; Coffman, Adam C.; Ainscough, Benjamin J.; Spies, Nicholas C.; Skidmore, Zachary L.; Campbell, Katie M.; Krysiak, Kilannin; Pan, Deng; McMichael, Joshua F.; Eldred, James M.; Walker, Jason R.; Wilson, Richard K.; Mardis, Elaine R.; Griffith, Malachi; Griffith, Obi L.

    2016-01-01

    The Drug–Gene Interaction Database (DGIdb, www.dgidb.org) is a web resource that consolidates disparate data sources describing drug–gene interactions and gene druggability. It provides an intuitive graphical user interface and a documented application programming interface (API) for querying these data. DGIdb was assembled through an extensive manual curation effort, reflecting the combined information of twenty-seven sources. For DGIdb 2.0, substantial updates have been made to increase content and improve its usefulness as a resource for mining clinically actionable drug targets. Specifically, nine new sources of drug–gene interactions have been added, including seven resources specifically focused on interactions linked to clinical trials. These additions have more than doubled the overall count of drug–gene interactions. The total number of druggable gene claims has also increased by 30%. Importantly, a majority of the unrestricted, publicly-accessible sources used in DGIdb are now automatically updated on a weekly basis, providing the most current information for these sources. Finally, a new web view and API have been developed to allow searching for interactions by drug identifiers to complement existing gene-based search functionality. With these updates, DGIdb represents a comprehensive and user friendly tool for mining the druggable genome for precision medicine hypothesis generation. PMID:26531824

  1. DDMGD: the database of text-mined associations between genes methylated in diseases from different species.

    PubMed

    Bin Raies, Arwa; Mansour, Hicham; Incitti, Roberto; Bajic, Vladimir B

    2015-01-01

    Gathering information about associations between methylated genes and diseases is important for disease diagnosis and treatment decisions. Recent advancements in epigenetics research allow for large-scale discoveries of associations of genes methylated in diseases in different species. Searching manually for such information is not easy, as it is scattered across a large number of electronic publications and repositories. Therefore, we developed the DDMGD database (http://www.cbrc.kaust.edu.sa/ddmgd/) to provide a comprehensive repository of information related to genes methylated in diseases that can be found through text mining. DDMGD's scope is not limited to a particular group of genes, diseases or species. Using the text mining system DEMGD we developed earlier and additional post-processing, we extracted associations of genes methylated in different diseases from PubMed Central articles and PubMed abstracts. The accuracy of extracted associations is 82% as estimated on 2500 hand-curated entries. DDMGD provides a user-friendly interface facilitating retrieval of these associations ranked according to confidence scores. Submission of new associations to DDMGD is provided. A comparison analysis of DDMGD with several other databases focused on genes methylated in diseases shows that DDMGD is comprehensive and includes most of the recent information on genes methylated in diseases. PMID:25398897

  2. Literature Mining and Ontology based Analysis of Host-Brucella Gene–Gene Interaction Network

    PubMed Central

    Karadeniz, İlknur; Hur, Junguk; He, Yongqun; Özgür, Arzucan

    2015-01-01

    Brucella is an intracellular bacterium that causes chronic brucellosis in humans and various mammals. The identification of host-Brucella interactions is crucial to understand host immunity against Brucella infection and Brucella pathogenesis against host immune responses. Most of the information about the inter-species interactions between host and Brucella genes is only available in the text of scientific publications. Many text-mining systems for extracting gene and protein interactions have been proposed. However, only a few of them have been designed by considering the peculiarities of host–pathogen interactions. In this paper, we used a text mining approach for extracting host-Brucella gene–gene interactions from the abstracts of articles in PubMed. The gene–gene interactions here represent the interactions between genes and/or gene products (e.g., proteins). The SciMiner tool, originally designed for detecting mammalian gene/protein names in text, was extended to identify host and Brucella gene/protein names in the abstracts. Next, sentence-level and abstract-level co-occurrence based approaches, as well as sentence-level machine learning based methods, originally designed for extracting intra-species gene interactions, were utilized to extract the interactions among the identified host and Brucella genes. The extracted interactions were manually evaluated. A total of 46 host-Brucella gene interactions were identified and represented as an interaction network. Twenty-four of these interactions were identified from sentence-level processing. Twenty-two additional interactions were identified when abstract-level processing was performed. The Interaction Network Ontology (INO) was used to represent the identified interaction types in a hierarchical ontology structure. Ontological modeling of specific gene–gene interactions demonstrates that host–pathogen gene–gene interactions occur at experimental conditions which can be ontologically

  3. Data mining as a discovery tool for imprinted genes.

    PubMed

    Brideau, Chelsea; Soloway, Paul

    2012-01-01

    This chapter serves as an introduction to the collection of genome-wide sequence and epigenomic data, as well as the use of these data in training generalized linear models (glm) to predict imprinted status. This is meant to be an introduction to the method, so only the most straightforward examples will be covered. For instance, the examples given below refer to 11 classes of genomic regions (the entire gene body, introns, exons, 5' UTR, 3' UTR, and 1, 10, and 100 kb upstream and downstream of each gene). One could also build models based on combinations of these regions. Likewise, models could be built on combinations of epigenetic features, or on combinations of both genomic regions and epigenetic features. This chapter relies heavily on computational methods, including basic programming. However, this chapter is not meant to be an introduction to programming. Throughout the chapter, the reader will be provided with example code in the Perl programming language. PMID:22907493

  4. Global Patterns of Diversity and Selection in Human Tyrosinase Gene

    PubMed Central

    Hudjashov, Georgi; Villems, Richard; Kivisild, Toomas

    2013-01-01

    Global variation in skin pigmentation is one of the most striking examples of environmental adaptation in humans. More than two hundred loci have been identified as candidate genes in model organisms and a few tens of these have been found to be significantly associated with human skin pigmentation in genome-wide association studies. However, the evolutionary history of different pigmentation genes is rather complex: some loci have been subjected to strong positive selection, while others evolved under the relaxation of functional constraints in low-UV environments. Here we report the results of a global study of the human tyrosinase gene, which encodes one of the key enzymes in melanin production, to assess the role of its variation in the evolution of skin pigmentation differences among human populations. We observe a higher rate of non-synonymous polymorphisms in the European sample consistent with the relaxation of selective constraints. A similar pattern was previously observed in the MC1R gene and concurs with the UV radiation-driven model of skin color evolution, by which mutations leading to lower melanin levels and decreased photoprotection are subject to purifying selection at low latitudes while being tolerated or even favored at higher latitudes because they facilitate UV-dependent vitamin D production. Our coalescent date estimates suggest that the non-synonymous variants, which are frequent in Europe and North Africa, are recent and have emerged after the separation of East and West Eurasian populations. PMID:24040225

  5. Bacteria and Genes Involved in Arsenic Speciation in Sediment Impacted by Long-Term Gold Mining

    PubMed Central

    Costa, Patrícia S.; Scholte, Larissa L. S.; Reis, Mariana P.; Chaves, Anderson V.; Oliveira, Pollyanna L.; Itabayana, Luiza B.; Suhadolnik, Maria Luiza S.; Barbosa, Francisco A. R.; Chartone-Souza, Edmar; Nascimento, Andréa M. A.

    2014-01-01

    The bacterial community and genes involved in geobiocycling of arsenic (As) from sediment impacted by long-term gold mining were characterized through culture-based analysis of As-transforming bacteria and metagenomic studies of the arsC, arrA, and aioA genes. Sediment was collected from the historically gold mining impacted Mina stream, located in one of the world’s largest mining regions known as the “Iron Quadrangle”. A total of 123 As-resistant bacteria were recovered from the enrichment cultures, which were phenotypically and genotypically characterized for As-transformation. A diverse As-resistant bacterial community was found through phylogenetic analyses of the 16S rRNA gene. Bacterial isolates were affiliated with Proteobacteria, Firmicutes, and Actinobacteria and were represented by 20 genera. Most were AsV-reducing (72%), whereas AsIII-oxidizing accounted for 20%. Bacteria harboring the arsC gene predominated (85%), followed by aioA (20%) and arrA (7%). Additionally, we identified two novel As-transforming genera, Thermomonas and Pannonibacter. Metagenomic analysis of arsC, aioA, and arrA sequences confirmed the presence of these genes, with arrA sequences being more closely related to uncultured organisms. Evolutionary analyses revealed high genetic similarity between some arsC and aioA sequences obtained from isolates and clone libraries, suggesting that those isolates may represent environmentally important bacteria acting in As speciation. In addition, our findings show that the diversity of arrA genes is wider than earlier described, since no arrA OTUs were affiliated with known reference strains. Therefore, the molecular diversity of arrA genes is far from being fully explored and deserves further attention. PMID:24755825

  6. Local and global responses in complex gene regulation networks

    NASA Astrophysics Data System (ADS)

    Tsuchiya, Masa; Selvarajoo, Kumar; Piras, Vincent; Tomita, Masaru; Giuliani, Alessandro

    2009-04-01

    An exacerbated sensitivity to apparently minor stimuli and a general resilience of the entire system coexist side by side in biological systems. This apparent paradox can be explained by the consideration of biological systems as very strongly interconnected network systems. Some nodes of these networks, thanks to their peculiar location in the network architecture, are responsible for the sensitivity aspects, while the large degree of interconnection is at the basis of the resilience properties of the system. One relevant feature of the high degree of connectivity of gene regulation networks is the emergence of collective ordered phenomena influencing the entire genome and not only a specific portion of transcripts. The great majority of existing gene regulation models give the impression of purely local ‘hard-wired’ mechanisms, disregarding the emergence of global ordered behavior encompassing thousands of genes, while the general, genome-wide aspects are less known. Here we address, from a data analysis perspective, the discrimination between local and global scale regulation. This goal was achieved by examining two biological systems: innate immune response in macrophages and oscillating growth dynamics in yeast. Our aim was to reconcile the ‘hard-wired’ local view of gene regulation with a global continuous and scalable one borrowed from statistical physics. This reconciliation is based on the network paradigm in which the local ‘hard-wired’ activities correspond to the activation of specific crucial nodes in the regulation network, while the scalable continuous responses can be equated to the collective oscillations of the network after a perturbation.

  7. Mining for Candidate Genes Related to Pancreatic Cancer Using Protein-Protein Interactions and a Shortest Path Approach

    PubMed Central

    Yuan, Fei; Zhang, Yu-Hang; Wan, Sibao; Wang, ShaoPeng; Kong, Xiang-Yin

    2015-01-01

    Pancreatic cancer (PC) is a highly malignant tumor derived from pancreas tissue and is one of the leading causes of death from cancer. Its molecular mechanism has been partially revealed by validating its oncogenes and tumor suppressor genes; however, the available data remain insufficient for medical workers to design effective treatments. Large-scale identification of PC-related genes can promote studies on PC. In this study, we propose a computational method for mining new candidate PC-related genes. A large network was constructed using protein-protein interaction information, and a shortest path approach was applied to mine new candidate genes based on validated PC-related genes. In addition, a permutation test was adopted to further select key candidate genes. Finally, for all discovered candidate genes, the likelihood that the genes are novel PC-related genes is discussed based on their currently known functions. PMID:26613085
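
    The shortest path step described in this abstract can be illustrated with a minimal sketch. It assumes an unweighted, undirected network and plain breadth-first search, which may differ from the weighting and algorithm the authors used; the protein names and the toy network are hypothetical placeholders, and the permutation test mentioned in the abstract is not implemented here.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, source, target):
    """Return one shortest path from source to target via breadth-first search."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # no path connects the two proteins


def candidate_genes(graph, known_genes):
    """Count how often each non-seed node lies inside a shortest path between seeds."""
    counts = {}
    for a, b in combinations(sorted(known_genes), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for node in path[1:-1]:  # interior nodes only
            if node not in known_genes:
                counts[node] = counts.get(node, 0) + 1
    return counts


# Hypothetical toy network (adjacency lists); K1..K3 stand in for validated genes.
toy_ppi = {
    "K1": ["X"],
    "K2": ["X", "Y"],
    "K3": ["Y"],
    "X": ["K1", "K2"],
    "Y": ["K2", "K3"],
}
result = candidate_genes(toy_ppi, {"K1", "K2", "K3"})
print(result)
```

    Nodes with high counts are the ones a permutation test would then compare against counts obtained from randomly chosen seed sets.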

  8. SeqGene: a comprehensive software solution for mining exome- and transcriptome- sequencing data

    PubMed Central

    2011-01-01

    Background The popularity of massively parallel exome and transcriptome sequencing projects demands new data mining tools with a comprehensive set of features to support a wide range of analysis tasks. Results SeqGene, a new data mining tool, supports mutation detection and annotation, dbSNP and 1000 Genomes data integration, RNA-Seq expression quantification, mutation and coverage visualization, allele specific expression (ASE), differentially expressed genes (DEGs) identification, copy number variation (CNV) analysis, and gene expression quantitative trait loci (eQTLs) detection. We also developed novel methods for testing the association between SNP and expression and identifying genotype-controlled DEGs. We showed that the results generated from SeqGene compare favourably to other existing methods in our case studies. Conclusion SeqGene is designed as a general-purpose software package. It supports both paired-end reads and single reads generated on most sequencing platforms; it runs on all major types of computers; it supports arbitrary genome assemblies for arbitrary organisms; and it scales well to support both large and small scale sequencing projects. The software homepage is http://seqgene.sourceforge.net. PMID:21714929

  9. RESEARCH PAPERS : Ionospheric signature of surface mine blasts from Global Positioning System measurements

    NASA Astrophysics Data System (ADS)

    Calais, Eric; Bernard Minster, J.; Hofton, Michelle; Hedlin, Michael

    1998-01-01

    Sources such as atmospheric or buried explosions and shallow earthquakes are known to produce infrasonic pressure waves in the atmosphere. Because of the coupling between neutral particles and electrons at ionospheric altitudes, these acoustic and gravity waves induce variations of the ionospheric electron density. The Global Positioning System (GPS) provides a way of directly measuring the total electron content in the ionosphere and, therefore, of detecting such perturbations in the upper atmosphere. In July and August 1996, three large surface mine blasts (1.5 Kt each) were detonated at the Black Thunder coal mine in eastern Wyoming. As part of a seismic and acoustic monitoring experiment, we deployed five dual-frequency GPS receivers at distances ranging from 50 to 200 km from the mine and were able to detect the ionospheric perturbation caused by the blasts. The perturbation starts 10 to 15 min after the blast, lasts for about 30 min, and propagates with an apparent horizontal velocity of 1200 m s⁻¹. Its amplitude reaches 3 × 10¹⁴ el m⁻² in the 7-3 min period band, a value close to the ionospheric perturbation caused by the M=6.7 Northridge earthquake (Calais & Minster 1995). The small signal-to-noise ratio of the perturbation can be improved by slant-stacking the electron content time-series recorded by the different GPS receivers taking into account the horizontal propagation of the perturbation. The energy of the perturbation is concentrated in the 200 to 300 s period band, a result consistent with previous observations and numerical model predictions. The 300 s band probably corresponds to gravity modes and shorter periods to acoustic modes, respectively. Using a 1-D stratified velocity model of the atmosphere we show that linear acoustic ray tracing fits arrival times at all GPS receivers. We interpret the perturbation as a direct acoustic wave caused by the explosion itself. This study shows that even relatively small subsurface events can produce

  10. Ionospheric Signature of Surface Mine Blasts from Global Positioning System Measurements

    NASA Technical Reports Server (NTRS)

    Calais, Eric; Minster, J. Bernard; Hofton, Michelle A.; Hedlin, Michael A. H.

    1998-01-01

    Sources such as atmospheric or buried explosions and shallow earthquakes are known to produce infrasonic pressure waves in the atmosphere. Because of the coupling between neutral particles and electrons at ionospheric altitudes, these acoustic and gravity waves induce variations of the ionospheric electron density. The Global Positioning System (GPS) provides a way of directly measuring the total electron content in the ionosphere and, therefore, of detecting such perturbations in the upper atmosphere. In July and August 1996, three large surface mine blasts (1.5 Kt each) were detonated at the Black Thunder coal mine in eastern Wyoming. As part of a seismic and acoustic monitoring experiment, we deployed five dual-frequency GPS receivers at distances ranging from 50 to 200 km from the mine and were able to detect the ionospheric perturbation caused by the blasts. The perturbation starts 10 to 15 min after the blast, lasts for about 30 min, and propagates with an apparent horizontal velocity of 1200 meters per second. Its amplitude reaches 3 × 10¹⁴ el per square meter in the 7-3 min period band, a value close to the ionospheric perturbation caused by the M = 6.7 Northridge earthquake. The small signal-to-noise ratio of the perturbation can be improved by slant-stacking the electron content time-series recorded by the different GPS receivers taking into account the horizontal propagation of the perturbation. The energy of the perturbation is concentrated in the 200 to 300 second period band, a result consistent with previous observations and numerical model predictions. The 300 second band probably corresponds to gravity modes and shorter periods to acoustic modes, respectively. Using a 1-D stratified velocity model of the atmosphere we show that linear acoustic ray tracing fits arrival times at all GPS receivers. We interpret the perturbation as a direct acoustic wave caused by the explosion itself. This study shows that even relatively small subsurface
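
    The slant-stacking idea in this abstract can be sketched in a few lines. At the reported apparent horizontal velocity of 1200 m/s, a receiver 50 km from the mine sees the perturbation about 42 s after it passes the source, and one at 200 km about 167 s after, so each series is shifted by its own delay before averaging. The sketch below assumes a single known velocity (a full slant stack would scan over candidate velocities), and the receiver distances, sampling interval, and synthetic pulse series are hypothetical.

```python
def slant_stack(series_by_distance, velocity_m_s, dt_s):
    """Shift each series back by its travel-time delay, then average sample-wise."""
    shifted = []
    for distance_m, samples in series_by_distance:
        lag = round(distance_m / velocity_m_s / dt_s)  # delay in whole samples
        shifted.append(samples[lag:])
    length = min(len(s) for s in shifted)  # keep only the overlapping part
    return [sum(s[i] for s in shifted) / len(shifted) for i in range(length)]


# Hypothetical receivers at 60, 120 and 180 km, sampled every 50 s.
# Each synthetic series holds one unit pulse that arrives later with distance.
records = [
    (60_000, [0, 0, 0, 1, 0, 0, 0, 0]),
    (120_000, [0, 0, 0, 0, 1, 0, 0, 0]),
    (180_000, [0, 0, 0, 0, 0, 1, 0, 0]),
]
stacked = slant_stack(records, velocity_m_s=1200, dt_s=50)
print(stacked)
```

    After the shifts the three pulses line up on the same sample, so the coherent signal keeps its amplitude in the average while uncorrelated noise in real data would tend to cancel.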

  11. Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets

    PubMed Central

    2014-01-01

    Background Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624
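
    The edge weight described in this abstract, a linear combination of two similarities between coexpression links, can be sketched as follows. The abstract does not define either similarity, so both definitions here are assumptions made for illustration: topological similarity is taken as the Jaccard overlap of the two links' endpoint genes, and co-appearance similarity as the Jaccard overlap of the datasets in which each link occurs. The mixing parameter alpha and all gene and dataset names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_weight(link_a, link_b, datasets_a, datasets_b, alpha=0.5):
    """Weight of the edge between two coexpression links in the hybrid graph.

    alpha balances the (assumed) topological term against the
    (assumed) co-appearance term.
    """
    topological = jaccard(link_a, link_b)
    co_appearance = jaccard(datasets_a, datasets_b)
    return alpha * topological + (1 - alpha) * co_appearance


# Two links that share gene g2 and co-occur in two of three datasets.
weight = hybrid_weight(
    ("g1", "g2"), ("g2", "g3"),
    {"d1", "d2", "d3"}, {"d2", "d3"},
    alpha=0.5,
)
print(weight)
```

    With alpha set to 1 the graph reduces to a purely topological one, and with alpha set to 0 it depends only on how often links recur together across datasets.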

  12. Regulation of global gene expression and cell proliferation by APP.

    PubMed

    Wu, Yili; Zhang, Si; Xu, Qin; Zou, Haiyan; Zhou, Weihui; Cai, Fang; Li, Tingyu; Song, Weihong

    2016-01-01

    Down syndrome (DS), caused by trisomy of chromosome 21, is one of the most common genetic disorders. Patients with DS display growth retardation and inevitably develop characteristic Alzheimer's disease (AD) neuropathology, including neurofibrillary tangles and neuritic plaques. The expression of amyloid precursor protein (APP) is increased in both DS and AD patients. To reveal the function of APP and elucidate the pathogenic role of increased APP expression in DS and AD, we performed gene expression profiling using microarray method in human cells overexpressing APP. A set of genes are significantly altered, which are involved in cell cycle, cell proliferation and p53 signaling. We found that overexpression of APP inhibits cell proliferation. Furthermore, we confirmed that the downregulation of two validated genes, PSMA5 and PSMB7, inhibits cell proliferation, suggesting that the downregulation of PSMA5 and PSMB7 is involved in APP-induced cell proliferation impairment. Taken together, this study suggests that APP regulates global gene expression and increased APP expression inhibits cell proliferation. Our study provides a novel insight that APP overexpression may contribute to the growth impairment in DS patients and promote AD pathogenesis by inhibiting cell proliferation including neural stem cell proliferation and neurogenesis. PMID:26936520

  13. Regulation of global gene expression and cell proliferation by APP

    PubMed Central

    Wu, Yili; Zhang, Si; Xu, Qin; Zou, Haiyan; Zhou, Weihui; Cai, Fang; Li, Tingyu; Song, Weihong

    2016-01-01

    Down syndrome (DS), caused by trisomy of chromosome 21, is one of the most common genetic disorders. Patients with DS display growth retardation and inevitably develop characteristic Alzheimer’s disease (AD) neuropathology, including neurofibrillary tangles and neuritic plaques. The expression of amyloid precursor protein (APP) is increased in both DS and AD patients. To reveal the function of APP and elucidate the pathogenic role of increased APP expression in DS and AD, we performed gene expression profiling using microarray method in human cells overexpressing APP. A set of genes are significantly altered, which are involved in cell cycle, cell proliferation and p53 signaling. We found that overexpression of APP inhibits cell proliferation. Furthermore, we confirmed that the downregulation of two validated genes, PSMA5 and PSMB7, inhibits cell proliferation, suggesting that the downregulation of PSMA5 and PSMB7 is involved in APP-induced cell proliferation impairment. Taken together, this study suggests that APP regulates global gene expression and increased APP expression inhibits cell proliferation. Our study provides a novel insight that APP overexpression may contribute to the growth impairment in DS patients and promote AD pathogenesis by inhibiting cell proliferation including neural stem cell proliferation and neurogenesis. PMID:26936520

  14. Mining the transcriptomes of four commercially important shellfish species for single nucleotide polymorphisms within biomineralization genes.

    PubMed

    Vendrami, David L J; Shah, Abhijeet; Telesca, Luca; Hoffman, Joseph I

    2016-06-01

    Transcriptional profiling not only provides insights into patterns of gene expression, but also generates sequences that can be mined for molecular markers, which in turn can be used for population genetic studies. As part of a large-scale effort to better understand how commercially important European shellfish species may respond to ocean acidification, we therefore mined the transcriptomes of four species (the Pacific oyster Crassostrea gigas, the blue mussel Mytilus edulis, the great scallop Pecten maximus and the blunt gaper Mya truncata) for single nucleotide polymorphisms (SNPs). Illumina data for C. gigas, M. edulis and P. maximus and 454 data for M. truncata were interrogated using GATK and SWAP454 respectively to identify between 8,267 and 47,159 high quality SNPs per species (total = 121,053 SNPs residing within 34,716 different contigs). We then annotated the transcripts containing SNPs to reveal homology to diverse genes. Finally, as oceanic pH affects the ability of organisms to incorporate calcium carbonate, we homed in on genes implicated in the biomineralization process to identify a total of 1,899 SNPs in 157 genes. These provide good candidates for biomarkers with which to study patterns of selection in natural or experimental populations. PMID:26806806

  15. Mining of vaccine-associated IFN-γ gene interaction networks using the Vaccine Ontology

    PubMed Central

    2011-01-01

    Background Interferon-gamma (IFN-γ) is vital in vaccine-induced immune defense against bacterial and viral infections and tumor. Our recent study demonstrated the power of a literature-based discovery method in extraction and comparison of the IFN-γ and vaccine-mediated gene interaction networks. The Vaccine Ontology (VO) contains a hierarchy of vaccine names. It is hypothesized that the application of VO will enhance the prediction of IFN-γ and vaccine-mediated gene interaction network. Results In this study, 186 specific vaccine names listed in the Vaccine Ontology (VO) and their semantic relations were used for possible improved retrieval of the IFN-γ and vaccine associated gene interactions. The application of VO allows discovery of 38 more genes and 60 more interactions. Comparison of different layers of IFN-γ networks and the example BCG vaccine-induced subnetwork led to generation of new hypotheses. By analyzing all discovered genes using centrality metrics, 32 genes were ranked high in the VO-based IFN-γ vaccine network using four centrality scores. Furthermore, 28 specific vaccines were found to be associated with these top 32 genes. These specific vaccine-gene associations were further used to generate a network of vaccine-vaccine associations. The BCG and LVS vaccines are found to be the most central vaccines in the vaccine-vaccine association network. Conclusion Our results demonstrate that the combined usages of biomedical ontologies and centrality-based literature mining are able to significantly facilitate discovery of gene interaction networks and gene-concept associations. Availability VO is available at: http://www.violinet.org/vaccineontology; and the SVM edit kernel for gene interaction extraction is available at: http://www.violinet.org/ifngvonet/int_ext_svm.zip PMID:21624163
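
    Ranking genes by a centrality score, as this abstract describes, can be illustrated with a short sketch. The abstract does not name its four centrality measures, so degree centrality is used here purely as an assumed example; the interaction list and the gene names other than IFNG are hypothetical.

```python
def degree_centrality(edges):
    """Fraction of the other nodes that each node is directly linked to."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    denominator = max(len(neighbors) - 1, 1)  # avoid dividing by zero
    return {node: len(adj) / denominator for node, adj in neighbors.items()}


# Hypothetical gene-gene interactions extracted from the literature.
interactions = [
    ("IFNG", "GENE_A"),
    ("IFNG", "GENE_B"),
    ("IFNG", "GENE_C"),
    ("GENE_A", "GENE_B"),
]
centrality = degree_centrality(interactions)

# Highest score first; ties are broken alphabetically for a stable order.
ranked = sorted(centrality, key=lambda gene: (-centrality[gene], gene))
print(ranked)
```

    Combining several such scores, as the study does, would amount to computing each one over the same network and keeping genes that rank high under all of them.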

  16. Topological origin of global attractors in gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhang, YunJun; Ouyang, Qi; Geng, Zhi

    2015-02-01

    Fixed-point attractors with global stability manifest themselves in a number of gene regulatory networks. This property indicates the stability of regulatory networks against small state perturbations and is closely related to other complex dynamics. In this paper, we aim to reveal the core modules in regulatory networks that determine their global attractors and the relationship between these core modules and other motifs. This work has been done via three steps. Firstly, inspired by the signal transmission in the regulation process, we extract the model of chain-like network from regulation networks. We propose a module of "ideal transmission chain (ITC)", which is proved sufficient and necessary (under certain condition) to form a global fixed-point in the context of chain-like network. Secondly, by examining two well-studied regulatory networks (i.e., the cell-cycle regulatory networks of Budding yeast and Fission yeast), we identify the ideal modules in true regulation networks and demonstrate that the modules have a superior contribution to network stability (quantified by the relative size of the biggest attraction basin). Thirdly, in these two regulation networks, we find that the double negative feedback loops, which are the key motifs of forming bistability in regulation, are connected to these core modules with high network stability. These results have shed new light on the connection between the topological feature and the dynamic property of regulatory networks.
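
    The stability measure mentioned in this abstract, the relative size of the biggest attraction basin, can be computed by brute force for a small network. The sketch below assumes a synchronous Boolean model and uses a hypothetical three-node chain in which the first node switches off and each later node copies its predecessor; this toy rule is an illustration only and is not the paper's definition of an ideal transmission chain.

```python
from itertools import product


def step(state):
    """Hypothetical chain rule: node 0 turns off, nodes 1 and 2 copy their predecessor."""
    x0, x1, _ = state
    return (0, x0, x1)


def attractor_of(state, update):
    """Iterate until a state repeats and return the resulting cycle of states."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = update(state)
    return frozenset(seen[seen.index(state):])


def basin_sizes(n_nodes, update):
    """Map each attractor to the number of initial states that flow into it."""
    sizes = {}
    for state in product((0, 1), repeat=n_nodes):
        attractor = attractor_of(state, update)
        sizes[attractor] = sizes.get(attractor, 0) + 1
    return sizes


sizes = basin_sizes(3, step)
largest_relative_basin = max(sizes.values()) / 2 ** 3
print(sizes, largest_relative_basin)
```

    In this toy chain every one of the eight states drains into the all-off state, so the single fixed point is a global attractor and the relative basin size is 1.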

  17. Distributed Function Mining for Gene Expression Programming Based on Fast Reduction.

    PubMed

    Deng, Song; Yue, Dong; Yang, Le-chan; Fu, Xiong; Feng, Ya-zhou

    2016-01-01

    For high-dimensional and massive data sets, traditional centralized gene expression programming (GEP) or improved algorithms lead to increased run-time and decreased prediction accuracy. To solve this problem, this paper proposes a new improved algorithm called distributed function mining for gene expression programming based on fast reduction (DFMGEP-FR). In DFMGEP-FR, fast attribution reduction in binary search algorithms (FAR-BSA) is proposed to quickly find the optimal attribution set, and the function consistency replacement algorithm is given to solve integration of the local function model. Thorough comparative experiments for DFMGEP-FR, centralized GEP and the parallel gene expression programming algorithm based on simulated annealing (parallel GEPSA) are included in this paper. For the waveform, mushroom, connect-4 and musk datasets, the comparative results show that the average time-consumption of DFMGEP-FR drops by 89.09%, 88.85%, 85.79% and 93.06%, respectively, in contrast to centralized GEP and by 12.5%, 8.42%, 9.62% and 13.75%, respectively, compared with parallel GEPSA. Six well-studied UCI test data sets demonstrate the efficiency and capability of our proposed DFMGEP-FR algorithm for distributed function mining. PMID:26751200

  18. Stress-Survival Gene Identification From an Acid Mine Drainage Algal Mat Community

    NASA Astrophysics Data System (ADS)

    Urbina-Navarrete, J.; Fujishima, K.; Paulino-Lima, I. G.; Rothschild-Mancinelli, B.; Rothschild, L. J.

    2014-12-01

    Microbial communities from acid mine drainage environments are exposed to multiple stressors, including low pH, high dissolved metal loads, seasonal freezing, and desiccation. The microbial and algal communities that inhabit these niche environments have evolved strategies that allow for their ecological success. Metagenomic analyses are useful in identifying species diversity; however, they do not elucidate the mechanisms that allow for the resilience of a community under these extreme conditions. Many known or predicted genes encode for protein products that are unknown, or similarly, many proteins cannot be traced to their gene of origin. This investigation seeks to identify genes that are active in an algal consortium during stress from living in an acid mine drainage environment. Our approach involves using the entire community transcriptome for a functional screen in an Escherichia coli host. This approach directly targets the genes involved in survival, without need for characterizing the members of the consortium. The consortium was harvested and stressed with conditions similar to the native environment it was collected from. Exposure to low pH (< 3.2), high metal load, desiccation, and deep freeze resulted in the expression of stress-induced genes that were transcribed into messenger RNA (mRNA). These mRNA transcripts were harvested to build complementary DNA (cDNA) libraries in E. coli. The transformed E. coli were exposed to the same stressors as the original algal consortium to select for surviving cells. Successful cells incorporated the transcripts that encode survival mechanisms, thus allowing for selection and identification of the gene(s) involved. Initial selection screens for freeze and desiccation tolerance have yielded E. coli that are 1 order of magnitude more resistant to freezing (0.01% survival of control with no transcript, 0.2% survival of E. coli with transcript) and 3 orders of magnitude more resistant to desiccation (0.005% survival of

  19. PoplarGene: poplar gene network and resource for mining functional information for genes from woody plants

    PubMed Central

    Liu, Qi; Ding, Changjun; Chu, Yanguang; Chen, Jiafei; Zhang, Weixi; Zhang, Bingyu; Huang, Qinjun; Su, Xiaohua

    2016-01-01

    Poplar is not only an important resource for the production of paper, timber and other wood-based products, but it has also emerged as an ideal model system for studying woody plants. To better understand the biological processes underlying various traits in poplar, e.g., wood development, a comprehensive functional gene interaction network is highly needed. Here, we constructed a genome-wide functional gene network for poplar (covering ~70% of the 41,335 poplar genes) and created the network web service PoplarGene, offering comprehensive functional interactions and extensive poplar gene functional annotations. PoplarGene incorporates two network-based gene prioritization algorithms, neighborhood-based prioritization and context-based prioritization, which can be used to perform gene prioritization in a complementary manner. Furthermore, the co-functional information in PoplarGene can be applied to other woody plant proteomes with high efficiency via orthology transfer. In addition to poplar gene sequences, the webserver also accepts Arabidopsis reference gene as input to guide the search for novel candidate functional genes in PoplarGene. We believe that PoplarGene (http://bioinformatics.caf.ac.cn/PoplarGene and http://124.127.201.25/PoplarGene) will greatly benefit the research community, facilitating studies of poplar and other woody plants. PMID:27515999

  20. PoplarGene: poplar gene network and resource for mining functional information for genes from woody plants.

    PubMed

    Liu, Qi; Ding, Changjun; Chu, Yanguang; Chen, Jiafei; Zhang, Weixi; Zhang, Bingyu; Huang, Qinjun; Su, Xiaohua

    2016-01-01

    Poplar is not only an important resource for the production of paper, timber and other wood-based products, but it has also emerged as an ideal model system for studying woody plants. To better understand the biological processes underlying various traits in poplar, e.g., wood development, a comprehensive functional gene interaction network is highly needed. Here, we constructed a genome-wide functional gene network for poplar (covering ~70% of the 41,335 poplar genes) and created the network web service PoplarGene, offering comprehensive functional interactions and extensive poplar gene functional annotations. PoplarGene incorporates two network-based gene prioritization algorithms, neighborhood-based prioritization and context-based prioritization, which can be used to perform gene prioritization in a complementary manner. Furthermore, the co-functional information in PoplarGene can be applied to other woody plant proteomes with high efficiency via orthology transfer. In addition to poplar gene sequences, the webserver also accepts Arabidopsis reference gene as input to guide the search for novel candidate functional genes in PoplarGene. We believe that PoplarGene (http://bioinformatics.caf.ac.cn/PoplarGene and http://124.127.201.25/PoplarGene) will greatly benefit the research community, facilitating studies of poplar and other woody plants. PMID:27515999

  1. Data Mining for Global Change: A Vision for "Big Data" in the Earth Sciences

    NASA Astrophysics Data System (ADS)

    Steinhaeuser, K.

    2012-12-01

    Over the past several decades, the Earth sciences have undergone a rapid transformation from a historically data-poor to a relatively data-rich environment. This development is largely due to significant improvements in observation technologies (notably satellites since the 1970s) on one hand, and advances in computational tools (both hardware and software) on the other. As a result the Earth sciences are primed to enter the Fourth Paradigm, a term coined by the late Jim Gray to describe a new realm of scientific discovery driven by data analysis - the other three being theory, experimentation, and computer simulation. In particular, observations from remote sensors on satellites and weather radars, in situ sensors and sensor networks, along with outputs of global climate or Earth system models from large-scale simulations as well as regional modeling studies, produce data approaching the Tera- and Petabyte scales. These massive and information-rich datasets offer a significant opportunity for advancing our understanding of the global climate system and in turn our ability to make better informed projections of future climate change, yet current data analysis techniques are not able to realize their full potential. We will outline a vision for the application of "Big Data" tools and technologies in the Earth sciences, which have the potential to make a transformative impact on the toolbox available to the scientist as well as the way science is conducted. For instance, data mining and machine learning could provide novel computational tools that empower scientists to perform analyses more efficiently and effectively than ever before: tedious routine tasks become automated, existing methods scale to significantly larger datasets, and innovative methods may provide new capabilities altogether. Most notably we are not interested in leveraging computation for simulations of increasing scale or resolution but rather in the analysis of datasets of increasing size and

  2. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses.

    PubMed

    Stelzer, Gil; Rosen, Naomi; Plaschkes, Inbar; Zimmerman, Shahar; Twik, Michal; Fishilevich, Simon; Stein, Tsippi Iny; Nudel, Ron; Lieder, Iris; Mazor, Yaron; Kaplan, Sergey; Dahary, Dvir; Warshawsky, David; Guan-Golan, Yaron; Kohn, Asher; Rappaport, Noa; Safran, Marilyn; Lancet, Doron

    2016-01-01

    GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc. PMID:27322403

  3. Global gene expression response to telomerase in bovine adrenocortical cells

    SciTech Connect

    Perrault, Steven D.; Hornsby, Peter J.; Betts, Dean H. (E-mail: bettsd@uoguelph.ca)

    2005-09-30

    The infinite proliferative capability of most immortalized cells is dependent upon the presence of the enzyme telomerase and its ability to maintain telomere length and structure. However, telomerase may be involved in a greater system than telomere length regulation, as recent evidence has shown it capable of increasing wound healing in vivo, and improving cellular proliferation rate and survival from apoptosis in vitro. Here, we describe the global gene expression response to ectopic telomerase expression in an in vitro bovine adrenocortical cell model. Telomerase-immortalized cells showed an increased ability for proliferation and survival in minimal essential medium above cells transgenic for GFP. cDNA microarray analyses revealed an altered cell state indicative of increased adrenocortical cell proliferation regulated by the IGF2 pathway and alterations in members of the TGF-β family. As well, we identified alterations in genes associated with development and wound healing that support a model that high telomerase expression induces a highly adaptable, progenitor-like state.

  4. Differential global gene expression in red and white skeletal muscle

    NASA Technical Reports Server (NTRS)

    Campbell, W. G.; Gordon, S. E.; Carlson, C. J.; Pattison, J. S.; Hamilton, M. T.; Booth, F. W.

    2001-01-01

    The differences in gene expression among the fiber types of skeletal muscle have long fascinated scientists, but for the most part, previous experiments have only reported differences of one or two genes at a time. The evolving technology of global mRNA expression analysis was employed to determine the potential differential expression of approximately 3,000 mRNAs between the white quad (white muscle) and the red soleus muscle (mixed red muscle) of female ICR mice (30-35 g). Microarray analysis identified 49 mRNA sequences that were differentially expressed between white and mixed red skeletal muscle, including newly identified differential expressions between muscle types. For example, the current findings increase the number of known, differentially expressed mRNAs for transcription factors/coregulators by nine and signaling proteins by three. The expanding knowledge of the diversity of mRNA expression between white and mixed red muscle suggests that there could be quite a complex regulation of phenotype between muscles of different fiber types.

  5. Differential gene expression in Iberian green frogs (Pelophylax perezi) inhabiting a deactivated uranium mine.

    PubMed

    Marques, Sérgio M; Chaves, Sandra; Gonçalves, Fernando; Pereira, Ruth

    2013-01-01

    Iberian green frogs (Pelophylax perezi) were found inhabiting a deactivated uranium mine, especially an effluent pond, seriously contaminated with metals and radionuclides. These animals were previously assessed for oxidative stress parameters and did not reveal significant alterations. In order to better understand which mechanisms may be involved in the ability to withstand permanent contamination, gene expression analysis was performed in the liver through suppression subtractive hybridization (SSH). The SSH outcome in the liver revealed the up-regulation of genes coding for the ribosomal protein L7a and for several proteins typical of blood plasma: fibrinogen, hemoglobin and albumin. Besides their normal function, some of these proteins can play an important role as protective agents against oxidative stress. This work provides new insights on possible basal protection mechanisms that may act in organisms exposed chronically to contamination. PMID:23146668

  6. Bioactivity-guided genome mining reveals the lomaiviticin biosynthetic gene cluster in Salinispora tropica

    PubMed Central

    Kersten, Roland D.; Lane, Amy L.; Nett, Markus; Richter, Taylor K. S.; Duggan, Brendan M.; Dorrestein, Pieter C.

    2013-01-01

    The use of genome sequences has become routine in guiding the discovery and identification of microbial natural products and their biosynthetic pathways. In silico prediction of molecular features, such as metabolic building blocks, physico-chemical properties or biological functions, from orphan gene clusters has opened up the characterization of many new chemo- and genotypes in genome mining approaches. Here, we guided our genome mining of two predicted enediyne pathways in Salinispora tropica CNB-440 by a DNA interference bioassay to isolate DNA-targeting enediyne polyketides. An organic extract of S. tropica showed DNA-interference activity that surprisingly was not abolished in genetic mutants of the targeted enediyne pathways, ST_pks1 and spo. Instead we showed that the product of the orphan type II polyketide synthase pathway, ST_pks2, is solely responsible for the DNA-interfering activity of the parent strain. Subsequent comparative metabolic profiling revealed the lomaiviticins, glycosylated diazofluorene polyketides, as the ST_pks2 products. This study marks the first report of the 59 open reading frame lomaiviticin gene cluster (lom) and supports the biochemical logic of their dimeric construction via a pathway related to the kinamycin monomer. PMID:23649992

  7. Mining biological information from 3D short time-series gene expression data: the OPTricluster algorithm

    PubMed Central

    2012-01-01

    Background Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space. Results We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and to Brassica napus whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples. Conclusions Our analysis showed that OPTricluster generally outperforms other well known clustering algorithms such as the TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data. PMID:22475802
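
    The order-preserving (OP) concept on the time dimension can be illustrated with a minimal Python sketch. This is not the OPTricluster implementation; it only groups genes whose expression values induce the same ordering of time points within a single sample. The function names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def time_order(profile):
    """Return the time-point indices sorted by increasing expression."""
    return tuple(sorted(range(len(profile)), key=lambda t: profile[t]))


def group_by_order(expression):
    """Group gene names by the ordering their profiles induce on time points."""
    groups = defaultdict(list)
    for gene, profile in expression.items():
        groups[time_order(profile)].append(gene)
    return dict(groups)


# Toy data (hypothetical): three time points per gene in one sample.
toy = {
    "geneA": [1.0, 2.5, 4.0],
    "geneB": [0.2, 0.9, 1.1],
    "geneC": [3.0, 1.0, 2.0],
}
groups = group_by_order(toy)
```

    In this toy case geneA and geneB share a steadily increasing pattern even though their magnitudes differ, which is the kind of coherent evolution an order-preserving approach is meant to capture.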

  8. Analyzing Large Gene Expression and Methylation Data Profiles Using StatBicRM: Statistical Biclustering-Based Rule Mining

    PubMed Central

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rules and potential biomarkers from biological datasets using integrated approaches of statistical and binary inclusion-maximal biclustering techniques. At first, a novel statistical strategy is used to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy and other related factors with other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post-discretized data

  9. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    PubMed

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rules and potential biomarkers from biological datasets using integrated approaches of statistical and binary inclusion-maximal biclustering techniques. At first, a novel statistical strategy is used to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy and other related factors with other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post-discretized data

  10. Mining royalties: a global study of their impact on investors, government and civil society

    SciTech Connect

    Otto, James

    2006-08-15

    The book discusses the history of royalties and the types currently in use, covering issues such as tax administration, revenue distribution and reporting. It identifies the strengths and weaknesses of various royalty approaches and their impact on production decisions and mine economics. A section on governance looks at the management of mining revenue by governments and the need for transparency. There is an attached CD with 4 appendixes with examples of royalty legislation from over 40 countries. 10 figs., 40 tabs., 4 apps.

  11. Variations in the progranulin gene affect global gene expression in frontotemporal lobar degeneration.

    PubMed

    Chen-Plotkin, Alice S; Geser, Felix; Plotkin, Joshua B; Clark, Chris M; Kwong, Linda K; Yuan, Wuxing; Grossman, Murray; Van Deerlin, Vivianna M; Trojanowski, John Q; Lee, Virginia M-Y

    2008-05-15

    Frontotemporal lobar degeneration is a fatal neurodegenerative disease that results in progressive decline in behavior, executive function and sometimes language. Disease mechanisms remain poorly understood. Recently, however, the DNA- and RNA-binding protein TDP-43 has been identified as the major protein present in the hallmark inclusion bodies of frontotemporal lobar degeneration with ubiquitinated inclusions (FTLD-U), suggesting a role for transcriptional dysregulation in FTLD-U pathophysiology. Using the Affymetrix U133A microarray platform, we profiled global gene expression in both histopathologically affected and unaffected areas of human FTLD-U brains. We then characterized differential gene expression with biological pathway analyses, cluster and principal component analyses, and subgroup analyses based on brain region and progranulin (GRN) gene status. Comparing 17 FTLD-U brains to 11 controls, we identified 414 upregulated and 210 downregulated genes in frontal cortex (P-value < 0.001). Moreover, cluster and principal component analyses revealed that samples with mutations or possibly pathogenic variations in the GRN gene (GRN+, 7/17) had an expression signature that was distinct from both normal controls and FTLD-U samples lacking GRN gene variations (GRN-, 10/17). Within the subgroup of GRN+ FTLD-U, we found >1300 dysregulated genes in frontal cortex (P-value < 0.001), many participating in pathways uniquely dysregulated in the GRN+ cases. Our findings demonstrate a distinct molecular phenotype for GRN+ FTLD-U, not readily apparent on clinical or histopathological examination, suggesting distinct pathophysiological mechanisms for GRN+ and GRN- subtypes of FTLD-U. In addition, these data from a large number of human brains provide a valuable resource for future testing of disease hypotheses. PMID:18223198

  12. Mining cancer gene expression databases for latent information on intronic microRNAs.

    PubMed

    Monterisi, Simona; D'Ario, Giovanni; Dama, Elisa; Rotmensz, Nicole; Confalonieri, Stefano; Tordonato, Chiara; Troglio, Flavia; Bertalot, Giovanni; Maisonneuve, Patrick; Viale, Giuseppe; Nicassio, Francesco; Vecchi, Manuela; Di Fiore, Pier Paolo; Bianchi, Fabrizio

    2015-02-01

    Around 50% of all human microRNAs reside within introns of coding genes and are usually co-transcribed. Gene expression datasets, therefore, should contain a wealth of miRNA-relevant latent information, exploitable for many basic and translational research aims. The present study was undertaken to investigate this possibility. We developed an in silico approach to identify intronic-miRNAs relevant to breast cancer, using public gene expression datasets. This led to the identification of a miRNA signature for aggressive breast cancer, and to the characterization of novel roles of selected miRNAs in cancer-related biological phenotypes. Unexpectedly, in a number of cases, expression regulation of the intronic-miRNA was more relevant than the expression of their host gene. These results provide a proof of principle for the validity of our intronic miRNA mining strategy, which we envision can be applied not only to cancer research, but also to other biological and biomedical fields. PMID:25459350

  13. The future of Yellowcake: a global assessment of uranium resources and mining.

    PubMed

    Mudd, Gavin M

    2014-02-15

    Uranium (U) mining remains controversial in many parts of the world, especially in a post-Fukushima context, and often in areas with significant U resources. Although nuclear proponents point to the relatively low carbon intensity of nuclear power compared to fossil fuels, opponents argue that this will be eroded in the future as ore grades decline and energy and greenhouse gas emissions (GGEs) intensity increases as a result. Invariably both sides fail to make use of the increasingly available data reported by some U mines through sustainability reporting - allowing a comprehensive assessment of recent trends in the energy and GGE intensity of U production, as well as combining this with reported mineral resources to allow more comprehensive modelling of future energy and GGEs intensity. In this study, detailed data sets are compiled on reported U resources by deposit type, as well as mine production, energy and GGE intensity. Some important aspects included are the relationship between ore grade, deposit type and recovery, which are crucial in future projections of U mining. Overall, the paper demonstrates that there are extensive U resources known to meet potential short to medium term demand, although the future of U mining remains uncertain due to the doubt about the future of nuclear power as well as a range of complex social, environmental, economic and some site-specific technical issues. PMID:24317167

  14. Gene mining in halophytes: functional identification of stress tolerance genes in Lepidium crassifolium.

    PubMed

    Rigó, Gábor; Valkai, Ildikó; Faragó, Dóra; Kiss, Edina; Van Houdt, Sara; Van de Steene, Nancy; Hannah, Matthew A; Szabados, László

    2016-09-01

    Extremophile plants are valuable sources of genes conferring tolerance traits, which can be explored to improve stress tolerance of crops. Lepidium crassifolium is a halophytic relative of the model plant Arabidopsis thaliana, and displays tolerance to salt, osmotic and oxidative stresses. We have employed the modified Conditional cDNA Overexpression System to transfer a cDNA library from L. crassifolium to the glycophyte A. thaliana. By screening for salt, osmotic and oxidative stress tolerance through in vitro growth assays and non-destructive chlorophyll fluorescence imaging, 20 Arabidopsis lines were identified with superior performance under restrictive conditions. Several cDNA inserts were cloned and confirmed to be responsible for the enhanced tolerance by analysing independent transgenic lines. Examples include full-length cDNAs encoding proteins with high homologies to GDSL-lipase/esterase or acyl CoA-binding protein or proteins without known function, which could confer tolerance to one or several stress conditions. Our results confirm that random gene transfer from stress tolerant to sensitive plant species is a valuable tool to discover novel genes with potential for biotechnological applications. PMID:27343166

  15. How to learn about gene function: text-mining or ontologies?

    PubMed

    Soldatos, Theodoros G; Perdigão, Nelson; Brown, Nigel P; Sabir, Kenneth S; O'Donoghue, Seán I

    2015-03-01

    As the amount of genome information increases rapidly, there is a correspondingly greater need for methods that provide accurate and automated annotation of gene function. For example, many high-throughput technologies--e.g., next-generation sequencing--are being used today to generate lists of genes associated with specific conditions. However, their functional interpretation remains a challenge, and many tools exist that try to characterize the function of gene-lists. Such systems typically rely on enrichment analysis and aim to give a quick insight into the underlying biology by presenting it in the form of a summary report. While the load of annotation may be alleviated by such computational approaches, the main challenge in modern annotation remains to develop a systems form of analysis in which a pipeline can effectively analyze gene-lists quickly and identify aggregated annotations through computerized resources. In this article we survey some of the many such tools and methods that have been developed to automatically interpret the biological functions underlying gene-lists. We overview current functional annotation aspects from the perspective of their epistemology (i.e., the underlying theories used to organize information about gene function into a body of verified and documented knowledge) and find that most of the currently used functional annotation methods fall broadly into one of two categories: they are based either on 'known' formally-structured ontology annotations created by 'experts' (e.g., the GO terms used to describe the function of Entrez Gene entries), or--perhaps more adventurously--on annotations inferred from literature (e.g., many text-mining methods use computer-aided reasoning to acquire knowledge represented in natural languages). Overall, however, deriving detailed and accurate insight from such gene lists remains a challenging task, and improved methods are called for. In particular, future methods need to (1) provide more holistic

  16. Phylogenomic study of lipid genes involved in microalgal biofuel production-candidate gene mining and metabolic pathway analyses.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

    2012-01-01

    Optimizing microalgal biofuel production using metabolic engineering tools requires an in-depth understanding of the structure-function relationship of genes involved in the lipid biosynthetic pathway. In the present study, genome-wide identification and characterization of 398 putative genes involved in lipid biosynthesis in Arabidopsis thaliana, Chlamydomonas reinhardtii, Volvox carteri, Ostreococcus lucimarinus, Ostreococcus tauri and Cyanidioschyzon merolae was undertaken on the basis of their conserved motif/domain organization and phylogenetic profile. The results indicated that the core lipid metabolic pathways in all the species are carried out by a comparable number of orthologous proteins. Although the fundamental gene organizations were observed to be invariantly conserved between the microalgal and Arabidopsis genomes, increasing genome complexity seems to be associated with a greater number of genes involved in triacylglycerol (TAG) biosynthesis and catabolism. Further, phylogenomic analysis of the genes provided insights into the molecular evolution of the lipid biosynthetic pathway in microalgae and confirmed the close evolutionary proximity between the Streptophyte and Chlorophyte lineages. Together, these studies will improve our understanding of the global lipid metabolic pathway and contribute to the engineering of regulatory networks of algal strains for higher accumulation of oil. PMID:23032611

  17. Novel nickel resistance genes from the rhizosphere metagenome of plants adapted to acid mine drainage.

    PubMed

    Mirete, Salvador; de Figueras, Carolina G; González-Pastor, Jose E

    2007-10-01

    Metal resistance determinants have traditionally been found in cultivated bacteria. To search for genes involved in nickel resistance, we analyzed the bacterial community of the rhizosphere of Erica andevalensis, an endemic heather which grows at the banks of the Tinto River, a naturally metal-enriched and extremely acidic environment in southwestern Spain. 16S rRNA gene sequence analysis of rhizosphere DNA revealed the presence of members of five phylogenetic groups of Bacteria and the two main groups of Archaea mostly associated with sites impacted by acid mine drainage (AMD). The diversity observed and the presence of heavy metals in the rhizosphere led us to construct and screen five different metagenomic libraries hosted in Escherichia coli for searching novel nickel resistance determinants. A total of 13 positive clones were detected and analyzed. Insights about their possible mechanisms of resistance were obtained from cellular nickel content and sequence similarities. Two clones encoded putative ABC transporter components, and a novel mechanism of metal efflux is suggested. In addition, a nickel hyperaccumulation mechanism is proposed for a clone encoding a serine O-acetyltransferase. Five clones encoded proteins similar to well-characterized proteins but not previously reported to be related to nickel resistance, and the remaining six clones encoded hypothetical or conserved hypothetical proteins of uncertain functions. This is the first report documenting nickel resistance genes recovered from the metagenome of an AMD environment. PMID:17675438

  18. Phylogenetic Diversity of Archaea and the Archaeal Ammonia Monooxygenase Gene in Uranium Mining-Impacted Locations in Bulgaria

    PubMed Central

    Radeva, Galina; Kenarova, Anelia; Bachvarova, Velina; Popov, Ivan; Selenska-Pobell, Sonja

    2014-01-01

    Uranium mining and milling activities adversely affect the microbial populations of impacted sites. The negative effects of uranium on soil bacteria and fungi are well studied, but little is known about the effects of radionuclides and heavy metals on archaea. The composition and diversity of archaeal communities inhabiting the waste pile of the Sliven uranium mine and the soil of the Buhovo uranium mine were investigated using 16S rRNA gene retrieval. A total of 355 archaeal clones were selected, and their 16S rDNA inserts were analysed by restriction fragment length polymorphism (RFLP) discriminating 14 different RFLP types. All evaluated archaeal 16S rRNA gene sequences belong to the 1.1b/Nitrososphaera cluster of Crenarchaeota. The composition of the archaeal community is distinct for each site of interest and dependent on environmental characteristics, including pollution levels. Since the members of 1.1b/Nitrososphaera cluster have been implicated in the nitrogen cycle, the archaeal communities from these sites were probed for the presence of the ammonia monooxygenase gene (amoA). Our data indicate that amoA gene sequences are distributed in a similar manner as in Crenarchaeota, suggesting that archaeal nitrification processes in uranium mining-impacted locations are under the control of the same key factors controlling archaeal diversity. PMID:24711725

  19. Genes Involved in the Evolution of Herbivory by a Leaf-Mining, Drosophilid Fly

    PubMed Central

    Whiteman, Noah K.; Gloss, Andrew D.; Sackton, Timothy B.; Groen, Simon C.; Humphrey, Parris T.; Lapoint, Richard T.; Sønderby, Ida E.; Halkier, Barbara A.; Kocks, Christine; Ausubel, Frederick M.; Pierce, Naomi E.

    2012-01-01

    Herbivorous insects are among the most successful radiations of life. However, we know little about the processes underpinning the evolution of herbivory. We examined the evolution of herbivory in the fly, Scaptomyza flava, whose larvae are leaf miners on species of Brassicaceae, including the widely studied reference plant, Arabidopsis thaliana (Arabidopsis). Scaptomyza flava is phylogenetically nested within the paraphyletic genus Drosophila, and the whole genome sequences available for 12 species of Drosophila facilitated phylogenetic analysis and assembly of a transcriptome for S. flava. A time-calibrated phylogeny indicated that leaf mining in Scaptomyza evolved between 6 and 16 million years ago. Feeding assays showed that biosynthesis of glucosinolates, the major class of antiherbivore chemical defense compounds in mustard leaves, was upregulated by S. flava larval feeding. The presence of glucosinolates in wild-type (WT) Arabidopsis plants reduced S. flava larval weight gain and increased egg–adult development time relative to flies reared in glucosinolate knockout (GKO) plants. An analysis of gene expression differences in 5-day-old larvae reared on WT versus GKO plants showed a total of 341 transcripts that were differentially regulated by glucosinolate uptake in larval S. flava. Of these, approximately a third corresponded to homologs of Drosophila melanogaster genes associated with starvation, dietary toxin-, heat-, oxidation-, and aging-related stress. The upregulated transcripts exhibited elevated rates of protein evolution compared with unregulated transcripts. The remaining differentially regulated transcripts also contained a higher proportion of novel genes than the unregulated transcripts. Thus, the transition to herbivory in Scaptomyza appears to be coupled with the evolution of novel genes and the co-option of conserved stress-related genes. PMID:22813779

  20. Mining locus tags in PubMed Central to improve microbial gene annotation

    PubMed Central

    2014-01-01

    Background The scientific literature contains millions of microbial gene identifiers within the full text and tables, but these annotations rarely get incorporated into public sequence databases. We propose to utilize the Open Access (OA) subset of PubMed Central (PMC) as a gene annotation database and have developed an R package called pmcXML to automatically mine and extract locus tags from full text, tables and supplements. Results We mined locus tags from 1835 OA publications in ten microbial genomes and extracted tags mentioned in 30,891 sentences in main text and 20,489 rows in tables. We identified locus tag pairs marking the start and end of a region such as an operon or genomic island and expanded these ranges to add another 13,043 tags. We also searched for locus tags in supplementary tables and publications outside the OA subset in Burkholderia pseudomallei K96243 for comparison. There were 168 publications containing 48,470 locus tags and 83% of mentions were from supplementary materials and 9% from publications outside the OA subset. Conclusions B. pseudomallei locus tags within the full text and tables of OA publications represent only a small fraction of the total mentions in the literature. For microbial genomes with very few functionally characterized proteins, the locus tags mentioned in supplementary tables and within ranges like genomic islands contain the majority of locus tags. Significantly, the functions in the R package provide access to additional resources in the OA subset that are not currently indexed or returned by searching PMC. PMID:24499370
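
    The mining and range-expansion steps described above can be sketched in a few lines of Python. This is not the pmcXML package (which is written in R) and does not reproduce its API; the tag format (an uppercase prefix followed by four digits, as in the hypothetical tags `BPSL0001` and `BPSS1234`), the function names and the example sentence are all assumptions made for illustration.

```python
import re

# Assumed tag format: 3-5 uppercase letters followed by four digits.
TAG = re.compile(r"\b([A-Z]{3,5})(\d{4})\b")
# A pair of tags joined by "-" or "to" is treated as marking a region.
RANGE = re.compile(r"\b([A-Z]{3,5})(\d{4})\s*(?:-|to)\s*([A-Z]{3,5})(\d{4})\b")


def expand_range(prefix, start, end):
    """List every tag from start to end (inclusive) for one prefix."""
    return [f"{prefix}{n:04d}" for n in range(start, end + 1)]


def mine_locus_tags(sentence):
    """Return tags mentioned in a sentence, expanding start/end pairs."""
    tags = []
    for m in RANGE.finditer(sentence):
        p1, start, p2, end = m.groups()
        # Only expand when both ends share a prefix and are in order.
        if p1 == p2 and int(start) <= int(end):
            tags.extend(expand_range(p1, int(start), int(end)))
    for m in TAG.finditer(sentence):
        if m.group(0) not in tags:
            tags.append(m.group(0))
    return tags


# Hypothetical sentence, not taken from any publication.
text = "The island spans BPSL0001-BPSL0004, and BPSS1234 was also induced."
found = mine_locus_tags(text)
```

    Expanding only pairs with a shared prefix is a conservative design choice; a real pipeline would also need to handle tags in tables and supplements, which this sketch ignores.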

  1. Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

    SciTech Connect

    Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali; Tuskan, Gerald A; Kalluri, Udaya C

    2011-01-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database, and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high-likelihood candidates for functional genomics in relation to cell wall biosynthesis.
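
    The two rounds of bait-gene expansion (121 to 548 to 694 genes in the study) follow a simple pattern: query a co-expression network with a gene set, add the direct neighbors, then re-query with the enlarged set. A minimal Python sketch of that pattern is given here; the network, gene names and function name are hypothetical, and the real co-expression database and its thresholds are not modeled.

```python
def expand_once(seed_genes, coexpression):
    """Return the seed genes plus every direct co-expression neighbor."""
    expanded = set(seed_genes)
    for gene in seed_genes:
        expanded.update(coexpression.get(gene, ()))
    return expanded


# Hypothetical undirected network stored as an adjacency mapping.
network = {
    "bait1": {"g1", "g2"},
    "bait2": {"g2"},
    "g1": {"bait1", "g3"},
    "g2": {"bait1", "bait2"},
    "g3": {"g1", "g4"},
    "g4": {"g3"},
}
baits = {"bait1", "bait2"}
round_one = expand_once(baits, network)      # baits plus direct neighbors
round_two = expand_once(round_one, network)  # re-query with the enlarged set
```

    Stopping after two rounds, as the abstract describes, keeps the candidate set close to the literature-supported baits; further rounds would progressively pull in more distant genes.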

  2. Mining Genes Involved in Insecticide Resistance of Liposcelis bostrychophila Badonnel by Transcriptome and Expression Profile Analysis

    PubMed Central

    Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun

    2013-01-01

    Background Recent studies indicate that infestations of psocids pose a new risk for global food security. Among psocid species, Liposcelis bostrychophila Badonnel has gained importance because of its parthenogenic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila is largely limited to genes identified through homology. Also, no transcriptome data relevant to psocids infection is available. Methodology and Principal Findings In this study, we generated a de novo assembly of the L. bostrychophila transcriptome using short-read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila, including egg, nymph and adult, were annotated with the non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. In addition, 16 transcripts were identified to contain target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. Conclusion The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate identification of genes involved in insecticide resistance and the design of new compounds for control of psocids. PMID:24278202

  3. Cytogenomic mapping and bioinformatic mining reveal interacting brain expressed genes for intellectual disability

    PubMed Central

    2014-01-01

    Background Microarray analysis has been used as the first-tier genetic testing to detect chromosomal imbalances and copy number variants (CNVs) for pediatric patients with intellectual and developmental disabilities (ID/DD). To further investigate the candidate genes and underlying dosage-sensitive mechanisms related to ID, cytogenomic mapping of critical regions and bioinformatic mining of candidate brain-expressed genes (BEGs) and their functional interactions were performed. Critical regions of chromosomal imbalances and pathogenic CNVs were mapped by subtracting known benign CNVs from the Databases of Genomic Variants (DGV) and extracting smallest overlap regions with cases from DatabasE of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources (DECIPHER). BEGs from these critical regions were revealed by functional annotation using Database for Annotation, Visualization, and Integrated Discovery (DAVID) and by tissue expression pattern from Uniprot. Cross-region interrelations and functional networks of the BEGs were analyzed using Gene Relationships Across Implicated Loci (GRAIL) and Ingenuity Pathway Analysis (IPA). Results Of the 1,354 patients analyzed by oligonucleotide array comparative genomic hybridization (aCGH), pathogenic abnormalities were detected in 176 patients including genomic disorders in 66 patients (37.5%), subtelomeric rearrangements in 45 patients (25.6%), interstitial imbalances in 33 patients (18.8%), chromosomal structural rearrangements in 17 patients (9.7%) and aneuploidies in 15 patients (8.5%). Subtractive and extractive mapping defined 82 disjointed critical regions from the detected abnormalities. A total of 461 BEGs was generated from 73 disjointed critical regions. Enrichment of central nervous system specific genes in these regions was noted. The number of BEGs increased with the size of the regions. A list of 108 candidate BEGs with significant cross region interrelation was identified by GRAIL and five

  4. Mining the Present: Reconstructing Progressive Education in an Era of Global Change

    ERIC Educational Resources Information Center

    Edwards, Laura A.; Greenwalt, Kyle A.

    2013-01-01

    This paper explores what might be seen as a paradox at the heart of the current push to "globalize" education: at a moment when administrators, especially in higher education, are seeking to globalize their programs (often for reasons having to do with increasing international competition and decreasing funding for education), global…

  5. Global gene expression profiles in developing soybean seeds.

    PubMed

    Asakura, Tomiko; Tamura, Tomoko; Terauchi, Kaede; Narikawa, Tomoyo; Yagasaki, Kazuhiro; Ishimaru, Yoshiro; Abe, Keiko

    2012-03-01

    The gene expression profiles in soybean (Glycine max L.) seeds at 4 stages of development, namely, pod, 2-mm bean, 5-mm bean, and full-size bean, were examined by DNA microarray analysis. The total genes of each sample were classified into 4 clusters based on stage of development. Gene expression was strictly controlled by seed size, which coincides with the development stage. First, stage-specific gene expression was examined. Many transcription factors were expressed in pod, 2-mm bean and 5-mm bean. In contrast, storage proteins were mainly expressed in full-size bean. Next, differentially expressed genes (DEGs) were extracted using the Rank products method of the Bioconductor software package. These DEGs were sorted into 8 groups using the hclust function according to gene expression patterns. Three of the groups across which the expression levels progressively increased included 100 genes, while 3 groups across which the levels decreased contained 47 genes. Storage proteins, seed-maturation proteins, some protease inhibitors, and the allergen Gly m Bd 28K were classified into the former groups. Lipoxygenase (LOX) family members were present in both sets of groups, indicating multi-functionality with different expression patterns. PMID:22245912

  6. The Influence of the Global Gene Expression Shift on Downstream Analyses

    PubMed Central

    Xu, Qifeng; Zhang, Xuegong

    2016-01-01

    The assumption that the total abundance of RNAs is roughly the same in different cells underlies most studies based on gene expression analyses. But experiments have shown that changes in the expression of some master regulators such as c-MYC can cause a global shift in the expression of almost all genes in some cell types like cancers. Such a shift violates this assumption and can cause wrong or biased conclusions for standard data analysis practices, such as detection of differentially expressed (DE) genes and molecular classification of tumors based on gene expression. Most existing gene expression data were generated without considering this possibility, and are therefore at risk of having produced unreliable results if such a global shift effect exists in the data. To evaluate this risk, we conducted a systematic study on the possible influence of the global gene expression shift effect on differential expression analysis and on molecular classification analysis. We collected data with a known global shift effect and also generated data to simulate different situations of the effect based on a wide collection of real gene expression data, and conducted comparative studies on representative existing methods. We observed that some DE analysis methods are more tolerant to the global shift while others are very sensitive to it. Classification accuracy is not sensitive to the shift and actually can benefit from it, but genes selected for the classification can be greatly affected. PMID:27092944
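
The effect described in this abstract can be illustrated with a small sketch. It is not the authors' method: the four-gene data set, the 2x amplification, and the helper name `normalize_to_total` are all hypothetical, chosen only to show how scaling each sample to the same total can misreport fold changes when most genes shift together.

```python
def normalize_to_total(counts):
    """Scale counts so they sum to 1 (assumes equal total RNA per cell)."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical toy data: expression of 4 genes in a control cell.
control = [100.0, 200.0, 300.0, 400.0]
# Assumed global shift: genes 0-2 are amplified 2x, gene 3 is unchanged.
shifted = [200.0, 400.0, 600.0, 400.0]

control_norm = normalize_to_total(control)
shifted_norm = normalize_to_total(shifted)

# True fold changes, computed on the raw (absolute) values.
raw_fold = [s / c for s, c in zip(shifted, control)]
# Apparent fold changes after total-abundance normalization.
norm_fold = [s / c for s, c in zip(shifted_norm, control_norm)]

for gene, (raw, norm) in enumerate(zip(raw_fold, norm_fold)):
    print(f"gene {gene}: true fold {raw:.3f}, apparent fold {norm:.3f}")
```

In this toy case the unchanged gene appears down-regulated and the amplified genes appear less amplified than they are, which is the kind of bias the abstract warns about for differential expression analysis.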

  7. Global Gene Expression Analysis of the Zoonotic Parasite Trichinella spiralis Revealed Novel Genes in Host Parasite Interaction

    PubMed Central

    Jiang, Ning; Wang, Jielin; Tang, Bin; Lu, Huijun; Peng, Shuai; Chang, Zhiguang; Tang, Yizhi; Yin, Jigang; Liu, Mingyuan; Tan, Yan; Chen, Qijun

    2012-01-01

    Background Trichinellosis is a typical food-borne zoonotic disease which is epidemic worldwide and the nematode Trichinella spiralis is the main pathogen. The life cycle of T. spiralis contains three developmental stages, i.e. adult worms, newborn larvae (newborn L1 larvae) and muscle larvae (infective L1 larvae). Stage-specific gene expression in the parasites has been investigated with various immunological and cDNA cloning approaches, whereas the genome-wide transcriptome and expression features of the parasite have been largely unknown. The availability of the genome sequence information of T. spiralis has made it possible to deeply dissect parasite biology in association with global gene expression and pathogenesis. Methodology and Principal Findings In this study, we analyzed the global gene expression patterns in the three developmental stages of T. spiralis using digital gene expression (DGE) analysis. Almost 15 million sequence tags were generated with the Illumina RNA-seq technology, producing expression data for more than 9,000 genes, covering 65% of the genome. The transcriptome analysis revealed thousands of differentially expressed genes within the genome, and importantly, a panel of genes encoding functional proteins associated with parasite invasion and immuno-modulation were identified. More than 45% of the genes were found to be transcribed from both strands, indicating the importance of RNA-mediated gene regulation in the development of the parasite. Further, based on gene ontological analysis, over 3000 genes were functionally categorized and biological pathways in the three life cycle stages were elucidated. Conclusions and Significance The global transcriptome of T. spiralis in three developmental stages has been profiled, and most gene activity in the genome was found to be developmentally regulated. Many metabolic and biological pathways have been revealed. 
The findings of the differential expression of several protein families facilitate

  8. Global analysis of patterns of gene expression during Drosophila embryogenesis

    PubMed Central

    Tomancak, Pavel; Berman, Benjamin P; Beaton, Amy; Weiszmann, Richard; Kwan, Elaine; Hartenstein, Volker; Celniker, Susan E; Rubin, Gerald M

    2007-01-01

    Background Cell and tissue specific gene expression is a defining feature of embryonic development in multi-cellular organisms. However, the range of gene expression patterns, the extent of the correlation of expression with function, and the classes of genes whose spatial expression are tightly regulated have been unclear due to the lack of an unbiased, genome-wide survey of gene expression patterns. Results We determined and documented embryonic expression patterns for 6,003 (44%) of the 13,659 protein-coding genes identified in the Drosophila melanogaster genome with over 70,000 images and controlled vocabulary annotations. Individual expression patterns are extraordinarily diverse, but by supplementing qualitative in situ hybridization data with quantitative microarray time-course data using a hybrid clustering strategy, we identify groups of genes with similar expression. Of 4,496 genes with detectable expression in the embryo, 2,549 (57%) fall into 10 clusters representing broad expression patterns. The remaining 1,947 (43%) genes fall into 29 clusters representing restricted expression, 20% patterned as early as blastoderm, with the majority restricted to differentiated cell types, such as epithelia, nervous system, or muscle. We investigate the relationship between expression clusters and known molecular and cellular-physiological functions. Conclusion Nearly 60% of the genes with detectable expression exhibit broad patterns reflecting quantitative rather than qualitative differences between tissues. The other 40% show tissue-restricted expression; the expression patterns of over 1,500 of these genes are documented here for the first time. Within each of these categories, we identified clusters of genes associated with particular cellular and developmental functions. PMID:17645804

  9. Biotic Stress Globally Down-Regulates Photosynthesis Genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Upon herbivore and pathogen attacks, plants switch from processes supporting growth and reproduction to defense by inducing a set of defense genes and down-regulating most of the nuclear encoded photosynthetic genes. To determine if this transcriptional response is universal we used transcriptome da...

  10. Identification of Novel Target Genes for Safer and More Specific Control of Root-Knot Nematodes from a Pan-Genome Mining

    PubMed Central

    Danchin, Etienne G. J.; Perfus-Barbeoch, Laetitia; Magliano, Marc; Rosso, Marie-Noëlle; Da Rocha, Martine; Da Silva, Corinne; Nottet, Nicolas; Labadie, Karine; Guy, Julie; Artiguenave, François; Abad, Pierre

    2013-01-01

    Root-knot nematodes are globally the most aggressive and damaging plant-parasitic nematodes. Chemical nematicides have so far constituted the most efficient control measures against these agricultural pests. Because of their toxicity for the environment and danger for human health, these nematicides have now been banned from use. Consequently, new and more specific control means, safe for the environment and human health, are urgently needed to avoid worldwide proliferation of these devastating plant-parasites. Mining the genomes of root-knot nematodes through an evolutionary and comparative genomics approach, we identified and analyzed 15,952 nematode genes conserved in genomes of plant-damaging species but absent from non target genomes of chordates, plants, annelids, insect pollinators and mollusks. Functional annotation of the corresponding proteins revealed a relative abundance of putative transcription factors in this parasite-specific set compared to whole proteomes of root-knot nematodes. This may point to important and specific regulators of genes involved in parasitism. Because these nematodes are known to secrete effector proteins in planta, essential for parasitism, we searched and identified 993 such effector-like proteins absent from non-target species. Aiming at identifying novel targets for the development of future control methods, we biologically tested the effect of inactivation of the corresponding genes through RNA interference. A total of 15 novel effector-like proteins and one putative transcription factor compatible with the design of siRNAs were present as non-redundant genes and had transcriptional support in the model root-knot nematode Meloidogyne incognita. Infestation assays with siRNA-treated M. incognita on tomato plants showed significant and reproducible reduction of the infestation for 12 of the 16 tested genes compared to control nematodes. 
These 12 novel genes, showing efficient reduction of parasitism when silenced, constitute

  11. Assessment of Local Biodiversity Loss in Uranium Mining-Tales And Its Projections On Global Scale

    NASA Astrophysics Data System (ADS)

    Sharshenova, D.; Zhamangulova, N.

    2015-12-01

    In Min-Kush, northern Kyrgyzstan, there are 8 mine tailings sites with an estimated 1,961,000 tonnes of industrial uranium. Local ecosystem services have declined rapidly. We analyzed a terrestrial assemblage database of the uranium mine tailings to quantify local biodiversity responses to land use and environmental changes. In the worst-affected habitats, species richness was reduced by 95.7%, total abundance by 60.9% and rarefaction-based richness by 72.5%. We estimate that, in the regional mountain ecosystem affected by this pressure, average within-sample richness was reduced by 17.01%, total abundance by 16.5% and rarefaction-based richness by 14.5%. Business-as-usual scenarios are widely practiced in the region and, moreover, due to economic constraints the country cannot afford any mitigation scenarios. We project that biodiversity loss and ecosystem service impairment will spread in the region through ground water, soil, plants, animals and microorganisms at the rate of 1 km/year. The entire Tian-Shan mountain chain will be in danger within the next 5-10 years. Our preliminary data show that local people living in this area have developed various forms of cancer, and the rate of premature death is as high as 40%. Strong international scientific and socio-economic partnership is needed to develop models and predictions.

  12. Cell types differ in global coordination of splicing and proportion of highly expressed genes.

    PubMed

    Trakhtenberg, Ephraim F; Pho, Nam; Holton, Kristina M; Chittenden, Thomas W; Goldberg, Jeffrey L; Dong, Lingsheng

    2016-01-01

    Balance in the transcriptome is regulated by coordinated synthesis and degradation of RNA molecules. Here we investigated whether mammalian cell types intrinsically differ in global coordination of gene splicing and expression levels. We analyzed RNA-seq transcriptome profiles of 8 different purified mouse cell types. We found that different cell types vary in proportion of highly expressed genes and the number of alternatively spliced transcripts expressed per gene, and that the cell types that express more variants of alternatively spliced transcripts per gene are those that have higher proportion of highly expressed genes. Cell types segregated into two clusters based on high or low proportion of highly expressed genes. Biological functions involved in negative regulation of gene expression were enriched in the group of cell types with low proportion of highly expressed genes, and biological functions involved in regulation of transcription and RNA splicing were enriched in the group of cell types with high proportion of highly expressed genes. Our findings show that cell types differ in proportion of highly expressed genes and the number of alternatively spliced transcripts expressed per gene, which represent distinct properties of the transcriptome and may reflect intrinsic differences in global coordination of synthesis, splicing, and degradation of RNA molecules. PMID:27577089

  13. Cell types differ in global coordination of splicing and proportion of highly expressed genes

    PubMed Central

    Trakhtenberg, Ephraim F.; Pho, Nam; Holton, Kristina M.; Chittenden, Thomas W.; Goldberg, Jeffrey L.; Dong, Lingsheng

    2016-01-01

    Balance in the transcriptome is regulated by coordinated synthesis and degradation of RNA molecules. Here we investigated whether mammalian cell types intrinsically differ in global coordination of gene splicing and expression levels. We analyzed RNA-seq transcriptome profiles of 8 different purified mouse cell types. We found that different cell types vary in proportion of highly expressed genes and the number of alternatively spliced transcripts expressed per gene, and that the cell types that express more variants of alternatively spliced transcripts per gene are those that have higher proportion of highly expressed genes. Cell types segregated into two clusters based on high or low proportion of highly expressed genes. Biological functions involved in negative regulation of gene expression were enriched in the group of cell types with low proportion of highly expressed genes, and biological functions involved in regulation of transcription and RNA splicing were enriched in the group of cell types with high proportion of highly expressed genes. Our findings show that cell types differ in proportion of highly expressed genes and the number of alternatively spliced transcripts expressed per gene, which represent distinct properties of the transcriptome and may reflect intrinsic differences in global coordination of synthesis, splicing, and degradation of RNA molecules. PMID:27577089

  14. Molecular Networking and Pattern-Based Genome Mining Improves discovery of biosynthetic gene clusters and their products from Salinispora species

    PubMed Central

    Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; Sarkar, Anindita; Li, Jie; Ziemert, Nadine; Wang, Mingxun; Bandeira, Nuno; Moore, Bradley S.; Dorrestein, Pieter C.; Jensen, Paul R.

    2015-01-01

    Summary Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. Here we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. These efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches. PMID:25865308

  15. A GLOBAL METHANE EMISSIONS PROGRAM FOR LANDFILLS, COAL MINES, AND NATURAL GAS SYSTEMS

    EPA Science Inventory

    The paper gives the scope and methodology of EPA/AEERL's methane emissions studies and discloses data accumulated thus far in the program. Anthropogenic methane emissions are a principal focus in AEERL's global climate research program, including three major sources: municipal so...

  16. Global demand for rare earth resources and strategies for green mining

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Rare earth elements (REEs) are essential raw materials for the emerging green (low-carbon) energy technologies and ‘smart’ electronic devices. Global REE demand is slated to grow at a compound annual rate of 5% by 2020. Such high growth rate would require a steady supply base of REEs in the long ru...

  17. SNP Mining in Functional Genes from Nonmodel Species by Next-Generation Sequencing: A Case of Flowering, Pre-Harvest Sprouting, and Dehydration Resistant Genes in Wheat.

    PubMed

    Chen, Zhong-Xu; Deng, Mei; Wang, Ji-Rui

    2016-01-01

    As plenty of nonmodel plants are without genomic sequences, the combination of molecular technologies and the next generation sequencing (NGS) platform has led to a new approach to study the genetic variations of these plants. Software such as GATK, SOAPsnp, samtools, and others is often used to deal with the NGS data. In this study, BLAST was applied to call SNPs from the mixed sequence data of 16 functional genes of polyploid wheat. In total 1.2 million reads were obtained, with an average of 7500 reads per gene. To get accurate information, 390,992 pair reads were successfully assembled before aligning to those functional genes. Standalone BLAST tools were used to map the assembled sequences to the respective functional genes. Polynomial fitting was applied to find the suitable minor allele frequency (MAF) threshold at 6% for assembled reads of each functional gene. SNP accuracy from assembled reads, pretrimmed reads, and original reads was compared, which showed that SNPs mined from the assembled reads were more reliable than the others. It was also demonstrated that NGS sequencing of mixed samples followed by analysis with BLAST was an effective, low-cost, and accurate way to mine SNPs for nonmodel species. Assembled reads and a polynomial fitting threshold were recommended for more accurate SNP targeting. PMID:27051662

  18. SNP Mining in Functional Genes from Nonmodel Species by Next-Generation Sequencing: A Case of Flowering, Pre-Harvest Sprouting, and Dehydration Resistant Genes in Wheat

    PubMed Central

    Chen, Zhong-Xu; Deng, Mei; Wang, Ji-Rui

    2016-01-01

    As plenty of nonmodel plants are without genomic sequences, the combination of molecular technologies and the next generation sequencing (NGS) platform has led to a new approach to study the genetic variations of these plants. Software such as GATK, SOAPsnp, samtools, and others is often used to deal with the NGS data. In this study, BLAST was applied to call SNPs from the mixed sequence data of 16 functional genes of polyploid wheat. In total 1.2 million reads were obtained, with an average of 7500 reads per gene. To get accurate information, 390,992 pair reads were successfully assembled before aligning to those functional genes. Standalone BLAST tools were used to map the assembled sequences to the respective functional genes. Polynomial fitting was applied to find the suitable minor allele frequency (MAF) threshold at 6% for assembled reads of each functional gene. SNP accuracy from assembled reads, pretrimmed reads, and original reads was compared, which showed that SNPs mined from the assembled reads were more reliable than the others. It was also demonstrated that NGS sequencing of mixed samples followed by analysis with BLAST was an effective, low-cost, and accurate way to mine SNPs for nonmodel species. Assembled reads and a polynomial fitting threshold were recommended for more accurate SNP targeting. PMID:27051662

  19. The use and re-use of unsustainably mined groundwater: A global budget

    NASA Astrophysics Data System (ADS)

    Grogan, D. S.; Prousevitch, A.; Wisser, D.; Lammers, R. B.; Frolking, S. E.

    2015-12-01

    Many of the world's major groundwater aquifers are rapidly depleting due to unsustainable groundwater pumping, while demand for food production - and therefore demand for irrigation water - is increasing. While it is likely that groundwater users will be impacted by the future's inevitable reduction in groundwater availability, there is a major gap in our understanding of potential impacts downstream of pumping sites. Due to inefficiencies in irrigation systems, significant amounts of abstracted groundwater become runoff, entering surface waters and flowing downstream to be re-abstracted and used again. In this study, we use a gridded water balance model to calculate the amount of unsustainably pumped groundwater that enters surface water systems by way of irrigation runoff, and quantify the additional irrigation water supplied by the re-use of this water. We assess the global budget of unsustainable groundwater sources and sinks, including downstream re-use, groundwater recharge, and flow to the oceans. Globally, we find that 80% of unsustainable groundwater is re-abstracted for irrigation either downstream or locally from groundwater recharge. This re-abstracted water contributes the water equivalent needed to irrigate 200,000 km² of cropland globally. Including irrigation runoff reuse in an assessment of irrigation efficiency, we see that the traditional concept of irrigation efficiency (net irrigation/gross irrigation) significantly overestimates water "waste". We define a basin efficiency for unsustainable groundwater use that includes re-use, and see that while global irrigation efficiency is often estimated at 50%, global average unsustainable water use efficiency is > 60%. Losing this re-use resource by increasing irrigation efficiency does little to alleviate unsustainable groundwater demands.

  20. TCMGeneDIT: a database for associated traditional Chinese medicine, gene and disease information using text mining

    PubMed Central

    Fang, Yu-Ching; Huang, Hsuan-Cheng; Chen, Hsin-Hsi; Juan, Hsueh-Fen

    2008-01-01

    Background Traditional Chinese Medicine (TCM), a complementary and alternative medical system in Western countries, has been used to treat various diseases over thousands of years in East Asian countries. In recent years, many herbal medicines were found to exhibit a variety of effects through regulating a wide range of gene expressions or protein activities. As available TCM data continue to accumulate rapidly, there is an urgent need to explore these resources systematically, so as to effectively utilize the large volume of literature. Methods TCM, gene, disease, biological pathway and protein-protein interaction information were collected from public databases. For association discovery, the TCM names, gene names, disease names, TCM ingredients and effects were used to annotate the literature corpus obtained from PubMed. The concept to mine entity associations was based on hypothesis testing and collocation analysis. The annotated corpus was processed with natural language processing tools and rule-based approaches were applied to the sentences for extracting the relations between TCM effecters and effects. Results We developed a database, TCMGeneDIT, to provide association information about TCMs, genes, diseases, TCM effects and TCM ingredients mined from the vast amount of biomedical literature. Integrated protein-protein interaction and biological pathways information are also available for exploring the regulations of genes associated with TCM curative effects. In addition, the transitive relationships among genes, TCMs and diseases could be inferred through the shared intermediates. Furthermore, TCMGeneDIT is useful in understanding the possible therapeutic mechanisms of TCMs via gene regulations and deducing synergistic or antagonistic contributions of the prescription components to the overall therapeutic effects. The database is now available at . Conclusion TCMGeneDIT is a unique database that offers diverse association information on TCMs. This

  1. Mining Metatranscriptomic Data of a Cyanobacterial Bloom for Patterns of Secondary Metabolism Gene Expression

    NASA Astrophysics Data System (ADS)

    Penn, K.; Wang, J.; Thompson, J. R.

    2012-12-01

    The secondary metabolism of bacterial cells produces small molecules that can have both medicinal properties and toxigenic effects. This study focuses on mining metatranscriptomes from a tropical eutrophic water reservoir in Singapore experiencing a cyanobacterial Harmful Algal Bloom dominated by Microcystis, to identify the types of secondary metabolite genes being expressed and by what taxa. A phylogenomic approach as implemented in the online tool Natural Product Domain Seeker (NaPDoS) was used. NaPDoS was recently developed to classify ketosynthase and condensation domains from polyketide synthases and non-ribosomal peptide synthetases, respectively, to provide insight into potential types of pathway products. Water samples from the reservoir were collected six times over a day/night cycle. Total RNA was extracted and subjected to ribosomal depletion followed by cDNA synthesis and next-generation Illumina DNA sequencing, generating 493,468 to 678,064 post-quality-control reads of 95-101 base pairs per sample. Evidence for expression of PKS- and NRPS-type genes, based on identification of ketosynthase (KS) and condensation domains, is present at all time points. KS domains fall into two main phylogenetic groups, type I and type II; within the type II group are domains for fatty acid biosynthesis (fab), which is considered a part of primary metabolism. Type I KS domains are part of the classic PKS natural product biosynthetic genes that make compounds such as antibiotics and toxins such as microcystin. In the combined reservoir samples, 2849 KS domains were detected; of these, 1141 were likely from fatty acid biosynthesis and 1708 were related to secondary metabolism type KS domains. The most abundant KS domains (485) besides the fab genes are closely related to a KS domain that is not currently experimentally linked to a known secondary metabolite, but the domain is found in four Microcystis genomes along with two other species of cyanobacteria. 
The three

  2. Isolation and characterisation of mineral-oxidising "Acidibacillus" spp. from mine sites and geothermal environments in different global locations.

    PubMed

    Holanda, Roseanne; Hedrich, Sabrina; Ňancucheo, Ivan; Oliveira, Guilherme; Grail, Barry M; Johnson, D Barrie

    2016-09-01

    Eight strains of acidophilic bacteria, isolated from mine-impacted and geothermal sites from different parts of the world, were shown to form a distinct clade (proposed genus "Acidibacillus") within the phylum Firmicutes, well separated from the acidophilic genera Sulfobacillus and Alicyclobacillus. Two of the strains (both isolated from sites in Yellowstone National Park, USA) were moderate thermophiles that oxidised both ferrous iron and elemental sulphur, while the other six were mesophiles that also oxidised ferrous iron, but not sulphur. All eight isolates reduced ferric iron to varying degrees. The two groups shared <95% similarity of their 16S rRNA genes and were therefore considered to be distinct species: "Acidibacillus sulfuroxidans" (moderately thermophilic isolates) and "Acidibacillus ferrooxidans" (mesophilic isolates). Both species were obligate heterotrophs; none of the eight strains grew in the absence of organic carbon. "Acidibacillus" spp. were generally highly tolerant of elevated concentrations of cationic transition metals, though "A. sulfuroxidans" strains were more sensitive to some (e.g. nickel and zinc) than those of "A. ferrooxidans". Initial annotation of the genomes of two strains of "A. ferrooxidans" revealed the presence of genes (cbbL) involved in the RuBisCO pathway for CO2 assimilation and iron oxidation (rus), though with relatively low sequence identities. PMID:27154030

  3. Constructing a molecular interaction network for thyroid cancer via large-scale text mining of gene and pathway events

    PubMed Central

    2015-01-01

    Background Biomedical studies need assistance from automated tools and easily accessible data to address the problem of the rapidly accumulating literature. Text-mining tools and curated databases have been developed to address such needs and they can be applied to improve the understanding of molecular pathogenesis of complex diseases like thyroid cancer. Results We have developed a system, PWTEES, which extracts pathway interactions from the literature utilizing an existing event extraction tool (TEES) and pathway named entity recognition (PathNER). We then applied the system on a thyroid cancer corpus and systematically extracted molecular interactions involving either genes or pathways. With the extracted information, we constructed a molecular interaction network taking genes and pathways as nodes. Using curated pathway information and network topological analyses, we highlight key genes and pathways involved in thyroid carcinogenesis. Conclusions Mining events involving genes and pathways from the literature and integrating curated pathway knowledge can help improve the understanding of molecular interactions of complex diseases. The system developed for this study can be applied in studies other than thyroid cancer. The source code is freely available online at https://github.com/chengkun-wu/PWTEES. PMID:26679379

  4. Wheat gene bank accessions as a source of new alleles of the powdery mildew resistance gene Pm3: a large scale allele mining project

    PubMed Central

    2010-01-01

    Background In the last hundred years, the development of improved wheat cultivars has led to the replacement of landraces and traditional varieties by modern cultivars. This has resulted in a decline in the genetic diversity of agriculturally used wheat. However, the diversity lost in the elite material is somewhat preserved in crop gene banks. Therefore, the gene bank accessions provide the basis for genetic improvement of crops for specific traits and represent rich sources of novel allelic variation. Results We have undertaken large scale molecular allele mining to isolate new alleles of the powdery mildew resistance gene Pm3 from wheat gene bank accessions. The search for new Pm3 alleles was carried out on a geographically diverse set of 733 wheat accessions originating from 20 countries. Pm3 specific molecular tools as well as classical pathogenicity tests were used to characterize the accessions. Two new functional Pm3 alleles were identified out of the eight newly cloned Pm3 sequences. These new resistance alleles were isolated from accessions from China and Nepal. Thus, the repertoire of functional Pm3 alleles now includes 17 genes, making it one of the largest allelic series of plant resistance genes. The combined information on resistant and susceptible Pm3 sequences will allow the study of the molecular function and specificity of functional Pm3 alleles. Conclusions This study demonstrates that molecular allele mining on geographically defined accessions is a useful strategy to rapidly characterize the diversity of gene bank accessions at a specific genetic locus of agronomical importance. The identified wheat accessions with new resistance specificities can be used for marker-assisted transfer of the Pm3 alleles to modern wheat lines. PMID:20470444

  5. Re-engineering cellular physiology by rewiring high-level global regulatory genes

    PubMed Central

    Fitzgerald, Stephen; Dillon, Shane C.; Chao, Tzu-Chiao; Wiencko, Heather L.; Hokamp, Karsten; Cameron, Andrew D. S.; Dorman, Charles J.

    2015-01-01

    Knowledge of global regulatory networks has been exploited to rewire the gene control programmes of the model bacterium Salmonella enterica serovar Typhimurium. The product is an organism with competitive fitness that is superior to that of the wild type but tuneable under specific growth conditions. The paralogous hns and stpA global regulatory genes are located in distinct regions of the chromosome and control hundreds of target genes, many of which contribute to stress resistance. The locations of the hns and stpA open reading frames were exchanged reciprocally, each acquiring the transcription control signals of the other. The new strain had none of the compensatory mutations normally associated with alterations to hns expression in Salmonella; instead it displayed rescheduled expression of the stress and stationary phase sigma factor RpoS and its regulon. Thus the expression patterns of global regulators can be adjusted artificially to manipulate microbial physiology, creating a new and resilient organism. PMID:26631971

  6. Re-engineering cellular physiology by rewiring high-level global regulatory genes.

    PubMed

    Fitzgerald, Stephen; Dillon, Shane C; Chao, Tzu-Chiao; Wiencko, Heather L; Hokamp, Karsten; Cameron, Andrew D S; Dorman, Charles J

    2015-01-01

    Knowledge of global regulatory networks has been exploited to rewire the gene control programmes of the model bacterium Salmonella enterica serovar Typhimurium. The product is an organism with competitive fitness that is superior to that of the wild type but tuneable under specific growth conditions. The paralogous hns and stpA global regulatory genes are located in distinct regions of the chromosome and control hundreds of target genes, many of which contribute to stress resistance. The locations of the hns and stpA open reading frames were exchanged reciprocally, each acquiring the transcription control signals of the other. The new strain had none of the compensatory mutations normally associated with alterations to hns expression in Salmonella; instead it displayed rescheduled expression of the stress and stationary phase sigma factor RpoS and its regulon. Thus the expression patterns of global regulators can be adjusted artificially to manipulate microbial physiology, creating a new and resilient organism. PMID:26631971

  7. Global and gene-specific DNA methylation pattern discriminates cholecystitis from gallbladder cancer patients in Chile

    PubMed Central

    Kagohara, Luciane Tsukamoto; Schussel, Juliana L; Subbannayya, Tejaswini; Sahasrabuddhe, Nandini; Lebron, Cynthia; Brait, Mariana; Maldonado, Leonel; Valle, Blanca L; Pirini, Francesca; Jahuira, Martha; Lopez, Jaime; Letelier, Pablo; Brebi-Mieville, Priscilla; Ili, Carmen; Pandey, Akhilesh; Chatterjee, Aditi; Sidransky, David; Guerrero-Preston, Rafael

    2015-01-01

    Aim The aim of the study was to evaluate the use of global and gene-specific DNA methylation changes as potential biomarkers for gallbladder cancer (GBC) in a cohort from Chile. Material & methods DNA methylation was analyzed through an ELISA-based technique and quantitative methylation-specific PCR. Results Global DNA Methylation Index (p = 0.02) and promoter methylation of SSBP2 (p = 0.01) and ESR1 (p = 0.05) were significantly different in GBC when compared with cholecystitis. Receiver operating characteristic (ROC) curve analysis revealed that promoter methylation of APC, CDKN2A, ESR1, PGP9.5 and SSBP2, together with the Global DNA Methylation Index, had 71% sensitivity, 95% specificity, a 0.97 area under the curve and a positive predictive value of 90%. Conclusion Global and gene-specific DNA methylation may be useful biomarkers for GBC clinical assessment. PMID:25066711
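
    For readers less familiar with the diagnostic measures quoted in this record, the sketch below shows how sensitivity, specificity and positive predictive value are derived from a confusion matrix. The function name and the counts in the usage line are hypothetical illustrations; they are not the study's data and are not meant to reproduce its reported figures.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value).

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives for a binary test.
    """
    sensitivity = tp / (tp + fn)   # fraction of diseased cases detected
    specificity = tn / (tn + fp)   # fraction of non-diseased cases cleared
    ppv = tp / (tp + fp)           # fraction of positive calls that are correct
    return sensitivity, specificity, ppv


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec, ppv = diagnostic_metrics(tp=15, fp=1, tn=19, fn=5)
print(sens, spec, ppv)
```

    Note that positive predictive value, unlike sensitivity and specificity, depends on how common the disease is in the cohort being tested.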

  8. Growth-rate dependent global effects on gene expression in bacteria

    PubMed Central

    Klumpp, Stefan; Zhang, Zhongge; Hwa, Terence

    2010-01-01

    Summary Bacterial gene expression depends not only on specific regulations but also directly on bacterial growth, because important global parameters such as the abundance of RNA polymerases and ribosomes are all growth-rate dependent. Understanding these global effects is necessary for a quantitative understanding of gene regulation and for the robust design of synthetic genetic circuits. The observed growth-rate dependence of constitutive gene expression can be explained by a simple model using the measured growth-rate dependence of the relevant cellular parameters. More complex growth dependences for genetic circuits involving activators, repressors and feedback control were analyzed, and salient features were verified experimentally using synthetic circuits. The results suggest a novel feedback mechanism mediated by general growth-dependent effects and not requiring explicit gene regulation, if the expressed protein affects cell growth. This mechanism can lead to growth bistability and promote the acquisition of important physiological functions such as antibiotic resistance and tolerance (persistence). PMID:20064380

  9. Global deceleration of gene evolution following recent genome hybridizations in fungi.

    PubMed

    Sriswasdi, Sira; Takashima, Masako; Manabe, Ri-Ichiroh; Ohkuma, Moriya; Sugita, Takashi; Iwasaki, Wataru

    2016-08-01

    Polyploidization events such as whole-genome duplication and inter-species hybridization are major evolutionary forces that shape genomes. Although long-term effects of polyploidization have been well-characterized, early molecular evolutionary consequences of polyploidization remain largely unexplored. Here, we report the discovery of two recent and independent genome hybridizations within a single clade of a fungal genus, Trichosporon. Comparative genomic analyses revealed that redundant genes are experiencing decelerations, not accelerations, of evolutionary rates. We identified a relationship between gene conversion and decelerated evolution suggesting that gene conversion may improve the genome stability of young hybrids by restricting gene functional divergences. Furthermore, we detected large-scale gene losses from transcriptional and translational machineries that indicate a global compensatory mechanism against increased gene dosages. Overall, our findings illustrate counteracting mechanisms during an early phase of post-genome hybridization and fill a critical gap in existing theories on genome evolution. PMID:27440871

  10. Diversity and Distribution of Arsenic-Related Genes Along a Pollution Gradient in a River Affected by Acid Mine Drainage.

    PubMed

    Desoeuvre, Angélique; Casiot, Corinne; Héry, Marina

    2016-04-01

    Some microorganisms have the capacity to interact with arsenic through resistance or metabolic processes. Their activities contribute to the fate of arsenic in contaminated ecosystems. To investigate the genetic potential involved in these interactions in a zone of confluence between a pristine river and an arsenic-rich acid mine drainage, we explored the diversity of marker genes for arsenic resistance (arsB, acr3.1, acr3.2), methylation (arsM), and respiration (arrA) in waters characterized by contrasted concentrations of metallic elements (including arsenic) and pH. While arsB-carrying bacteria were representative of pristine waters, Acr3 proteins may confer to generalist bacteria the capacity to cope with an increase of contamination. arsM showed an unexpected wide distribution, suggesting biomethylation may impact arsenic fate in contaminated aquatic ecosystems. arrA gene survey suggested that only specialist microorganisms (adapted to moderately or extremely contaminated environments) have the capacity to respire arsenate. Their distribution, modulated by water chemistry, attested the specialist nature of the arsenate respirers. This is the first report of the impact of an acid mine drainage on the diversity and distribution of arsenic (As)-related genes in river waters. The fate of arsenic in this ecosystem is probably under the influence of the abundance and activity of specific microbial populations involved in different As biotransformations. PMID:26603631

  11. Application of global SST and SLP data for drought forecasting on Tehran plain using data mining and ANFIS techniques

    NASA Astrophysics Data System (ADS)

    Farokhnia, Ashkan; Morid, Saeed; Byun, Hi-Ryong

    2011-05-01

    Drought forecasting is a critical component of drought risk management. Identification of effective predictors is a major component of forecasting models. Sea surface temperature (SST) and sea level pressure (SLP) are relevant predictors for short- to long-term drought forecasts. However, these datasets are captured globally within a cell-wise network. This paper describes an approach to locate the most effective cells of the SST and SLP datasets using data mining. They are then applied as input to an adaptive neurofuzzy inference system (ANFIS) model to forecast possible droughts 3, 6, and 9 months in advance. Tehran plain was selected as the study area, and drought events are designated using the effective drought index (EDI). In another treatment, past values of the EDI time series were introduced to the ANFIS and the results compared with the previous findings. It was shown that R² values were higher for all cases applying the SST/SLP datasets. Additionally, the performance of SST/SLP datasets and the ANFIS model was assessed according to "drought" or "wet" classification, and it was concluded that more than 90% of the time the ANFIS model detected the drought status correctly or with only a one class error.

  12. Banking biological collections: data warehousing, data mining, and data dilemmas in genomics and global health policy.

    PubMed

    Blatt, R J R

    2000-01-01

    While DNA databases may offer the opportunity to (1) assess population-based prevalence of specific genes and variants, (2) simplify the search for molecular markers, (3) improve targeted drug discovery and development for disease management, (4) refine strategies for disease prevention, and (5) provide the data necessary for evidence-based decision-making, serious scientific and social questions remain. Whether samples are identified, coded, or anonymous, biological banking raises profound ethical and legal issues pertaining to access, informed consent, privacy and confidentiality of genomic information, civil liberties, patenting, and proprietary rights. This paper provides an overview of key policy issues and questions pertaining to biological banking, with a focus on developments in specimen collection, transnational distribution, and public health and academic-industry research alliances. It highlights the challenges posed by the commercialization of genomics, and proposes the need for harmonization of biological banking policies. PMID:11878344

  13. Polymorphisms in Genes Encoding Potential Mercury Transporters and Urine Mercury Concentrations in Populations Exposed to Mercury Vapor from Gold Mining

    PubMed Central

    Ameer, Shegufta; Bernaudat, Ludovic; Drasch, Gustav; Baeuml, Jennifer; Skerfving, Staffan; Bose-O’Reilly, Stephan; Broberg, Karin

    2012-01-01

    Background: Elemental mercury (Hg⁰) is widely used in small-scale gold mining. Persons working or living in mining areas have high urinary concentrations of Hg (U-Hg). Differences in genes encoding potential Hg-transporters may affect uptake and elimination of Hg. Objective: We aimed to identify single nucleotide polymorphisms (SNPs) in Hg-transporter genes that modify U-Hg. Methods: Men and women (1,017) from Indonesia, the Philippines, Tanzania, and Zimbabwe were classified either as controls (no Hg exposure from gold mining) or as having low (living in a gold-mining area) or high exposure (working as gold miners). U-Hg was analyzed by cold-vapor atomic absorption spectrometry. Eighteen SNPs in eight Hg-transporter genes were analyzed. Results: U-Hg concentrations were higher among ABCC2/MRP2 rs1885301 A–allele carriers than among GG homozygotes in all populations, though differences were not statistically significant in most cases. MRP2 SNPs showed particularly strong associations with U-Hg in the subgroup with highest exposure (miners in Zimbabwe), whereas rs1885301 A–allele carriers had higher U-Hg than GG homozygotes [geometric mean (GM): 36.4 µg/g creatinine vs. 21.9; p = 0.027], rs2273697 GG homozygotes had higher U-Hg than A–allele carriers (GM: 37.4 vs. 16.7; p = 0.001), and rs717620 A–allele carriers had higher U-Hg than GG homozygotes (GM: 83 vs. 28; p = 0.084). The SLC7A5/LAT1 rs33916661 GG genotype was associated with higher U-Hg in all populations (statistically significant for all Tanzanians combined). SNPs in SLC22A6/OAT1 (rs4149170) and SLC22A8/OAT3 (rs4149182) were associated with U-Hg mainly in the Tanzanian study groups. Conclusions: SNPs in putative Hg-transporter genes may influence U-Hg concentrations. PMID:23052037

  14. tagtog: interactive and text-mining-assisted annotation of gene mentions in PLOS full-text articles

    PubMed Central

    Cejuela, Juan Miguel; McQuilton, Peter; Ponting, Laura; Marygold, Steven J.; Stefancsik, Raymund; Millburn, Gillian H.; Rost, Burkhard

    2014-01-01

    The breadth and depth of biomedical literature are increasing year upon year. To keep abreast of these increases, FlyBase, a database for Drosophila genomic and genetic information, is constantly exploring new ways to mine the published literature to increase the efficiency and accuracy of manual curation and to automate some aspects, such as triaging and entity extraction. Toward this end, we present the ‘tagtog’ system, a web-based annotation framework that can be used to mark up biological entities (such as genes) and concepts (such as Gene Ontology terms) in full-text articles. tagtog leverages manual user annotation in combination with automatic machine-learned annotation to provide accurate identification of gene symbols and gene names. As part of the BioCreative IV Interactive Annotation Task, FlyBase has used tagtog to identify and extract mentions of Drosophila melanogaster gene symbols and names in full-text biomedical articles from the PLOS stable of journals. We show here the results of three experiments with different sized corpora and assess gene recognition performance and curation speed. We conclude that tagtog-named entity recognition improves with a larger corpus and that tagtog-assisted curation is quicker than manual curation. Database URL: www.tagtog.net, www.flybase.org PMID:24715220

  15. Gene expression profiling in Daphnia magna, part II: validation of a copper specific gene expression signature with effluent from two copper mines in California.

    PubMed

    Poynton, Helen C; Zuzow, Rick; Loguinov, Alexandre V; Perkins, Edward J; Vulpe, Chris D

    2008-08-15

    Genomic technologies show great potential for classifying disease states and toxicological impacts from exposure to chemicals into functional categories. In environmental monitoring, the ability to classify field samples and predict the pollutants present in these samples could contribute to monitoring efforts and the diagnosis of contaminated sites. Using gene expression analysis, we challenged our custom Daphnia magna cDNA microarray to determine the presence of a specific metal toxicant in blinded field samples collected from two copper mines in California. We compared the gene expression profiles from our field samples to previously established expression profiles for Cu, Cd, and Zn. The expression profiles from the Cu-containing field samples clustered with the laboratory-exposed Cu-specific gene expression profiles and included genes previously identified as copper biomarkers, verifying that gene expression analysis can predict environmental exposure to a specific pollutant. In addition, our study revealed that upstream field samples containing undetectable levels of Cu caused the differential expression of only a few genes, lending support for the concept of a no observed transcriptional effect level (NOTEL). If confirmed by further studies, the NOTEL may play an important role in discriminating polluted and nonpolluted sites in future monitoring efforts. PMID:18767696

  16. Mobile genes in the human microbiome are structured from global to individual scales.

    PubMed

    Brito, I L; Yilmaz, S; Huang, K; Xu, L; Jupiter, S D; Jenkins, A P; Naisilisili, W; Tamminen, M; Smillie, C S; Wortman, J R; Birren, B W; Xavier, R J; Blainey, P C; Singh, A K; Gevers, D; Alm, E J

    2016-07-21

    Recent work has underscored the importance of the microbiome in human health, and has largely attributed differences in phenotype to differences in the species present among individuals. However, mobile genes can confer profoundly different phenotypes on different strains of the same species. Little is known about the function and distribution of mobile genes in the human microbiome, and in particular whether the gene pool is globally homogenous or constrained by human population structure. Here, we investigate this question by comparing the mobile genes found in the microbiomes of 81 metropolitan North Americans with those of 172 agrarian Fiji islanders using a combination of single-cell genomics and metagenomics. We find large differences in mobile gene content between the Fijian and North American microbiomes, with functional variation that mirrors known dietary differences such as the excess of plant-based starch degradation genes found in Fijian individuals. Notably, we also observed differences between the mobile gene pools of neighbouring Fijian villages, even though microbiome composition across villages is similar. Finally, we observe high rates of recombination leading to individual-specific mobile elements, suggesting that the abundance of some genes may reflect environmental selection rather than dispersal limitation. Together, these data support the hypothesis that human activities and behaviours provide selective pressures that shape mobile gene pools, and that acquisition of mobile genes is important for colonizing specific human populations. PMID:27409808

  17. The facilitating roles and uses of gene banks in addressing the global plan of action

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Contractions of livestock genetic resources are occurring as countries strive to meet increasing demand for livestock products. The Global Plan of Action’s (GPA) Strategic Priority Area 3 – Conservation, calls for governments to establish gene banks for ex-situ cryogenic conservation. Establishment ...

  18. Benzo[a]pyrene decreases global and gene specific DNA methylation during zebrafish development

    Technology Transfer Automated Retrieval System (TEKTRAN)

    DNA methylation is important for gene regulation and is vulnerable to early-life exposure to environmental contaminants. We found that direct waterborne benzo[a]pyrene (BaP) exposure at 24 µg/L from 2.5 to 96 hours post fertilization (hpf) to zebrafish embryos significantly decreased global cytosine...

  19. Global transcription network incorporating distal regulator binding reveals selective cooperation of cancer drivers and risk genes

    PubMed Central

    Kim, Kwoneel; Yang, Woojin; Lee, Kang Seon; Bang, Hyoeun; Jang, Kiwon; Kim, Sang Cheol; Yang, Jin Ok; Park, Seongjin; Park, Kiejung; Choi, Jung Kyoon

    2015-01-01

    Global network modeling of distal regulatory interactions is essential in understanding the overall architecture of gene expression programs. Here, we developed a Bayesian probabilistic model and computational method for global causal network construction with breast cancer as a model. Whereas physical regulator binding was well supported by gene expression causality in general, distal elements in intragenic regions or loci distant from the target gene exhibited particularly strong functional effects. Modeling the action of long-range enhancers was critical in recovering true biological interactions with increased coverage and specificity overall and unraveling regulatory complexity underlying tumor subclasses and drug responses in particular. Transcriptional cancer drivers and risk genes were discovered based on the network analysis of somatic and genetic cancer-related DNA variants. Notably, we observed that the risk genes were functionally downstream of the cancer drivers and were selectively susceptible to network perturbation by tumorigenic changes in their upstream drivers. Furthermore, cancer risk alleles tended to increase the susceptibility of the transcription of their associated genes. These findings suggest that transcriptional cancer drivers selectively induce a combinatorial misregulation of downstream risk genes, and that genetic risk factors, mostly residing in distal regulatory regions, increase transcriptional susceptibility to upstream cancer-driving somatic changes. PMID:26001967

  20. A Global Survey of Gene Regulation during Cold Acclimation in Arabidopsis thaliana

    PubMed Central

    Hannah, Matthew A; Heyer, Arnd G; Hincha, Dirk K

    2005-01-01

    Many temperate plant species such as Arabidopsis thaliana are able to increase their freezing tolerance when exposed to low, nonfreezing temperatures in a process called cold acclimation. This process is accompanied by complex changes in gene expression. Previous studies have investigated these changes but have mainly focused on individual or small groups of genes. We present a comprehensive statistical analysis of the genome-wide changes of gene expression in response to 14 d of cold acclimation in Arabidopsis, and provide a large-scale validation of these data by comparing datasets obtained for the Affymetrix ATH1 Genechip and MWG 50-mer oligonucleotide whole-genome microarrays. We combine these datasets with existing published and publicly available data investigating Arabidopsis gene expression in response to low temperature. All data are integrated into a database detailing the cold responsiveness of 22,043 genes as a function of time of exposure at low temperature. We concentrate our functional analysis on global changes marking relevant pathways or functional groups of genes. These analyses provide a statistical basis for many previously reported changes, identify so far unreported changes, and show which processes predominate during different times of cold acclimation. This approach offers the fullest characterization of global changes in gene expression in response to low temperature available to date. PMID:16121258

  1. A Spatio-temporal Data Mining Approach to Global scale Burned Area Monitoring

    NASA Astrophysics Data System (ADS)

    Mithal, V.; Khandelwal, A.; Nayak, G.; Kumar, V.; Nemani, R. R.; Oza, N.

    2014-12-01

    We present a novel technique for burned area mapping in forests using the Enhanced Vegetation Index (EVI) from the MODIS 16-day Level 3 1km Vegetation Indices (MOD13A2) and the Active Fire (AF) from the MODIS 8-day Level 3 1km Thermal Anomalies and Fire products (MOD14A2). The proposed method leverages the spatial and temporal co-occurrence of thermal anomalies and vegetation loss caused by forest fires to detect burned areas. Our approach derives features from the Enhanced Vegetation Index that target locations showing an abrupt change in their vegetation time series that takes at least several months to recover. One unique aspect of our approach is that it uses data from multiple months around the fire event and is therefore more robust to issues in data quality. Comparison with other burned area products shows that our approach detects several large previously undetected burned areas across multiple geographical regions. In particular, we found that our approach detects several large burned regions in the tropical forests of Indonesia and South America that had been missed by the state-of-the-art burned area approaches. For example, using our approach in Indonesia we discovered that the state-of-the-art MODIS Burned area product had missed around 20,000 sq. km. of burned area (nearly as much burned area as it has reported). We show that all these previously unreported burned areas detected by our approach are in fact sites of significant fires that suffered a large, abrupt loss in vegetation at the time of the fire event and took at least several months to recover to their normal vegetation. To evaluate these burned areas we compared the Landsat-based composites before and after the date of the event. Our Landsat analysis shows that the burned areas detected by the proposed approach are true burns with a very small error of commission. We believe our work has the potential to provide a scalable approach to global forest monitoring as well as reduce the

  2. Krylov subspace algorithms for computing GeneRank for the analysis of microarray data mining.

    PubMed

    Wu, Gang; Zhang, Ying; Wei, Yimin

    2010-04-01

    GeneRank is a new engine technology for the analysis of microarray experiments. It combines gene expression information with a network structure derived from gene annotations or expression profile correlations. Using matrix decomposition techniques, we first give a matrix analysis of the GeneRank model. We reformulate the GeneRank vector as a linear combination of three parts in the general case when the matrix in question is non-diagonalizable. We then propose two Krylov subspace methods for computing GeneRank. Numerical experiments show that, when the GeneRank problem is very large, the new algorithms are appropriate choices. PMID:20426695
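
    As a rough illustration of the model this record refers to, the sketch below computes a GeneRank vector on a toy network. It assumes the commonly cited formulation r = (1 - d) ex + d W D⁻¹ r, where W is a symmetric gene-gene adjacency matrix, D holds the node degrees, ex is the absolute expression change (normalized here to sum to 1, which is an assumption of this sketch) and d is a damping factor. It uses plain fixed-point iteration in pure Python, not the Krylov subspace methods proposed in the paper; the function name and the toy data are hypothetical.

```python
def generank(adjacency, expression, d=0.5, tol=1e-12, max_iter=1000):
    """Fixed-point iteration for r = (1 - d) * ex + d * W * D^-1 * r.

    adjacency  -- symmetric 0/1 matrix (list of lists), W in the formula
    expression -- per-gene expression change; absolute values are used
    d          -- damping factor in [0, 1); d = 0 ranks by expression only
    """
    n = len(expression)
    degree = [sum(row) for row in adjacency]
    total = sum(abs(x) for x in expression)
    if total == 0:
        raise ValueError("expression vector must not be all zeros")
    ex = [abs(x) / total for x in expression]  # normalization is an assumption

    r = ex[:]
    for _ in range(max_iter):
        new_r = []
        for i in range(n):
            # Each neighbour j passes on its rank, split across its connections.
            spread = sum(adjacency[i][j] * r[j] / degree[j]
                         for j in range(n) if degree[j] > 0)
            new_r.append((1 - d) * ex[i] + d * spread)
        if max(abs(a - b) for a, b in zip(new_r, r)) < tol:
            return new_r
        r = new_r
    return r


# Toy path network g0 - g1 - g2; only the two end genes change expression.
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
print(generank(W, [1.0, 0.0, 1.0], d=0.5))
```

    In this toy case the middle gene has no expression change of its own but still receives rank from its two neighbours, which is the behaviour the network term is meant to capture. For large networks a sparse linear solver would replace the dense double loop.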

  3. The impact of endurance exercise on global and AMPK gene-specific DNA methylation.

    PubMed

    King-Himmelreich, Tanya S; Schramm, Stefanie; Wolters, Miriam C; Schmetzer, Julia; Möser, Christine V; Knothe, Claudia; Resch, Eduard; Peil, Johannes; Geisslinger, Gerd; Niederberger, Ellen

    2016-05-27

    Alterations in gene expression as a consequence of physical exercise are frequently described. The mechanism of these regulations might depend on epigenetic changes in global or gene-specific DNA methylation levels. The AMP-activated protein kinase (AMPK) plays a key role in maintenance of energy homeostasis and is activated by increases in the AMP/ATP ratio as occurring in skeletal muscles after sporting activity. To analyze whether exercise has an impact on the methylation status of the AMPK promoter, we determined the AMPK methylation status in human blood samples from patients before and after sporting activity in the context of rehabilitation as well as in skeletal muscles of trained and untrained mice. Further, we examined long interspersed nuclear element 1 (LINE-1) as indicator of global DNA methylation changes. Our results revealed that light sporting activity in mice and humans does not alter global DNA methylation but has an effect on methylation of specific CpG sites in the AMPKα2 gene. These regulations were associated with a reduced AMPKα2 mRNA and protein expression in muscle tissue, pointing at a contribution of the methylation status to AMPK expression. Taken together, these results suggest that exercise influences AMPKα2 gene methylation in human blood and eminently in the skeletal muscle of mice and therefore might repress AMPKα2 gene expression. PMID:27103439

  4. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins.

    PubMed

    Rouillard, Andrew D; Gundersen, Gregory W; Fernandez, Nicolas F; Wang, Zichen; Monteiro, Caroline D; McDermott, Michael G; Ma'ayan, Avi

    2016-01-01

    Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ∼72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene-gene and attribute-attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. The Harmonizome is a comprehensive resource of knowledge about

  5. Integrated protein function prediction by mining function associations, sequences, and protein–protein and gene–gene interaction networks

    PubMed Central

    Cao, Renzhi; Cheng, Jianlin

    2016-01-01

    Motivations Protein function prediction is an important and challenging problem in bioinformatics and computational biology. Functionally relevant biological information such as protein sequences, gene expression, and protein–protein interactions has been used mostly separately for protein function prediction. One of the major challenges is how to effectively integrate multiple sources of both traditional and new information such as spatial gene–gene interaction networks generated from chromosomal conformation data together to improve protein function prediction. Results In this work, we developed three different probabilistic scores (MIS, SEQ, and NET score) to combine protein sequence, function associations, and protein–protein interaction and spatial gene–gene interaction networks for protein function prediction. The MIS score is mainly generated from homologous proteins found by PSI-BLAST search, and also association rules between Gene Ontology terms, which are learned by mining the Swiss-Prot database. The SEQ score is generated from protein sequences. The NET score is generated from protein–protein interaction and spatial gene–gene interaction networks. These three scores were combined in a new Statistical Multiple Integrative Scoring System (SMISS) to predict protein function. We tested SMISS on the data set of 2011 Critical Assessment of Function Annotation (CAFA). The method performed substantially better than three base-line methods and an advanced method based on protein profile–sequence comparison, profile–profile comparison, and domain co-occurrence networks according to the maximum F-measure. PMID:26370280

  6. USING PharmGKB TO TRAIN TEXT MINING APPROACHES FOR IDENTIFYING POTENTIAL GENE TARGETS FOR PHARMACOGENOMIC STUDIES

    PubMed Central

    PAKHOMOV, S.; MCINNES, B.T.; LAMBA, J.; LIU, Y.; MELTON, G.B.; GHODKE, Y.; BHISE, N.; LAMBA, V.; BIRNBAUM, A.K.

    2012-01-01

    The main objective of this study was to investigate the feasibility of using PharmGKB, a pharmacogenomic database, as a source of training data in combination with text of MEDLINE abstracts for a text mining approach to identification of potential gene targets for pathway-driven pharmacogenomics research. We used the manually curated relations between drugs and genes in the PharmGKB database to train a support vector machine predictive model and applied this model prospectively to MEDLINE abstracts. The gene targets suggested by this approach were subsequently manually reviewed. Our quantitative analysis showed that support vector machine classifiers trained on MEDLINE abstracts, with single words (unigrams) used as features and PharmGKB relations used for supervision, achieve an overall sensitivity of 85% and specificity of 69%. The subsequent qualitative analysis showed that gene targets “suggested” by the automatic classifier were not anticipated by expert reviewers but were subsequently found to be relevant to the three drugs that were investigated: carbamazepine, lamivudine and zidovudine. Our results show that this approach is not only feasible but may also find new gene targets not identifiable by other methods, thus making it a valuable tool for pathway-driven pharmacogenomics research. PMID:22564551

  7. GEM-TREND: a web tool for gene expression data mining toward relevant network discovery

    PubMed Central

    Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

    2009-01-01

    Background DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. Results GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. 
    For further analysis, a network visualization interface is also provided, whereby genes and gene annotations
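    The nonparametric, rank-based pattern matching mentioned in this record can be sketched with a Kolmogorov-Smirnov-style statistic of the kind described for the Connectivity Map. This is an illustrative sketch only: the function names are hypothetical, and neither GEM-TREND's implementation nor its statistical significance calculation is reproduced.

```python
from typing import Iterable, Sequence


def ks_score(ranked_genes: Sequence[str], tag_genes: Iterable[str]) -> float:
    """Signed KS-like enrichment of tag_genes within a ranked gene list.

    ranked_genes is ordered from most up-regulated to most down-regulated.
    Positive values mean the tags sit near the top, negative near the bottom.
    """
    n = len(ranked_genes)
    position = {gene: i + 1 for i, gene in enumerate(ranked_genes)}
    v = sorted(position[g] for g in set(tag_genes) if g in position)
    t = len(v)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v[j] / n for j in range(t))
    b = max(v[j] / n - j / t for j in range(t))
    return a if a > b else -b


def connectivity(ranked_genes: Sequence[str],
                 up_tags: Iterable[str],
                 down_tags: Iterable[str]) -> float:
    """Raw (unnormalized) score combining the up and down parts of a signature."""
    ks_up = ks_score(ranked_genes, up_tags)
    ks_down = ks_score(ranked_genes, down_tags)
    if ks_up * ks_down > 0:  # same sign: no consistent match
        return 0.0
    return ks_up - ks_down
```

    A query signature whose up-regulated genes rank near the top of a stored profile and whose down-regulated genes rank near the bottom yields a large positive score; the reversed pattern yields a large negative one.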

  8. Global variability in gene expression and alternative splicing is modulated by mitochondrial content.

    PubMed

    Guantes, Raul; Rastrojo, Alberto; Neves, Ricardo; Lima, Ana; Aguado, Begoña; Iborra, Francisco J

    2015-05-01

    Noise in gene expression is a main determinant of phenotypic variability. Increasing experimental evidence suggests that genome-wide cellular constraints largely contribute to the heterogeneity observed in gene products. It is still unclear, however, which global factors affect gene expression noise and to what extent. Since eukaryotic gene expression is an energy demanding process, differences in the energy budget of each cell could determine gene expression differences. Here, we quantify the contribution of mitochondrial variability (a natural source of ATP variation) to global variability in gene expression. We find that changes in mitochondrial content can account for ∼50% of the variability observed in protein levels. This is the combined result of the effect of mitochondria dosage on transcription and translation apparatus content and activities. Moreover, we find that mitochondrial levels have a large impact on alternative splicing, thus modulating both the abundance and type of mRNAs. A simple mathematical model in which mitochondrial content simultaneously affects transcription rate and splicing site choice can explain the alternative splicing data. The results of this study show that mitochondrial content (and/or probably function) influences mRNA abundance, translation, and alternative splicing, which ultimately affects cellular phenotype. PMID:25800673

  9. Global effects on gene expression in fission yeast by silencing and RNA interference machineries.

    PubMed

    Hansen, Klavs R; Burns, Gavin; Mata, Juan; Volpe, Thomas A; Martienssen, Robert A; Bähler, Jürg; Thon, Geneviève

    2005-01-01

    Histone modifications influence gene expression in complex ways. The RNA interference (RNAi) machinery can repress transcription by recruiting histone-modifying enzymes to chromatin, although it is not clear whether this is a general mechanism for gene silencing or whether it requires repeated sequences such as long terminal repeats (LTRs). We analyzed the global effects of the Clr3 and Clr6 histone deacetylases, the Clr4 methyltransferase, the zinc finger protein Clr1, and the RNAi proteins Dicer, RdRP, and Argonaute on the transcriptome of Schizosaccharomyces pombe (fission yeast). The clr mutants derepressed similar subsets of genes, many of which also became transcriptionally activated in cells that were exposed to environmental stresses such as nitrogen starvation. Many genes that were repressed by the Clr proteins clustered in extended regions close to the telomeres. Surprisingly few genes were repressed by both the silencing and RNAi machineries, with transcripts from centromeric repeats and Tf2 retrotransposons being notable exceptions. We found no correlation between repression by RNAi and proximity to LTRs, and the wtf family of repeated sequences seems to be repressed by histone deacetylation independent of RNAi. Our data indicate that the RNAi and Clr proteins show only a limited functional overlap and that the Clr proteins play more global roles in gene silencing. PMID:15632061

  10. Global gene expression of Poncirus trifoliata, Citrus sunki and their hybrids under infection of Phytophthora parasitica

    PubMed Central

    2011-01-01

    Background Gummosis and root rot caused by Phytophthora are among the most economically important diseases in citrus. Four F1 hybrids resistant to P. parasitica (Pool R) and four susceptible F1 hybrids (Pool S) were selected from a cross between susceptible Citrus sunki and resistant Poncirus trifoliata cv. Rubidoux. We investigated gene expression in pools of four resistant and four susceptible hybrids in comparison with their parents 48 hours after P. parasitica inoculation. We proposed that genes differentially expressed between resistant and susceptible parents and between their resistant and susceptible hybrids provide promising candidates for identifying transcripts involved in disease resistance. A microarray containing 62,876 UniGene transcripts selected from the CitEST database and prepared by NimbleGen Systems was used for analyzing global gene expression 48 hours after infection with P. parasitica. Results Three pairwise comparisons (P. trifoliata/C. sunki, Pool R/C. sunki and Pool R/Pool S) were performed. With a filter of false-discovery rate less than 0.05 and fold change greater than 3.0, 21 UniGene transcripts common to the three pairwise comparisons were found to be up-regulated, and 3 UniGene transcripts were down-regulated. Our results indicated that the selected transcripts were probably involved in the whole process of plant defense responses to pathogen attack, including transcriptional regulation, signaling, activation of defense genes participating in HR, single dominant genes (R gene) such as TIR-NBS-LRR and RPS4 and switch of defense-related metabolism pathway. Differentially expressed genes were validated by RT-qPCR in susceptible and resistant plants and between inoculated and uninoculated control plants. Conclusions Twenty-four UniGene transcripts were identified as candidate genes for Citrus response to P. parasitica. UniGene transcripts were likely to be involved in disease resistance, such as genes potentially
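    The selection rule stated in this record (false-discovery rate below 0.05, fold change above 3.0, transcript shared by all three pairwise comparisons) can be sketched as below. The data layout and function names are assumptions; in particular, fold change is represented here as an expression ratio, so down-regulation is taken to mean a ratio below 1/3, which the abstract does not spell out.

```python
from typing import Dict, List, Set, Tuple

# transcript ID -> (fold-change ratio, FDR); this layout is hypothetical
Comparison = Dict[str, Tuple[float, float]]


def regulated(comparison: Comparison,
              max_fdr: float = 0.05,
              min_fold: float = 3.0) -> Tuple[Set[str], Set[str]]:
    """Return (up, down) transcript sets passing the FDR and fold-change filter."""
    up, down = set(), set()
    for transcript, (ratio, fdr) in comparison.items():
        if fdr >= max_fdr:
            continue
        if ratio > min_fold:
            up.add(transcript)
        elif ratio < 1.0 / min_fold:
            down.add(transcript)
    return up, down


def common_regulated(comparisons: List[Comparison],
                     max_fdr: float = 0.05,
                     min_fold: float = 3.0) -> Tuple[Set[str], Set[str]]:
    """Transcripts up- or down-regulated in every comparison."""
    if not comparisons:
        return set(), set()
    results = [regulated(c, max_fdr, min_fold) for c in comparisons]
    ups = [up for up, _ in results]
    downs = [down for _, down in results]
    return set.intersection(*ups), set.intersection(*downs)
```

    Requiring a transcript to pass the filter in all three comparisons is what narrows tens of thousands of probes to a short candidate list such as the 21 up-regulated and 3 down-regulated transcripts reported above.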

  11. Global Gene-Expression Analysis to Identify Differentially Expressed Genes Critical for the Heat Stress Response in Brassica rapa.

    PubMed

    Dong, Xiangshu; Yi, Hankuil; Lee, Jeongyeo; Nou, Ill-Sup; Han, Ching-Tack; Hur, Yoonkang

    2015-01-01

    Genome-wide dissection of the heat stress response (HSR) is necessary to overcome problems in crop production caused by global warming. To identify HSR genes, we profiled gene expression in two Chinese cabbage inbred lines with different thermotolerances, Chiifu and Kenshin. Many genes exhibited >2-fold changes in expression upon exposure to 0.5–4 h at 45°C (high temperature, HT): 5.2% (2,142 genes) in Chiifu and 3.7% (1,535 genes) in Kenshin. The most enriched GO (Gene Ontology) items included 'response to heat', 'response to reactive oxygen species (ROS)', 'response to temperature stimulus', 'response to abiotic stimulus', and 'MAPKKK cascade'. In both lines, the genes most highly induced by HT encoded small heat shock proteins (Hsps) and heat shock factor (Hsf)-like proteins such as HsfB2A (Bra029292), whereas high-molecular weight Hsps were constitutively expressed. Other upstream HSR components were also up-regulated: ROS-scavenging genes like glutathione peroxidase 2 (BrGPX2, Bra022853), protein kinases, and phosphatases. Among heat stress (HS) marker genes in Arabidopsis, only exportin 1A (XPO1A) (Bra008580, Bra006382) can be applied to B. rapa as a marker gene for basal thermotolerance (BT) and short-term acquired thermotolerance (SAT). CYP707A3 (Bra025083, Bra021965), which is involved in the dehydration response in Arabidopsis, was associated with membrane leakage in both lines following HS. Although many transcription factor (TF) genes, including DREB2A (Bra005852), were involved in HS tolerance in both lines, Bra024224 (MYB41) and Bra021735 (a bZIP/AIR1 [Anthocyanin-Impaired-Response-1]) were specific to Kenshin. Several candidate TFs involved in thermotolerance were confirmed as HSR genes by real-time PCR, and these assignments were further supported by promoter analysis. Although some of our findings are similar to those obtained using other plant species, clear differences in Brassica rapa reveal a distinct HSR in this species. Our data could also provide a

  12. Global Gene-Expression Analysis to Identify Differentially Expressed Genes Critical for the Heat Stress Response in Brassica rapa

    PubMed Central

    Dong, Xiangshu; Yi, Hankuil; Lee, Jeongyeo; Nou, Ill-Sup; Han, Ching-Tack; Hur, Yoonkang

    2015-01-01

    Genome-wide dissection of the heat stress response (HSR) is necessary to overcome problems in crop production caused by global warming. To identify HSR genes, we profiled gene expression in two Chinese cabbage inbred lines with different thermotolerances, Chiifu and Kenshin. Many genes exhibited >2-fold changes in expression upon exposure to 0.5–4 h at 45°C (high temperature, HT): 5.2% (2,142 genes) in Chiifu and 3.7% (1,535 genes) in Kenshin. The most enriched GO (Gene Ontology) items included ‘response to heat’, ‘response to reactive oxygen species (ROS)’, ‘response to temperature stimulus’, ‘response to abiotic stimulus’, and ‘MAPKKK cascade’. In both lines, the genes most highly induced by HT encoded small heat shock proteins (Hsps) and heat shock factor (Hsf)-like proteins such as HsfB2A (Bra029292), whereas high-molecular weight Hsps were constitutively expressed. Other upstream HSR components were also up-regulated: ROS-scavenging genes like glutathione peroxidase 2 (BrGPX2, Bra022853), protein kinases, and phosphatases. Among heat stress (HS) marker genes in Arabidopsis, only exportin 1A (XPO1A) (Bra008580, Bra006382) can be applied to B. rapa as a marker gene for basal thermotolerance (BT) and short-term acquired thermotolerance (SAT). CYP707A3 (Bra025083, Bra021965), which is involved in the dehydration response in Arabidopsis, was associated with membrane leakage in both lines following HS. Although many transcription factor (TF) genes, including DREB2A (Bra005852), were involved in HS tolerance in both lines, Bra024224 (MYB41) and Bra021735 (a bZIP/AIR1 [Anthocyanin-Impaired-Response-1]) were specific to Kenshin. Several candidate TFs involved in thermotolerance were confirmed as HSR genes by real-time PCR, and these assignments were further supported by promoter analysis. Although some of our findings are similar to those obtained using other plant species, clear differences in Brassica rapa reveal a distinct HSR in this species. Our data

  13. Global Regulation of Gene Expression by the MafR Protein of Enterococcus faecalis.

    PubMed

    Ruiz-Cruz, Sofía; Espinosa, Manuel; Goldmann, Oliver; Bravo, Alicia

    2015-01-01

    Enterococcus faecalis is a natural inhabitant of the human gastrointestinal tract. However, as an opportunistic pathogen, it is able to colonize other host niches and cause life-threatening infections. Its adaptation to new environments involves global changes in gene expression. The EF3013 gene (here named mafR) of E. faecalis strain V583 encodes a protein (MafR, 482 residues) that has sequence similarity to global response regulators of the Mga/AtxA family. The enterococcal OG1RF genome also encodes the MafR protein (gene OG1RF_12293). In this work, we have identified the promoter of the mafR gene using several in vivo approaches. Moreover, we show that MafR influences positively the transcription of many genes on a genome-wide scale. The most significant target genes encode components of PTS-type membrane transporters, components of ABC-type membrane transporters, and proteins involved in the metabolism of carbon sources. Some of these genes were previously reported to be up-regulated during the growth of E. faecalis in blood and/or in human urine. Furthermore, we show that a mafR deletion mutant strain induces a significant lower degree of inflammation in the peritoneal cavity of mice, suggesting that enterococcal cells deficient in MafR are less virulent. Our work indicates that MafR is a global transcriptional regulator. It might facilitate the adaptation of E. faecalis to particular host niches and, therefore, contribute to its potential virulence. PMID:26793169

  14. Global Regulation of Gene Expression by the MafR Protein of Enterococcus faecalis

    PubMed Central

    Ruiz-Cruz, Sofía; Espinosa, Manuel; Goldmann, Oliver; Bravo, Alicia

    2016-01-01

    Enterococcus faecalis is a natural inhabitant of the human gastrointestinal tract. However, as an opportunistic pathogen, it is able to colonize other host niches and cause life-threatening infections. Its adaptation to new environments involves global changes in gene expression. The EF3013 gene (here named mafR) of E. faecalis strain V583 encodes a protein (MafR, 482 residues) that has sequence similarity to global response regulators of the Mga/AtxA family. The enterococcal OG1RF genome also encodes the MafR protein (gene OG1RF_12293). In this work, we have identified the promoter of the mafR gene using several in vivo approaches. Moreover, we show that MafR influences positively the transcription of many genes on a genome-wide scale. The most significant target genes encode components of PTS-type membrane transporters, components of ABC-type membrane transporters, and proteins involved in the metabolism of carbon sources. Some of these genes were previously reported to be up-regulated during the growth of E. faecalis in blood and/or in human urine. Furthermore, we show that a mafR deletion mutant strain induces a significant lower degree of inflammation in the peritoneal cavity of mice, suggesting that enterococcal cells deficient in MafR are less virulent. Our work indicates that MafR is a global transcriptional regulator. It might facilitate the adaptation of E. faecalis to particular host niches and, therefore, contribute to its potential virulence. PMID:26793169

  15. Global Gene Expression Profiling in Lung Tissues of Rat Exposed to Lunar Dust Particles

    NASA Technical Reports Server (NTRS)

    Yeshitla, Samrawit A.; Lam, Chiu-Wing; Kidane, Yared H.; Feiveson, Alan H.; Ploutz-Snyder, Robert; Wu, Honglu; James, John T.; Meyers, Valerie E.; Zhang, Ye

    2014-01-01

    The Moon's surface is covered by a layer of fine, potentially reactive dust. Lunar dust contains about 1-2% respirable very fine dust (less than 3 micrometers). The habitable area of any lunar landing vehicle and outpost would inevitably be contaminated with lunar dust that could pose a health risk. The purpose of the study is to analyze the dynamics of global gene expression changes in lung tissues of rats exposed to lunar dust particles. F344 rats were exposed for 4 weeks (6 h/d; 5 d/wk) in nose-only inhalation chambers to concentrations of 0 (control air), 2.1, 6.8, 21, and 61 mg/m3 of lunar dust. Animals were euthanized at 1 day and 13 weeks after the last inhalation exposure. After being lavaged, lung tissue from each animal was collected and total RNA was isolated. Four samples of each dose group were analyzed using an Agilent Rat GE v3 microarray to profile global gene expression of 44K transcripts. After background subtraction, normalization, and log transformation, t tests were used to compare the mean expression levels of each exposed group to the control group. Correction for multiple testing was made using the method of Benjamini, Krieger, and Yekutieli (1) to control the false discovery rate. Genes with significant changes of at least 1.75 fold were identified as genes of interest. Both low and high doses of lunar dust caused dramatic, dose-dependent global gene expression changes in the lung tissues. However, the responses of lung tissue to low-dose lunar dust are distinguished from those of high doses, especially those associated with 61 mg/m3 dust exposure. The data were further integrated into the Ingenuity system to analyze the gene ontology (GO), pathway distribution and putative upstream regulators and gene targets. Multiple pathways, functions, and upstream regulators have been identified in response to lunar dust-induced damage in the lung tissue.
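    The multiple-testing step in this record can be illustrated with a short sketch of false discovery rate control. The abstract does not say which Benjamini-Krieger-Yekutieli variant was used, so the two-stage adaptive step-up procedure shown here is an assumption, and the function names are hypothetical. The first function is the ordinary Benjamini-Hochberg step-up rule on which the two-stage procedure builds.

```python
from typing import List, Sequence


def bh_reject(pvalues: Sequence[float], q: float) -> List[bool]:
    """Benjamini-Hochberg linear step-up: reject the k smallest p-values,
    where k is the largest rank with p_(k) <= k * q / m."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def two_stage_reject(pvalues: Sequence[float], q: float = 0.05) -> List[bool]:
    """Two-stage adaptive step-up (assumed variant of the cited method).

    Stage 1 runs BH at q / (1 + q) and uses the number of rejections to
    estimate how many null hypotheses are true; stage 2 reruns BH at a
    level rescaled by that estimate.
    """
    m = len(pvalues)
    q1 = q / (1.0 + q)
    stage1 = bh_reject(pvalues, q1)
    r1 = sum(stage1)
    if r1 == 0 or r1 == m:
        return stage1
    m0_hat = m - r1  # estimated number of true nulls
    return bh_reject(pvalues, q1 * m / m0_hat)
```

    When many hypotheses are clearly non-null, the second stage uses a more generous level than plain Benjamini-Hochberg, which is why an adaptive procedure can flag a few additional genes at the same nominal false discovery rate.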

  16. Mining Gene Expression Data for Pollutants (Dioxin, Toluene, Formaldehyde) and Low Dose of Gamma-Irradiation

    PubMed Central

    Moskalev, Alexey; Shaposhnikov, Mikhail; Snezhkina, Anastasia; Kogan, Valeria; Plyusnina, Ekaterina; Peregudova, Darya; Melnikova, Nataliya; Uroshlev, Leonid; Mylnikov, Sergey; Dmitriev, Alexey; Plusnin, Sergey; Fedichev, Peter; Kudryavtseva, Anna

    2014-01-01

    General and specific effects of molecular genetic responses to adverse environmental factors are not well understood. This study examines genome-wide gene expression profiles of Drosophila melanogaster in response to ionizing radiation, formaldehyde, toluene, and 2,3,7,8-tetrachlorodibenzo-p-dioxin. We performed RNA-seq analysis on 25,415 transcripts to measure the change in gene expression in males and females separately. An analysis of the genes unique to each treatment yielded a list of genes as a gene expression signature. In the case of radiation exposure, both sexes exhibited a reproducible increase in their expression of the transcription factors sugarbabe and tramtrack. The influence of dioxin up-regulated metabolic genes, such as anachronism, CG16727, and several genes with unknown function. Toluene activated a gene involved in the response to the toxins, Cyp12d1-p; the transcription factor Fer3’s gene; the metabolic genes CG2065, CG30427, and CG34447; and the genes Spn28Da and Spn3, which are responsible for reproduction and immunity. All significantly differentially expressed genes, including those shared among the stressors, can be divided into gene groups using Gene Ontology Biological Process identifiers. These gene groups are related to defense response, biological regulation, the cell cycle, metabolic process, and circadian rhythms. KEGG molecular pathway analysis revealed alteration of the Notch signaling pathway, TGF-beta signaling pathway, proteasome, basal transcription factors, nucleotide excision repair, Jak-STAT signaling pathway, circadian rhythm, Hippo signaling pathway, mTOR signaling pathway, ribosome, mismatch repair, RNA polymerase, mRNA surveillance pathway, Hedgehog signaling pathway, and DNA replication genes. Females and, to a lesser extent, males actively metabolize xenobiotics by the action of cytochrome P450 when under the influence of dioxin and toluene. Finally, in this work we obtained gene expression signatures pollutants

  17. Mining gene expression data for pollutants (dioxin, toluene, formaldehyde) and low dose of gamma-irradiation.

    PubMed

    Moskalev, Alexey; Shaposhnikov, Mikhail; Snezhkina, Anastasia; Kogan, Valeria; Plyusnina, Ekaterina; Peregudova, Darya; Melnikova, Nataliya; Uroshlev, Leonid; Mylnikov, Sergey; Dmitriev, Alexey; Plusnin, Sergey; Fedichev, Peter; Kudryavtseva, Anna

    2014-01-01

    General and specific effects of molecular genetic responses to adverse environmental factors are not well understood. This study examines genome-wide gene expression profiles of Drosophila melanogaster in response to ionizing radiation, formaldehyde, toluene, and 2,3,7,8-tetrachlorodibenzo-p-dioxin. We performed RNA-seq analysis on 25,415 transcripts to measure the change in gene expression in males and females separately. An analysis of the genes unique to each treatment yielded a list of genes as a gene expression signature. In the case of radiation exposure, both sexes exhibited a reproducible increase in their expression of the transcription factors sugarbabe and tramtrack. The influence of dioxin up-regulated metabolic genes, such as anachronism, CG16727, and several genes with unknown function. Toluene activated a gene involved in the response to the toxins, Cyp12d1-p; the transcription factor Fer3's gene; the metabolic genes CG2065, CG30427, and CG34447; and the genes Spn28Da and Spn3, which are responsible for reproduction and immunity. All significantly differentially expressed genes, including those shared among the stressors, can be divided into gene groups using Gene Ontology Biological Process identifiers. These gene groups are related to defense response, biological regulation, the cell cycle, metabolic process, and circadian rhythms. KEGG molecular pathway analysis revealed alteration of the Notch signaling pathway, TGF-beta signaling pathway, proteasome, basal transcription factors, nucleotide excision repair, Jak-STAT signaling pathway, circadian rhythm, Hippo signaling pathway, mTOR signaling pathway, ribosome, mismatch repair, RNA polymerase, mRNA surveillance pathway, Hedgehog signaling pathway, and DNA replication genes. Females and, to a lesser extent, males actively metabolize xenobiotics by the action of cytochrome P450 when under the influence of dioxin and toluene. Finally, in this work we obtained gene expression signatures pollutants

  18. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins

    PubMed Central

    Rouillard, Andrew D.; Gundersen, Gregory W.; Fernandez, Nicolas F.; Wang, Zichen; Monteiro, Caroline D.; McDermott, Michael G.; Ma’ayan, Avi

    2016-01-01

    Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ∼72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene–gene and attribute–attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. The Harmonizome is a comprehensive resource of knowledge

  19. Global Analysis of WRKY Genes and Their Response to Dehydration and Salt Stress in Soybean

    PubMed Central

    Song, Hui; Wang, Pengfei; Hou, Lei; Zhao, Shuzhen; Zhao, Chuanzhi; Xia, Han; Li, Pengcheng; Zhang, Ye; Bian, Xiaotong; Wang, Xingjun

    2016-01-01

    WRKY proteins are plant-specific transcription factors involved in various developmental and physiological processes, especially in biotic and abiotic stress resistance. Although previous studies suggested that WRKY proteins in soybean (Glycine max var. Williams 82) are involved in both abiotic and biotic stress responses, global information on WRKY proteins in the latest version of the soybean genome (Wm82.a2v1) and their response to dehydration and salt stress has not been reported. In this study, we identified 176 GmWRKY proteins from the soybean Wm82.a2v1 genome. These proteins could be classified into three groups, namely group I (32 proteins), group II (120 proteins), and group III (24 proteins). Our results showed that most GmWRKY genes were located on chromosome 6, while chromosomes 11, 12, and 20 contained the fewest members of this gene family. More GmWRKY genes were distributed at the ends of chromosomes than in other regions. The cis-acting element analysis suggested that GmWRKY genes were transcriptionally regulated upon dehydration and salt stress. RNA-seq data analysis indicated that three GmWRKY genes responded negatively to dehydration, and 12 genes positively responded to salt stress at 1, 6, and 12 h, respectively. We confirmed by qRT-PCR that the expression of the GmWRKY47 and GmWRKY58 genes was decreased upon dehydration, and the expression of the GmWRKY92, GmWRKY144 and GmWRKY165 genes was increased under salt treatment. PMID:26870047

  20. Mining Disease-Resistance Genes in Roses: Functional and Molecular Characterization of the Rdr1 Locus

    PubMed Central

    Terefe-Ayana, Diro; Yasmin, Aneela; Le, Thanh Loan; Kaufmann, Helgard; Biber, Anja; Kühr, Astrid; Linde, Marcus; Debener, Thomas

    2011-01-01

    The interaction of roses with the leaf spot pathogen Diplocarpon rosae (the cause of black spot on roses) is an interesting pathosystem because it involves a long-lived woody perennial, with life history traits very different from most model plants, and a hemibiotrophic pathogen with moderate levels of gene flow. Here we present data on the molecular structure of the first monogenic dominant resistance gene from roses, Rdr1, directed against one isolate of D. rosae. Complete sequencing of the locus carrying the Rdr1 gene resulted in a sequence of 265,477 bp with a cluster of nine highly related TIR–NBS–LRR (TNL) candidate genes. After sequencing revealed candidate genes for Rdr1, we implemented a gene expression analysis and selected five genes out of the nine TNLs. We then silenced the whole TNL gene family using RNAi (Rdr1–RNAi) constructed from the most conserved sequence region and demonstrated a loss of resistance in the normally resistant genotype. To identify the functional TNL gene, we further screened the five TNL candidate genes with a transient leaf infiltration assay. The transient expression assay indicated a single TNL gene (muRdr1H), partially restoring resistance in the susceptible genotype. Rdr1 was found to localize within the muRdr1 gene family; the genes within this locus contain characteristic motifs of active TNL genes and belong to a young cluster of R genes. The transient leaf assay can be used to further analyze the rose black spot interaction and its evolution, extending the analyses to additional R genes and to additional pathogenic types of the pathogen. PMID:22639591

  1. Global adaptive rank truncated product method for gene-set analysis in association studies.

    PubMed

    Vilor-Tejedor, Natalia; Calle, M Luz

    2014-08-01

    Gene set analysis (GSA) aims to assess the overall association of a set of genetic variants with a phenotype and has the potential to detect subtle effects of variants in a gene or a pathway that might be missed when assessed individually. We present a new implementation of the Adaptive Rank Truncated Product method (ARTP) for analyzing the association of a set of Single Nucleotide Polymorphisms (SNPs) in a gene or pathway. The new implementation, referred to as globalARTP, improves the original one by allowing the different SNPs in the set to have different modes of inheritance. We perform a simulation study for exploring the power of the proposed methodology in a set of scenarios with different numbers of causal SNPs with different effect sizes. Moreover, we show the advantage of using the gene set approach in the context of an Alzheimer's disease case-control study where we explore the endocytosis pathway. The new method is implemented in the R function globalARTP of the globalGSA package available at http://cran.r-project.org. PMID:25082012
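    The adaptive rank truncated product idea behind this record can be sketched as follows. This is not the globalARTP code from the globalGSA package: the function name and input layout are hypothetical, the per-SNP p-values (observed and from phenotype permutations) are assumed to be computed beforehand, and the handling of different modes of inheritance that distinguishes globalARTP is not shown.

```python
import math
from typing import List, Sequence


def artp_pvalue(observed: Sequence[float],
                permuted: Sequence[Sequence[float]],
                truncation_points: Sequence[int]) -> float:
    """Adaptive rank truncated product p-value for one SNP set (sketch).

    observed:  per-SNP p-values from the real data
    permuted:  one list of per-SNP p-values per phenotype permutation
    truncation_points: candidate numbers K of smallest p-values to combine
    """
    datasets: List[Sequence[float]] = [observed] + list(permuted)
    n_sets = len(datasets)  # B + 1, with index 0 the observed data

    def log_w(pvals: Sequence[float], k: int) -> float:
        # log of the product of the k smallest p-values
        return sum(math.log(p) for p in sorted(pvals)[:k])

    min_p = [1.0] * n_sets
    for k in truncation_points:
        stats = [log_w(d, k) for d in datasets]
        for b in range(n_sets):
            # empirical p-value of dataset b at truncation point k
            s = sum(1 for w in stats if w <= stats[b]) / n_sets
            min_p[b] = min(min_p[b], s)
    # adjust the observed minimum for having searched over several K
    return sum(1 for mp in min_p if mp <= min_p[0]) / n_sets
```

    Taking the minimum over truncation points lets the data choose how many SNPs to combine, and comparing that minimum against the same quantity in each permuted data set corrects for the selection.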

  2. Expression Analysis of Ni- and V-Associated Resistance Genes in a Bacillus megaterium Strain Isolated from a Mining Site.

    PubMed

    Fierros Romero, Grisel; Rivas Castillo, Andrea; Gómez Ramírez, Marlenne; Pless, Reynaldo; Rojas Avelizapa, Norma

    2016-08-01

    Bacillus megaterium strain MNSH1-9K-1 was isolated from a mining site in Guanajuato, Mexico. This B. megaterium strain presented the ability to remove Ni and V from a spent catalyst. Also, its associated metal resistance genes nccA, hant, VAN2, and smtAB were previously identified by a PCR approach. The present study reports for the first time, in B. megaterium, the changes in the expression of the genes nccA (Ni-Co-Cd resistance); hant (high-affinity nickel transporter); smtAB, a metal-binding protein gene; and VAN2 (V resistance) after exposure to 200 ppm of Ni and 200 ppm of V during the stationary phase of the microorganism in PHGII liquid media. The data presented here may contribute to the knowledge of the genes involved in the Ni and V resistances of B. megaterium, and the possible pathways implicated in the Ni-V removal processes, which may be potentiated for the biological treatment of high metal content residues. PMID:27107759

  3. SSH gene expression profile of Eisenia andrei exposed in situ to a naturally contaminated soil from an abandoned uranium mine.

    PubMed

    Lourenço, Joana; Pereira, Ruth; Gonçalves, Fernando; Mendo, Sónia

    2013-02-01

    The effects of exposing earthworms (Eisenia andrei) to contaminated soil from an abandoned uranium mine were assessed through gene expression profiling by Suppression Subtractive Hybridization (SSH). Organisms were exposed in situ for 56 days in containers placed at both a contaminated site and a non-contaminated reference site. Organisms were sampled after 14 and 56 days of exposure. Results showed that the main physiological functions affected by the exposure to metals and radionuclides were metabolism, oxidoreductase activity, redox homeostasis, and response to chemical stimulus and stress. The relative expression of NADH dehydrogenase subunit 1 and elongation factor 1 alpha was also affected, since the genes encoding these enzymes were significantly up- and down-regulated after 14 and 56 days of exposure, respectively. Also, an EST with homology to the SET oncogene was found to be up-regulated. To the best of our knowledge, this is the first time that this gene has been identified in earthworms; further studies are therefore required to clarify its involvement in the toxicity of metals and radionuclides. Considering the results presented here, gene expression profiling proved to be a very useful tool for detecting earthworms' underlying responses to metal and radionuclide exposure, pointing toward the detection and development of potential new biomarkers. PMID:23164450

  4. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

    PubMed

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H

    2015-07-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. PMID:25948579

  5. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    PubMed Central

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A.; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H.

    2015-01-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. PMID:25948579

  6. Global Analysis of the Human Pathophenotypic Similarity Gene Network Merges Disease Module Components

    PubMed Central

    Reyes-Palomares, Armando; Rodríguez-López, Rocío; Ranea, Juan A. G.; Jiménez, Francisca Sánchez; Medina, Miguel Angel

    2013-01-01

    The molecular complexity of genetic diseases requires novel approaches to break it down into coherent biological modules. For this purpose, many disease network models have been created and analyzed. We highlight two of them, “the human diseases networks” (HDN) and “the orphan disease networks” (ODN). However, in these models, each single node represents one disease or an ambiguous group of diseases. In these cases, the notion of diseases as unique entities reduces the usefulness of network-based methods. We hypothesize that using the clinical features (pathophenotypes) to define pathophenotypic connections between disease-causing genes improves our understanding of the molecular events originated by genetic disturbances. For this, we have built a pathophenotypic similarity gene network (PSGN) and compared it with the unipartite projections (based on gene-to-gene edges) similar to those used in previous network models (HDN and ODN). Unlike these disease network models, the PSGN uses semantic similarities. This pathophenotypic similarity has been calculated by comparing pathophenotypic annotations of genes (human abnormalities of HPO terms) in the “Human Phenotype Ontology”. The resulting network contains 1075 genes (nodes) and 26197 significant pathophenotypic similarities (edges). A global analysis of this network reveals: unnoticed pairs of genes showing significant pathophenotypic similarity, a biologically meaningful re-arrangement of the pathological relationships between genes, correlations of biochemical interactions with higher similarity scores and functional biases in metabolic and essential genes toward the pathophenotypic specificity and the pleiotropy, respectively. Additionally, pathophenotypic similarities and metabolic interactions of genes associated with maple syrup urine disease (MSUD) have been used to merge into a coherent pathological module. Our results indicate that pathophenotypes contribute to identify underlying co

  7. Global assessment of imprinted gene expression in the bovine conceptus by next generation sequencing

    PubMed Central

    Chen, Zhiyuan; Hagen, Darren E.; Wang, Juanbin; Elsik, Christine G.; Ji, Tieming; Siqueira, Luiz G.; Hansen, Peter J.; Rivera, Rocío M.

    2016-01-01

    Genomic imprinting is an epigenetic mechanism that leads to parental-allele-specific gene expression. Approximately 150 imprinted genes have been identified in humans and mice but less than 30 have been described as imprinted in cattle. For the purpose of de novo identification of imprinted genes in bovine, we determined global monoallelic gene expression in brain, skeletal muscle, liver, kidney and placenta of day ∼105 Bos taurus indicus × Bos taurus taurus F1 conceptuses using RNA sequencing. To accomplish this, we developed a bioinformatics pipeline to identify parent-specific single nucleotide polymorphism alleles after filtering adenosine to inosine (A-to-I) RNA editing sites. We identified 53 genes subject to monoallelic expression. Twenty three are genes known to be imprinted in the cow and an additional 7 have previously been characterized as imprinted in human and/or mouse that have not been reported as imprinted in cattle. Of the remaining 23 genes, we found that 10 are uncharacterized or unannotated transcripts located in known imprinted clusters, whereas the other 13 genes are distributed throughout the bovine genome and are not close to any known imprinted clusters. To exclude potential cis-eQTL effects on allele expression, we corroborated the parental specificity of monoallelic expression in day 86 Bos taurus taurus × Bos taurus taurus conceptuses and identified 8 novel bovine imprinted genes. Further, we identified 671 candidate A-to-I RNA editing sites and describe random X-inactivation in day 15 bovine extraembryonic membranes. Our results expand the imprinted gene list in bovine and demonstrate that monoallelic gene expression can be the result of cis-eQTL effects. PMID:27245094

  8. Global analysis of the human pathophenotypic similarity gene network merges disease module components.

    PubMed

    Reyes-Palomares, Armando; Rodríguez-López, Rocío; Ranea, Juan A G; Sánchez-Jiménez, Francisca; Medina, Miguel Angel

    2013-01-01

    The molecular complexity of genetic diseases requires novel approaches to break it down into coherent biological modules. For this purpose, many disease network models have been created and analyzed. We highlight two of them, "the human diseases networks" (HDN) and "the orphan disease networks" (ODN). However, in these models, each single node represents one disease or an ambiguous group of diseases. In these cases, the notion of diseases as unique entities reduces the usefulness of network-based methods. We hypothesize that using the clinical features (pathophenotypes) to define pathophenotypic connections between disease-causing genes improves our understanding of the molecular events originated by genetic disturbances. For this, we have built a pathophenotypic similarity gene network (PSGN) and compared it with the unipartite projections (based on gene-to-gene edges) similar to those used in previous network models (HDN and ODN). Unlike these disease network models, the PSGN uses semantic similarities. This pathophenotypic similarity has been calculated by comparing pathophenotypic annotations of genes (human abnormalities of HPO terms) in the "Human Phenotype Ontology". The resulting network contains 1075 genes (nodes) and 26197 significant pathophenotypic similarities (edges). A global analysis of this network reveals: unnoticed pairs of genes showing significant pathophenotypic similarity, a biologically meaningful re-arrangement of the pathological relationships between genes, correlations of biochemical interactions with higher similarity scores and functional biases in metabolic and essential genes toward the pathophenotypic specificity and the pleiotropy, respectively. Additionally, pathophenotypic similarities and metabolic interactions of genes associated with maple syrup urine disease (MSUD) have been used to merge into a coherent pathological module. Our results indicate that pathophenotypes contribute to identify underlying co-dependencies among disease

  9. Global assessment of imprinted gene expression in the bovine conceptus by next generation sequencing.

    PubMed

    Chen, Zhiyuan; Hagen, Darren E; Wang, Juanbin; Elsik, Christine G; Ji, Tieming; Siqueira, Luiz G; Hansen, Peter J; Rivera, Rocío M

    2016-07-01

    Genomic imprinting is an epigenetic mechanism that leads to parental-allele-specific gene expression. Approximately 150 imprinted genes have been identified in humans and mice but less than 30 have been described as imprinted in cattle. For the purpose of de novo identification of imprinted genes in bovine, we determined global monoallelic gene expression in brain, skeletal muscle, liver, kidney and placenta of day ∼105 Bos taurus indicus × Bos taurus taurus F1 conceptuses using RNA sequencing. To accomplish this, we developed a bioinformatics pipeline to identify parent-specific single nucleotide polymorphism alleles after filtering adenosine to inosine (A-to-I) RNA editing sites. We identified 53 genes subject to monoallelic expression. Twenty three are genes known to be imprinted in the cow and an additional 7 have previously been characterized as imprinted in human and/or mouse that have not been reported as imprinted in cattle. Of the remaining 23 genes, we found that 10 are uncharacterized or unannotated transcripts located in known imprinted clusters, whereas the other 13 genes are distributed throughout the bovine genome and are not close to any known imprinted clusters. To exclude potential cis-eQTL effects on allele expression, we corroborated the parental specificity of monoallelic expression in day 86 Bos taurus taurus × Bos taurus taurus conceptuses and identified 8 novel bovine imprinted genes. Further, we identified 671 candidate A-to-I RNA editing sites and describe random X-inactivation in day 15 bovine extraembryonic membranes. Our results expand the imprinted gene list in bovine and demonstrate that monoallelic gene expression can be the result of cis-eQTL effects. PMID:27245094

  10. Differentiation in neutral genes and a candidate gene in the pied flycatcher: using biological archives to track global climate change

    PubMed Central

    Kuhn, Kerstin; Schwenk, Klaus; Both, Christiaan; Canal, David; Johansson, Ulf S; van der Mije, Steven; Töpfer, Till; Päckert, Martin

    2013-01-01

    Global climate change is one of the major driving forces for adaptive shifts in migration and breeding phenology and possibly impacts demographic changes if a species fails to adapt sufficiently. In Western Europe, pied flycatchers (Ficedula hypoleuca) have insufficiently adapted their breeding phenology to the ongoing advance of food peaks within their breeding area and consequently suffered local population declines. We address the question whether this population decline led to a loss of genetic variation, using two neutral marker sets (mitochondrial control region and microsatellites), and one potentially selectively non-neutral marker (avian Clock gene). We report temporal changes in genetic diversity in extant populations and biological archives over more than a century, using samples from sites differing in the extent of climate change. Comparing genetic differentiation over this period revealed that only the recent Dutch population, which underwent population declines, showed slightly lower genetic variation than the historic Dutch population. As that loss of variation was only moderate and not observed in all markers, current gene flow across Western and Central European populations might have compensated local loss of variation over the last decades. A comparison of genetic differentiation in neutral loci versus the Clock gene locus provided evidence for stabilizing selection. Furthermore, in all genetic markers, we found a greater genetic differentiation in space than in time. This pattern suggests that local adaptation or historic processes might have a stronger effect on the population structure and genetic variation in the pied flycatcher than recent global climate changes. PMID:24363905

  11. Global Pattern of Gene Expression of Xanthomonas axonopodis pv. glycines Within Soybean Leaves.

    PubMed

    Chatnaparat, Tiyakhon; Prathuangwong, Sutruedee; Lindow, Steven E

    2016-06-01

    To better understand the behavior of Xanthomonas axonopodis pv. glycines, the causal agent of bacterial pustule of soybean within its host, its global transcriptome within soybean leaves was compared with that in a minimal medium in vitro, using deep sequencing of mRNA. Of 5,062 genes predicted from a draft genome of X. axonopodis pv. glycines, 534 were up-regulated in the plant, while 289 were down-regulated. Genes encoding YapH, a cell-surface adhesin, as well as several others encoding cell-surface proteins, were down-regulated in soybean. Many genes encoding the type III secretion system and effector proteins, cell wall-degrading enzymes and phosphate transporter proteins were strongly expressed at early stages of infection. Several genes encoding RND multidrug efflux pumps were induced in planta and by isoflavonoids in vitro and were required for full virulence of X. axonopodis pv. glycines, as well as resistance to soybean phytoalexins. Genes encoding consumption of malonate, a compound abundant in soybean, were induced in planta and by malonate in vitro. Disruption of the malonate decarboxylase operon blocked growth in minimal media with malonate as the sole carbon source but did not significantly alter growth in soybean, apparently because genes for sucrose and fructose uptake were also induced in planta. Many genes involved in phosphate metabolism and uptake were induced in planta. While disruption of genes encoding high-affinity phosphate transport did not alter growth in media varying in phosphate concentration, the mutants were severely attenuated for growth in soybean. This global transcriptional profiling has provided insight into both the intercellular environment of this soybean pathogen and traits used by X. axonopodis pv. glycines to promote disease. PMID:27003800

  12. Linking genes to literature: text mining, information extraction, and retrieval applications for biology

    PubMed Central

    Krallinger, Martin; Valencia, Alfonso; Hirschman, Lynette

    2008-01-01

    Efficient access to information contained in online scientific literature collections is essential for life science research, playing a crucial role from the initial stage of experiment planning to the final interpretation and communication of the results. The biological literature also constitutes the main information source for manual literature curation used by expert-curated databases. Following the increasing popularity of web-based applications for analyzing biological data, new text-mining and information extraction strategies are being implemented. These systems exploit existing regularities in natural language to extract biologically relevant information from electronic texts automatically. The aim of the BioCreative challenge is to promote the development of such tools and to provide insight into their performance. This review presents a general introduction to the main characteristics and applications of currently available text-mining systems for life sciences in terms of the following: the type of biological information demands being addressed; the level of information granularity of both user queries and results; and the features and methods commonly exploited by these applications. The current trend in biomedical text mining points toward an increasing diversification in terms of application types and techniques, together with integration of domain-specific resources such as ontologies. Additional descriptions of some of the systems discussed here are available on the internet. PMID:18834499

  13. Global gene expression changes in BV2 microglial cell line during rabies virus infection.

    PubMed

    Zhao, Pingsen; Yang, Yujiao; Feng, Hao; Zhao, Lili; Qin, Junling; Zhang, Tao; Wang, Hualei; Yang, Songtao; Xia, Xianzhu

    2013-12-01

    Microglia play a crucial role during virus pathogenesis in the central nervous system (CNS). Infection by rabies virus (RABV) causes a fatal infection in the CNS of all warm-blooded animals. However, the microglial responses to RABV infection have been scarcely reported. To better understand microglia-RABV interactions at the transcriptional level, genome-wide gene expression profiling of the mouse microglial cell line BV2 was performed using microarray analysis. The global messenger RNA changes in murine microglial cell line BV2 after 12, 24 and 48 h of infection with rabies virus CVS-11 strain were investigated using DNA microarray and quantitative real-time PCR. Infection with CVS-11 at different time points induced different gene expression signatures in BV2 cells. The expression patterns of differentially expressed genes are shown by K-means clustering in four clusters in RABV- or mock-infected microglia at 12, 24 and 48 h post infection (hpi). Gene ontology and network analysis of the differentially expressed genes in response to RABV were performed by the Ingenuity Pathway Analysis system (IPA, Ingenuity® Systems, http://www.ingenuity.com). The results revealed that 28 genes were significantly up-regulated (P<0.01) and 1 gene was significantly down-regulated (P<0.01) in microglial cells at 12 hpi, 72 genes were significantly up-regulated (P<0.01) and 24 genes were significantly down-regulated (P<0.01) at 24 hpi, and 671 genes were significantly up-regulated (P<0.01) and 190 genes were significantly down-regulated (P<0.01) at 48 hpi. Genes in BV2 were significantly regulated (P<0.01) in response to RABV infection and they were found to be interferon stimulated genes (Isg15, Isg20, Oasl1, Oasl2, Ifit2, Irf7 and Ifi203), chemokine genes (Ccl5, Cxcl10 and Ccrl2) and the proinflammatory factor gene (Interleukin 6). The results indicated that the differentially expressed genes from microglial cells after RABV infection were mainly involved in innate immune responses

  14. De Novo Evolution of Complex, Global and Hierarchical Gene Regulatory Mechanisms

    PubMed Central

    Jenkins, Dafyd J.

    2010-01-01

    Gene regulatory networks exhibit complex, hierarchical features such as global regulation and network motifs. There is much debate about whether the evolutionary origins of such features are the results of adaptation, or the by-products of non-adaptive processes of DNA replication. The lack of availability of gene regulatory networks of ancestor species on evolutionary timescales makes this a particularly difficult problem to resolve. Digital organisms, however, can be used to provide a complete evolutionary record of lineages. We use a biologically realistic evolutionary model that includes gene expression, regulation, metabolism and biosynthesis, to investigate the evolution of complex function in gene regulatory networks. We discover that: (i) network architecture and complexity evolve in response to environmental complexity, (ii) global gene regulation is selected for in complex environments, (iii) complex, inter-connected, hierarchical structures evolve in stages, with energy regulation preceding stress responses, and stress responses preceding growth rate adaptations and (iv) robustness of evolved models to mutations depends on hierarchical level: energy regulation and stress responses tend not to be robust to mutations, whereas growth rate adaptations are more robust and non-lethal when mutated. These results highlight the adaptive and incremental evolution of complex biological networks, and the value and potential of studying realistic in silico evolutionary systems as a way of understanding living systems. Electronic supplementary material The online version of this article (doi:10.1007/s00239-010-9369-4) contains supplementary material, which is available to authorized users. PMID:20680619

  15. Global Developmental Gene Programing Involves a Nuclear Form of Fibroblast Growth Factor Receptor-1 (FGFR1).

    PubMed

    Terranova, Christopher; Narla, Sridhar T; Lee, Yu-Wei; Bard, Jonathan; Parikh, Abhirath; Stachowiak, Ewa K; Tzanakakis, Emmanuel S; Buck, Michael J; Birkaya, Barbara; Stachowiak, Michal K

    2015-01-01

    Genetic studies have placed the Fgfr1 gene at the top of major ontogenic pathways that enable gastrulation, tissue development and organogenesis. Using genome-wide sequencing and loss and gain of function experiments the present investigation reveals a mechanism that underlies global and direct gene regulation by the nuclear form of FGFR1, ensuring that pluripotent Embryonic Stem Cells differentiate into Neuronal Cells in response to Retinoic Acid. Nuclear FGFR1, both alone and with its partner nuclear receptors RXR and Nur77, targets thousands of active genes and controls the expression of pluripotency, homeobox, neuronal and mesodermal genes. Nuclear FGFR1 targets genes in developmental pathways represented by Wnt/β-catenin, CREB, BMP, the cell cycle and cancer-related TP53 pathway, neuroectodermal and mesodermal programing networks, axonal growth and synaptic plasticity pathways. Nuclear FGFR1 targets the consensus sequences of transcription factors known to engage CREB-binding protein, a common coregulator of transcription and established binding partner of nuclear FGFR1. This investigation reveals the role of nuclear FGFR1 as a global genomic programmer of cell, neural and muscle development. PMID:25923916

  16. Mining microbial metatranscriptomes for expression of antibiotic resistance genes under natural conditions

    NASA Astrophysics Data System (ADS)

    Versluis, Dennis; D'Andrea, Marco Maria; Ramiro Garcia, Javier; Leimena, Milkha M.; Hugenholtz, Floor; Zhang, Jing; Öztürk, Başak; Nylund, Lotta; Sipkema, Detmer; Schaik, Willem Van; de Vos, Willem M.; Kleerebezem, Michiel; Smidt, Hauke; Passel, Mark W. J. Van

    2015-07-01

    Antibiotic resistance genes are found in a broad range of ecological niches associated with complex microbiota. Here we investigated if resistance genes are not only present, but also transcribed under natural conditions. Furthermore, we examined the potential for antibiotic production by assessing the expression of associated secondary metabolite biosynthesis gene clusters. Metatranscriptome datasets from intestinal microbiota of four human adults, one human infant, 15 mice and six pigs, of which only the latter have received antibiotics prior to the study, as well as from sea bacterioplankton, a marine sponge, forest soil and sub-seafloor sediment, were investigated. We found that resistance genes are expressed in all studied ecological niches, albeit with niche-specific differences in relative expression levels and diversity of transcripts. For example, in mice and human infant microbiota predominantly tetracycline resistance genes were expressed while in human adult microbiota the spectrum of expressed genes was more diverse, and also included β-lactam, aminoglycoside and macrolide resistance genes. Resistance gene expression could result from the presence of natural antibiotics in the environment, although we could not link it to expression of corresponding secondary metabolites biosynthesis clusters. Alternatively, resistance gene expression could be constitutive, or these genes serve alternative roles besides antibiotic resistance.

  17. Data Mining in Networks of Differentially Expressed Genes during Sow Pregnancy

    PubMed Central

    Wang, Ligang; Zhang, Longchao; Li, Yong; Li, Wen; Luo, Weizhen; Cheng, Duxue; Yan, Hua; Ma, Xiaojun; Liu, Xin; Song, Xin; Liang, Jing; Zhao, Kebin; Wang, Lixian

    2012-01-01

    Small to moderate gains in pig fertility can mean large returns in overall efficiency, and developing methods to improve it is highly desirable. High fertility rates depend on completion of successful pregnancies. To understand the molecular signals associated with pregnancy in sows, expression profiling experiments were conducted to identify differentially expressed genes in ovary and myometrium at different pregnancy periods using the Affymetrix Porcine GeneChip™. A total of 974, 1800, 335 and 710 differentially expressed transcripts were identified in the myometrium during early pregnancy (EP) and late pregnancy (LP), and in the ovary during EP and LP, respectively. Self-Organizing Map (SOM) clusters indicated the differentially expressed genes belonged to 7 different functional groups. Based on BLASTX searches and Gene Ontology (GO) classifications, 129 unique genes closely related to pregnancy showed differential expression patterns. GO analysis also indicated that there were 21 different molecular function categories, 20 different biological process categories, and 8 different cellular component categories of genes differentially expressed during sow pregnancy. Gene regulatory network reconstruction provided us with an interaction model of known genes such as insulin-like growth factor 2 (IGF2) gene, estrogen receptor (ESR) gene, retinol-binding protein-4 (RBP4) gene, and several unknown candidate genes related to reproduction. Several pitch point genes were selected for association study with reproduction traits. For instance, DPPA5 g.363 T>C was found to be significantly associated with litter born weight at later parities in Beijing Black pigs (p < 0.05). Overall, this study contributes to elucidating the mechanism underlying pregnancy processes, which may provide valuable information for pig reproduction improvement. PMID:22532788

  18. Function Clustering Self-Organization Maps (FCSOMs) for mining differentially expressed genes in Drosophila and its correlation with the growth medium.

    PubMed

    Liu, L L; Liu, M J; Ma, M

    2015-01-01

    The central task of this study was to mine the gene-to-medium relationship. Adequate knowledge of this relationship could potentially improve the accuracy of differentially expressed gene mining. One of the approaches to differentially expressed gene mining uses conventional clustering algorithms to identify the gene-to-medium relationship. Compared to conventional clustering algorithms, self-organization maps (SOMs) identify the nonlinear aspects of the gene-to-medium relationships by mapping the input space into another higher dimensional feature space. However, SOMs are not suitable for huge datasets consisting of millions of samples. Therefore, a new computational model, the Function Clustering Self-Organization Maps (FCSOMs), was developed. FCSOMs take advantage of the theory of granular computing as well as advanced statistical learning methodologies, and are built specifically for each information granule (a function cluster of genes), which are intelligently partitioned by the clustering algorithm provided by the DAVID_6.7 software platform. However, only the gene functions, and not their expression values, are considered in the fuzzy clustering algorithm of DAVID. Compared to the clustering algorithm of DAVID, these experimental results show a marked improvement in the accuracy of classification with the application of FCSOMs. FCSOMs can handle huge datasets and their complex classification problems, as each FCSOM (modeled for each function cluster) can be easily parallelized. PMID:26436407

  19. Constraint and divergence of global gene expression in the mammalian embryo

    PubMed Central

    Spies, Noah; Smith, Cheryl L; Rodriguez, Jesse M; Baker, Julie C; Batzoglou, Serafim; Sidow, Arend

    2015-01-01

    The effects of genetic variation on gene regulation in the developing mammalian embryo remain largely unexplored. To globally quantify these effects, we crossed two divergent mouse strains and asked how genotype of the mother or of the embryo drives gene expression phenotype genomewide. Embryonic expression of 331 genes depends on the genotype of the mother. Embryonic genotype controls allele-specific expression of 1594 genes and a highly overlapping set of cis-expression quantitative trait loci (eQTL). A marked paucity of trans-eQTL suggests that the widespread expression differences do not propagate through the embryonic gene regulatory network. The cis-eQTL genes exhibit lower-than-average evolutionary conservation and are depleted for developmental regulators, consistent with purifying selection acting on expression phenotype of pattern formation genes. The widespread effect of maternal and embryonic genotype in conjunction with the purifying selection we uncovered suggests that embryogenesis is an important and understudied reservoir of phenotypic variation. DOI: http://dx.doi.org/10.7554/eLife.05538.001 PMID:25871848

  20. Global Expression Profiling of Low Temperature Induced Genes in the Chilling Tolerant Japonica Rice Jumli Marshi

    PubMed Central

    Chawade, Aakash; Lindlöf, Angelica; Olsson, Björn; Olsson, Olof

    2013-01-01

    Low temperature is a key factor that limits growth and productivity of many important agronomical crops worldwide. Rice (Oryza sativa L.) is negatively affected already at temperatures below +10°C and is therefore denoted as chilling sensitive. However, chilling tolerant rice cultivars exist and can be commercially cultivated at altitudes up to 3,050 meters with temperatures reaching as low as +4°C. In this work, the global transcriptional response to cold stress (+4°C) was studied in the Nepalese highland variety Jumli Marshi (spp. japonica) and 4,636 genes were identified as significantly differentially expressed within 24 hours of cold stress. Comparison with previously published microarray data from one chilling tolerant and two sensitive rice cultivars identified 182 genes differentially expressed (DE) upon cold stress in all four rice cultivars and 511 genes DE only in the chilling tolerant rice. Promoter analysis of the 182 genes suggests a complex cross-talk between ABRE and CBF regulons. Promoter analysis of the 511 genes identified over-represented ABRE motifs but not DRE motifs, suggesting a role for ABA signaling in cold tolerance. Moreover, 2,101 genes were DE in Jumli Marshi alone. By chromosomal localization analysis, 473 of these cold responsive genes were located within 13 different QTLs previously identified as cold associated. PMID:24349120

  1. The global gene expression profile of the secondary transition during pancreatic development.

    PubMed

    Willmann, Stefanie J; Mueller, Nikola S; Engert, Silvia; Sterr, Michael; Burtscher, Ingo; Raducanu, Aurelia; Irmler, Martin; Beckers, Johannes; Sass, Steffen; Theis, Fabian J; Lickert, Heiko

    2016-02-01

    Pancreas organogenesis is a highly dynamic process where neighboring tissue interactions lead to dynamic changes in gene regulatory networks that orchestrate endocrine, exocrine, and ductal lineage formation. To understand the spatio-temporal regulatory logic we have used the Forkhead transcription factor Foxa2-Venus fusion (FVF) knock-in reporter mouse to separate the FVF(+) pancreatic epithelium from the FVF(−) surrounding tissue (mesenchyme, neurons, blood, and blood vessels) to perform a genome-wide mRNA expression profiling at embryonic days (E) 12.5-15.5. Annotation of genes and molecular processes suggests that FVF marks endoderm-derived multipotent epithelial progenitors at several lineage restriction steps, when the bulk of endocrine, exocrine and ductal cells are formed during the secondary transition. In the pancreatic epithelial compartment, we identified most known endocrine and exocrine lineage determining factors and diabetes-associated genes, but also unknown genes with spatio-temporal regulated pancreatic expression. In the non-endoderm-derived compartment, we identified many well-described regulatory genes that are not yet functionally annotated in pancreas development, emphasizing that neighboring tissue interactions are still ill defined. Pancreatic expression of over 635 genes was analyzed with the mRNA in situ hybridization Genepaint public database. This validated the quality of the profiling data set and identified hundreds of genes with spatially restricted expression patterns in the pancreas. Some of these genes are also targeted by pancreatic transcription factors and show active chromatin marks in human islets of Langerhans. Thus, with the highest spatio-temporal resolution of a global gene expression profile during the secondary transition, our study helps shed light on neighboring tissue interactions, developmental timing and diabetes gene regulation. PMID:26643664

  2. Global irradiation effects, stem cell genes and rare transcripts in the planarian transcriptome.

    PubMed

    Galloni, Mireille

    2012-01-01

    Stem cells are the closest relatives of the totipotent primordial cell, which is able to spawn millions of daughter cells and hundreds of cell types in multicellular organisms. Stem cells are involved in tissue homeostasis and regeneration, and may play a major role in cancer development. Among animals, planarians host a model stem cell type, called the neoblast, which essentially confers immortality. Gaining insights into the global transcriptional landscape of these exceptional cells takes an unprecedented turn with the advent of Next Generation Sequencing methods. Two Digital Gene Expression transcriptomes of Schmidtea mediterranea planarians, with or without neoblasts lost through irradiation, were produced and analyzed. Twenty-one bp NlaIII tags were mapped to transcripts in the Schmidtea and Dugesia taxids. Differential representation of tags in normal versus irradiated animals reflects differential gene expression. Canonical and non-canonical tags were included in the analysis, and comparative studies with human orthologs were conducted. Transcripts fell into 3 categories: invariant (including housekeeping genes), absent in irradiated animals (potential neoblast-specific genes, IRDOWN) and induced in irradiated animals (potential cellular stress response, IRUP). Different mRNA variants and gene family members were recovered. In the IRDOWN class, almost all of the neoblast-specific genes previously described were found. In irradiated animals, a larger number of genes were induced rather than lost. A significant fraction of IRUP genes behaved as if transcript versions of different lengths were produced. Several novel potential neoblast-specific genes have been identified that varied in relative abundance, including highly conserved as well as novel proteins without predicted orthologs. Evidence for a large body of antisense transcripts, for example regulated antisense for the Smed-piwil1 gene, and evidence for RNA shortening in irradiated animals is presented.

  3. Potential impact of human mitochondrial replacement on global policy regarding germline gene modification.

    PubMed

    Ishii, Tetsuya

    2014-08-01

    Previous discussions regarding human germline gene modification led to a global consensus that no germline should undergo genetic modification. However, the UK Human Fertilisation and Embryology Authority, having conducted at the UK Government's request a scientific review and a wide public consultation, provided advice to the Government on the pros and cons of Parliament's lifting a ban on altering mitochondrial DNA content of human oocytes and embryos, so as to permit the prevention of maternal transmission of mitochondrial diseases. In this commentary, relevant ethical and biomedical issues are examined and requirements for proceeding with this novel procedure are suggested. Additionally, potentially significant impacts of the UK legalization on global policy concerning germline gene modification are discussed in the context of recent advances in genome-editing technology. It is concluded that international harmonization is needed, as well as further ethical and practical consideration, prior to the legalization of human mitochondrial replacement. PMID:24832374

  4. The BET protein FSH functionally interacts with ASH1 to orchestrate global gene activity in Drosophila

    PubMed Central

    2013-01-01

    Background The question of how cells re-establish gene expression states after cell division is still poorly understood. Genetic and molecular analyses have indicated that Trithorax group (TrxG) proteins are critical for the long-term maintenance of active gene expression states in many organisms. A generally accepted model suggests that TrxG proteins contribute to maintenance of transcription by protecting genes from inappropriate Polycomb group (PcG)-mediated silencing, instead of directly promoting transcription. Results and discussion Here we report a physical and functional interaction in Drosophila between two members of the TrxG, the histone methyltransferase ASH1 and the bromodomain and extraterminal family protein FSH. We investigated this interface at the genome level, uncovering a widespread co-localization of both proteins at promoters and PcG-bound intergenic elements. Our integrative analysis of chromatin maps and gene expression profiles revealed that the observed ASH1-FSH binding pattern at promoters is a hallmark of active genes. Inhibition of FSH-binding to chromatin resulted in global down-regulation of transcription. In addition, we found that genes displaying marks of robust PcG-mediated repression also have ASH1 and FSH bound to their promoters. Conclusions Our data strongly favor a global coactivator function of ASH1 and FSH during transcription, as opposed to the notion that TrxG proteins impede inappropriate PcG-mediated silencing, but are dispensable elsewhere. Instead, our results suggest that PcG repression needs to overcome the transcription-promoting function of ASH1 and FSH in order to silence genes. PMID:23442797

  5. Phosphorylation events in the multiple gene regulator of group A Streptococcus significantly influence global gene expression and virulence.

    PubMed

    Sanson, Misu; Makthal, Nishanth; Gavagan, Maire; Cantu, Concepcion; Olsen, Randall J; Musser, James M; Kumaraswami, Muthiah

    2015-06-01

    Whole-genome sequencing analysis of ∼800 strains of group A Streptococcus (GAS) found that the gene encoding the multiple virulence gene regulator of GAS (mga) is highly polymorphic in serotype M59 strains but not in strains of other serotypes. To help understand the molecular mechanism of gene regulation by Mga and its contribution to GAS pathogenesis in serotype M59 GAS, we constructed an isogenic mga mutant strain. Transcriptome studies indicated a significant regulatory influence of Mga and altered metabolic capabilities conferred by Mga-regulated genes. We assessed the phosphorylation status of Mga in GAS cell lysates with Phos-tag gels. The results revealed that Mga is phosphorylated at histidines in vivo. Using phosphomimetic and nonphosphomimetic substitutions at conserved phosphoenolpyruvate:carbohydrate phosphotransferase regulation domain (PRD) histidines of Mga, we demonstrated that phosphorylation-mimicking aspartate replacements at H207 and H273 of PRD-1 and at H327 of PRD-2 are inhibitory to Mga-dependent gene expression. Conversely, non-phosphorylation-mimicking alanine substitutions at H273 and H327 relieved inhibition, and the mutant strains exhibited a wild-type phenotype. The opposing regulatory profiles observed for phosphorylation- and non-phosphorylation-mimicking substitutions at H273 extended to global gene regulation by Mga. Consistent with these observations, the H273D mutant strain attenuated GAS virulence, whereas the H273A strain exhibited a wild-type virulence phenotype in a mouse model of necrotizing fasciitis. Together, our results demonstrate phosphoregulation of Mga and its direct link to virulence in M59 GAS strains. These data also lay a foundation toward understanding how naturally occurring gain-of-function variations in mga, such as H201R, may confer an advantage to the pathogen and contribute to M59 GAS pathogenesis. PMID:25824840

  6. Phosphorylation Events in the Multiple Gene Regulator of Group A Streptococcus Significantly Influence Global Gene Expression and Virulence

    PubMed Central

    Sanson, Misu; Makthal, Nishanth; Gavagan, Maire; Cantu, Concepcion; Olsen, Randall J.; Musser, James M.

    2015-01-01

    Whole-genome sequencing analysis of ∼800 strains of group A Streptococcus (GAS) found that the gene encoding the multiple virulence gene regulator of GAS (mga) is highly polymorphic in serotype M59 strains but not in strains of other serotypes. To help understand the molecular mechanism of gene regulation by Mga and its contribution to GAS pathogenesis in serotype M59 GAS, we constructed an isogenic mga mutant strain. Transcriptome studies indicated a significant regulatory influence of Mga and altered metabolic capabilities conferred by Mga-regulated genes. We assessed the phosphorylation status of Mga in GAS cell lysates with Phos-tag gels. The results revealed that Mga is phosphorylated at histidines in vivo. Using phosphomimetic and nonphosphomimetic substitutions at conserved phosphoenolpyruvate:carbohydrate phosphotransferase regulation domain (PRD) histidines of Mga, we demonstrated that phosphorylation-mimicking aspartate replacements at H207 and H273 of PRD-1 and at H327 of PRD-2 are inhibitory to Mga-dependent gene expression. Conversely, non-phosphorylation-mimicking alanine substitutions at H273 and H327 relieved inhibition, and the mutant strains exhibited a wild-type phenotype. The opposing regulatory profiles observed for phosphorylation- and non-phosphorylation-mimicking substitutions at H273 extended to global gene regulation by Mga. Consistent with these observations, the H273D mutant strain attenuated GAS virulence, whereas the H273A strain exhibited a wild-type virulence phenotype in a mouse model of necrotizing fasciitis. Together, our results demonstrate phosphoregulation of Mga and its direct link to virulence in M59 GAS strains. These data also lay a foundation toward understanding how naturally occurring gain-of-function variations in mga, such as H201R, may confer an advantage to the pathogen and contribute to M59 GAS pathogenesis. PMID:25824840

  7. Temporal representation for gene networks: towards a qualitative temporal data mining.

    PubMed

    Turenne, Nicolas; Schwer, Sylviane R

    2008-01-01

    Processing literature (i.e., text corpora) to capture gene regulation events is not easy and can be driven by the final data representation. We propose to build, manually, an example of temporal representation (whole gene networks for coat formation in Bacillus subtilis). Our temporal representation is based on a generalised formal language theory (S-languages). We propose an algorithm to link bags of relations with representation, by ordering interactions. In this paper, starting from the network made manually from text data, we show that S-languages are quite relevant to encapsulate gene properties, and infer knowledge across timestamped gene relations found in texts. PMID:18399327

  8. Functional Metagenome Mining of Soil for a Novel Gentamicin Resistance Gene.

    PubMed

    Im, Hyunjoo; Kim, Kyung Mo; Lee, Sang-Heon; Ryu, Choong-Min

    2016-03-01

    Extensive use of antibiotics over recent decades has led to bacterial resistance against antibiotics, including gentamicin, one of the most effective aminoglycosides. The emergence of resistance is problematic for hospitals, since gentamicin is an important broad-spectrum antibiotic for the control of bacterial pathogens in the clinic. Previous studies to identify gentamicin resistance genes from environmental samples have been conducted using culture-dependent screening methods. To overcome these limitations, we employed a metagenome-based culture-independent protocol to identify gentamicin resistance genes. Through functional screening of metagenome libraries derived from soil samples, a fosmid clone was selected as it conferred strong gentamicin resistance. To identify a specific functioning gene conferring gentamicin resistance from a selected fosmid clone (35-40 kb), a shot-gun library was constructed and four shot-gun clones (2-3 kb) were selected. Further characterization of these clones revealed that they contained sequences similar to that of the RNA ligase, T4 rnlA that is known as a toxin gene. The overexpression of the rnlA-like gene in Escherichia coli increased gentamicin resistance, indicating that this toxin gene modulates this trait. The results of our metagenome library analysis suggest that the rnlA-like gene may represent a new class of gentamicin resistance genes in pathogenic bacteria. In addition, we demonstrate that the soil metagenome can provide an important resource for the identification of antibiotic resistance genes, which are valuable molecular targets in efforts to overcome antibiotic resistance. PMID:26699755

  9. Mining for Nonribosomal Peptide Synthetase and Polyketide Synthase Genes Revealed a High Level of Diversity in the Sphagnum Bog Metagenome

    PubMed Central

    Müller, Christina A.; Oberauner-Wappis, Lisa; Peyman, Armin; Amos, Gregory C. A.; Wellington, Elizabeth M. H.

    2015-01-01

    Sphagnum bog ecosystems are among the oldest vegetation forms harboring a specific microbial community and are known to produce an exceptionally wide variety of bioactive substances. Although the Sphagnum metagenome shows a rich secondary metabolism, the genes have not yet been explored. To analyze nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), the diversity of NRPS and PKS genes in Sphagnum-associated metagenomes was investigated by in silico data mining and sequence-based screening (PCR amplification of 9,500 fosmid clones). The in silico Illumina-based metagenomic approach resulted in the identification of 279 NRPSs and 346 PKSs, as well as 40 PKS-NRPS hybrid gene sequences. The occurrence of NRPS sequences was strongly dominated by the members of the Proteobacteria phylum, especially by species of the Burkholderia genus, while PKS sequences were mainly affiliated with Actinobacteria. Thirteen novel NRPS-related sequences were identified by PCR amplification screening, displaying amino acid identities of 48% to 91% to annotated sequences of members of the phyla Proteobacteria, Actinobacteria, and Cyanobacteria. Some of the identified metagenomic clones showed the closest similarity to peptide synthases from Burkholderia or Lysobacter, which are emerging bacterial sources of as-yet-undescribed bioactive metabolites. This report highlights the role of the extreme natural ecosystems as a promising source for detection of secondary compounds and enzymes, serving as a source for biotechnological applications. PMID:26002894

  10. Mining for Nonribosomal Peptide Synthetase and Polyketide Synthase Genes Revealed a High Level of Diversity in the Sphagnum Bog Metagenome.

    PubMed

    Müller, Christina A; Oberauner-Wappis, Lisa; Peyman, Armin; Amos, Gregory C A; Wellington, Elizabeth M H; Berg, Gabriele

    2015-08-01

    Sphagnum bog ecosystems are among the oldest vegetation forms harboring a specific microbial community and are known to produce an exceptionally wide variety of bioactive substances. Although the Sphagnum metagenome shows a rich secondary metabolism, the genes have not yet been explored. To analyze nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), the diversity of NRPS and PKS genes in Sphagnum-associated metagenomes was investigated by in silico data mining and sequence-based screening (PCR amplification of 9,500 fosmid clones). The in silico Illumina-based metagenomic approach resulted in the identification of 279 NRPSs and 346 PKSs, as well as 40 PKS-NRPS hybrid gene sequences. The occurrence of NRPS sequences was strongly dominated by the members of the Proteobacteria phylum, especially by species of the Burkholderia genus, while PKS sequences were mainly affiliated with Actinobacteria. Thirteen novel NRPS-related sequences were identified by PCR amplification screening, displaying amino acid identities of 48% to 91% to annotated sequences of members of the phyla Proteobacteria, Actinobacteria, and Cyanobacteria. Some of the identified metagenomic clones showed the closest similarity to peptide synthases from Burkholderia or Lysobacter, which are emerging bacterial sources of as-yet-undescribed bioactive metabolites. This report highlights the role of the extreme natural ecosystems as a promising source for detection of secondary compounds and enzymes, serving as a source for biotechnological applications. PMID:26002894

  11. Mining for Candidate Genes in an Introgression Line by Using RNA Sequencing: The Anthocyanin Overaccumulation Phenotype in Brassica.

    PubMed

    Xie, Lulu; Li, Fei; Zhang, Shifan; Zhang, Hui; Qian, Wei; Li, Peirong; Zhang, Shujiang; Sun, Rifei

    2016-01-01

    Introgression breeding is a widely used method for the genetic improvement of crop plants; however, the mechanism underlying candidate gene flow patterns during hybridization is poorly understood. In this study, we used a powerful pipeline to investigate a Chinese cabbage (Brassica rapa L. ssp. pekinensis) introgression line with the anthocyanin overaccumulation phenotype. Our purpose was to analyze the gene flow patterns during hybridization and elucidate the genetic factors responsible for the accumulation of this important pigment compound. We performed RNA-seq analysis by using two pipelines, one with and one without a reference sequence, to obtain transcriptome data. We identified 930 significantly differentially expressed genes (DEGs) between the purple-leaf introgression line and B. rapa green cultivar, namely, 389 up-regulated and 541 down-regulated DEGs that mapped to the B. rapa reference genome. Since only one anthocyanin pathway regulatory gene was identified, i.e., Bra037887 (bHLH), we mined unmapped reads, revealing 2031 de novo assembled unigenes, including c3563g1i2. Phylogenetic analysis suggested that c3563g1i2, which was transferred from the Brassica B genome of the donor parental line Brassica juncea, may represent an R2R3-MYB transcription factor that participates in the ternary transcriptional activation complex responsible for the anthocyanin overaccumulation phenotype of the B. rapa introgression line. We also identified genes involved in cold and light reaction pathways that were highly upregulated in the introgression line, as confirmed using quantitative real-time PCR analysis. The results of this study shed light on the mechanisms underlying the purple leaf trait in Brassica plants and may facilitate the use of introgressive hybridization for many traits of interest. PMID:27597857

  12. Mining for Candidate Genes in an Introgression Line by Using RNA Sequencing: The Anthocyanin Overaccumulation Phenotype in Brassica

    PubMed Central

    Xie, Lulu; Li, Fei; Zhang, Shifan; Zhang, Hui; Qian, Wei; Li, Peirong; Zhang, Shujiang; Sun, Rifei

    2016-01-01

    Introgression breeding is a widely used method for the genetic improvement of crop plants; however, the mechanism underlying candidate gene flow patterns during hybridization is poorly understood. In this study, we used a powerful pipeline to investigate a Chinese cabbage (Brassica rapa L. ssp. pekinensis) introgression line with the anthocyanin overaccumulation phenotype. Our purpose was to analyze the gene flow patterns during hybridization and elucidate the genetic factors responsible for the accumulation of this important pigment compound. We performed RNA-seq analysis by using two pipelines, one with and one without a reference sequence, to obtain transcriptome data. We identified 930 significantly differentially expressed genes (DEGs) between the purple-leaf introgression line and B. rapa green cultivar, namely, 389 up-regulated and 541 down-regulated DEGs that mapped to the B. rapa reference genome. Since only one anthocyanin pathway regulatory gene was identified, i.e., Bra037887 (bHLH), we mined unmapped reads, revealing 2031 de novo assembled unigenes, including c3563g1i2. Phylogenetic analysis suggested that c3563g1i2, which was transferred from the Brassica B genome of the donor parental line Brassica juncea, may represent an R2R3-MYB transcription factor that participates in the ternary transcriptional activation complex responsible for the anthocyanin overaccumulation phenotype of the B. rapa introgression line. We also identified genes involved in cold and light reaction pathways that were highly upregulated in the introgression line, as confirmed using quantitative real-time PCR analysis. The results of this study shed light on the mechanisms underlying the purple leaf trait in Brassica plants and may facilitate the use of introgressive hybridization for many traits of interest. PMID:27597857

  13. Global mining risk footprint of critical metals necessary for low-carbon technologies: the case of neodymium, cobalt, and platinum in Japan.

    PubMed

    Nansai, Keisuke; Nakajima, Kenichi; Kagawa, Shigemi; Kondo, Yasushi; Shigetomi, Yosuke; Suh, Sangwon

    2015-02-17

    Meeting the 2-degree global warming target requires wide adoption of low-carbon energy technologies. Many such technologies rely on the use of precious metals, however, increasing the dependence of national economies on these resources. Among such metals, those with supply security concerns are referred to as critical metals. Using the Policy Potential Index developed by the Fraser Institute, this study developed a new footprint indicator, the mining risk footprint (MRF), to quantify the mining risk directly and indirectly affecting a national economy through its consumption of critical metals. We formulated the MRF as a product of the material footprint (MF) of the consuming country and the mining risks of the countries where the materials are mined. A case study was conducted for the 2005 Japanese economy to determine the MF and MRF for three critical metals essential for emerging energy technologies: neodymium, cobalt and platinum. The results indicate that in 2005 the MFs generated by Japanese domestic final demand, that is, the consumption-based metal output of Japan, were 1.0 × 10^3 t for neodymium, 9.4 × 10^3 t for cobalt, and 2.1 × 10 t for platinum. Export demand contributes most to the MF, accounting for 3.0 × 10^3 t, 1.3 × 10^5 t, and 3.1 × 10 t, respectively. The MRFs of Japanese total final demand (domestic plus export) were calculated to be 1.7 × 10 points for neodymium, 4.5 × 10^-2 points for cobalt, and 5.6 points for platinum, implying that the Japanese economy is incurring a high mining risk through its use of neodymium. This country's MRFs are all dominated by export demand. The paper concludes by discussing the policy implications and future research directions for measuring the MFs and MRFs of critical metals. For countries poorly endowed with mineral resources, adopting low-carbon energy technologies may imply a shifting of risk from carbon resources to other natural resources, in particular critical metals, and a trade
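
    The abstract describes the MRF as a product of the consuming country's material footprint and the mining risks of the countries where the material is mined. A minimal sketch of that idea follows. It assumes the footprint is broken down by mining country and the products are summed over countries; the abstract does not give the exact formula or how risk scores are derived from the Policy Potential Index, so the function name, the aggregation, the country labels and all numbers are hypothetical.

```python
def mining_risk_footprint(material_footprint_t, mining_risk):
    """Risk-weighted material footprint for one metal.

    material_footprint_t: dict mining country -> tonnes attributed to the
        consuming economy (hypothetical breakdown of the MF).
    mining_risk: dict mining country -> risk score (hypothetical scale).
    Summing over countries is an assumption of this sketch.
    """
    return sum(
        tonnes * mining_risk[country]
        for country, tonnes in material_footprint_t.items()
    )


# Made-up inputs, not values from the paper.
mf = {"country_A": 100.0, "country_B": 50.0}
risk = {"country_A": 0.25, "country_B": 0.5}
mrf = mining_risk_footprint(mf, risk)  # 100*0.25 + 50*0.5 = 50.0
```

    Under this reading, a metal sourced mostly from high-risk countries can carry a large MRF even when its tonnage is small.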

  14. Manteia, a predictive data mining system for vertebrate genes and its applications to human genetic diseases

    PubMed Central

    Tassy, Olivier; Pourquié, Olivier

    2014-01-01

    The function of genes is often evolutionarily conserved, and comparing the annotation of ortholog genes in different model organisms has proved to be a powerful predictive tool to identify the function of human genes. Here, we describe Manteia, a resource available online at http://manteia.igbmc.fr. Manteia allows the comparison of embryological, expression, molecular and etiological data from human, mouse, chicken and zebrafish simultaneously to identify new functional and structural correlations and gene-disease associations. Manteia is particularly useful for the analysis of gene lists produced by high-throughput techniques such as microarrays or proteomics. Data can be easily analyzed statistically to characterize the function of groups of genes and to correlate the different aspects of their annotation. Sophisticated querying tools provide unlimited ways to merge the information contained in Manteia along with the possibility of introducing custom user-designed biological questions into the system. This allows for example to connect all the animal experimental results and annotations to the human genome, and take advantage of data not available for human to look for candidate genes responsible for genetic disorders. Here, we demonstrate the predictive and analytical power of the system to predict candidate genes responsible for human genetic diseases. PMID:24038354

  15. Manteia, a predictive data mining system for vertebrate genes and its applications to human genetic diseases.

    PubMed

    Tassy, Olivier; Pourquié, Olivier

    2014-01-01

    The function of genes is often evolutionarily conserved, and comparing the annotation of ortholog genes in different model organisms has proved to be a powerful predictive tool to identify the function of human genes. Here, we describe Manteia, a resource available online at http://manteia.igbmc.fr. Manteia allows the comparison of embryological, expression, molecular and etiological data from human, mouse, chicken and zebrafish simultaneously to identify new functional and structural correlations and gene-disease associations. Manteia is particularly useful for the analysis of gene lists produced by high-throughput techniques such as microarrays or proteomics. Data can be easily analyzed statistically to characterize the function of groups of genes and to correlate the different aspects of their annotation. Sophisticated querying tools provide unlimited ways to merge the information contained in Manteia along with the possibility of introducing custom user-designed biological questions into the system. This allows for example to connect all the animal experimental results and annotations to the human genome, and take advantage of data not available for human to look for candidate genes responsible for genetic disorders. Here, we demonstrate the predictive and analytical power of the system to predict candidate genes responsible for human genetic diseases. PMID:24038354

  16. CovR-controlled global regulation of gene expression in Streptococcus mutans.

    PubMed

    Dmitriev, Alexander; Mohapatra, Saswat S; Chong, Patrick; Neely, Melody; Biswas, Saswati; Biswas, Indranil

    2011-01-01

    CovR/S is a two-component signal transduction system (TCS) that controls the expression of various virulence related genes in many streptococci. However, in the dental pathogen Streptococcus mutans, the response regulator CovR appears to be an orphan since the cognate sensor kinase CovS is absent. In this study, we explored the global transcriptional regulation by CovR in S. mutans. Comparison of the transcriptome profiles of the wild-type strain UA159 with its isogenic covR deleted strain IBS10 indicated that at least 128 genes (∼6.5% of the genome) were differentially regulated. Among these genes, 69 were down regulated, while 59 were up regulated in the IBS10 strain. The S. mutans CovR regulon included competence genes, virulence related genes, and genes encoded within two genomic islands (GI). Genes encoded by the GI TnSmu2 were found to be dramatically reduced in IBS10, while genes encoded by the GI TnSmu1 were up regulated in the mutant. The microarray data were further confirmed by real-time RT-PCR analyses. Furthermore, direct regulation of some of the differentially expressed genes was demonstrated by electrophoretic mobility shift assays using purified CovR protein. A proteomic study was also carried out that showed a general perturbation of protein expression in the mutant strain. Our results indicate that CovR truly plays a significant role in the regulation of several virulence related traits in this pathogenic streptococcus. PMID:21655290

  17. Effect of starvation on global gene expression and proteolysis in rainbow trout (Oncorhynchus mykiss)

    PubMed Central

    Salem, Mohamed; Silverstein, Jeff; Rexroad, Caird E; Yao, Jianbo

    2007-01-01

    Background Fast, efficiently growing animals have increased protein synthesis and/or reduced protein degradation relative to slow, inefficiently growing animals. Consequently, minimizing the energetic cost of protein turnover is a strategic goal for enhancing animal growth. Characterization of gene expression profiles associated with protein turnover would allow us to identify genes that could potentially be used as molecular biomarkers to select for germplasm with improved protein accretion. Results We evaluated changes in hepatic global gene expression in response to 3-week starvation in rainbow trout (Oncorhynchus mykiss). Microarray analysis revealed a coordinated, down-regulated expression of protein biosynthesis genes in starved fish. In addition, the expression of genes involved in lipid metabolism/transport, aerobic respiration, blood functions and immune response were decreased in response to starvation. However, the microarray approach did not show a significant increase of gene expression in protein catabolic pathways. Further studies, using real-time PCR and enzyme activity assays, were performed to investigate the expression of genes involved in the major proteolytic pathways including calpains, the multi-catalytic proteasome and cathepsins. Starvation reduced mRNA expression of the calpain inhibitor, calpastatin long isoform (CAST-L), with a subsequent increase in the calpain catalytic activity. In addition, starvation caused a slight but significant increase in 20S proteasome activity without affecting mRNA levels of the proteasome genes. Neither the mRNA levels nor the activities of cathepsin D and L were affected by starvation. Conclusion These results suggest a significant role of calpain and 20S proteasome pathways in protein mobilization as a source of energy during fasting and a potential association of the CAST-L gene with fish protein accretion. PMID:17880706

  18. CovR-Controlled Global Regulation of Gene Expression in Streptococcus mutans

    PubMed Central

    Dmitriev, Alexander; Mohapatra, Saswat S.; Chong, Patrick; Neely, Melody; Biswas, Saswati; Biswas, Indranil

    2011-01-01

    CovR/S is a two-component signal transduction system (TCS) that controls the expression of various virulence related genes in many streptococci. However, in the dental pathogen Streptococcus mutans, the response regulator CovR appears to be an orphan since the cognate sensor kinase CovS is absent. In this study, we explored the global transcriptional regulation by CovR in S. mutans. Comparison of the transcriptome profiles of the wild-type strain UA159 with its isogenic covR deleted strain IBS10 indicated that at least 128 genes (∼6.5% of the genome) were differentially regulated. Among these genes, 69 were down regulated, while 59 were up regulated in the IBS10 strain. The S. mutans CovR regulon included competence genes, virulence related genes, and genes encoded within two genomic islands (GI). Genes encoded by the GI TnSmu2 were found to be dramatically reduced in IBS10, while genes encoded by the GI TnSmu1 were up regulated in the mutant. The microarray data were further confirmed by real-time RT-PCR analyses. Furthermore, direct regulation of some of the differentially expressed genes was demonstrated by electrophoretic mobility shift assays using purified CovR protein. A proteomic study was also carried out that showed a general perturbation of protein expression in the mutant strain. Our results indicate that CovR truly plays a significant role in the regulation of several virulence related traits in this pathogenic streptococcus. PMID:21655290

  19. Alcohol consumption induces global gene expression changes in VTA dopaminergic neurons.

    PubMed

    Marballi, K; Genabai, N K; Blednov, Y A; Harris, R A; Ponomarev, I

    2016-03-01

    Alcoholism is associated with dysregulation in the neural circuitry that mediates motivated and goal-directed behaviors. The dopaminergic (DA) connection between the ventral tegmental area (VTA) and the nucleus accumbens is viewed as a critical component of the neurocircuitry mediating alcohol's rewarding and behavioral effects. We sought to determine the effects of binge alcohol drinking on global gene expression in VTA DA neurons. Alcohol-preferring C57BL/6J × FVB/NJ F1 hybrid female mice were exposed to a modified drinking in the dark (DID) procedure for 3 weeks, while control animals had access to water only. Global gene expression of laser-captured tyrosine hydroxylase (TH)-positive VTA DA neurons was measured using microarrays. A total of 644 transcripts were differentially expressed between the drinking and nondrinking mice, and 930 transcripts correlated with alcohol intake during the last 2 days of drinking in the alcohol group. Bioinformatics analysis of alcohol-responsive genes identified molecular pathways and networks perturbed in DA neurons by alcohol consumption, which included neuroimmune and epigenetic functions, alcohol metabolism and brain disorders. The majority of genes with high and specific expression in DA neurons were downregulated by or negatively correlated with alcohol consumption, suggesting a decreased activity of DA neurons in high drinking animals. These changes in the DA transcriptome provide a foundation for alcohol-induced neuroadaptations that may play a crucial role in the transition to addiction. PMID:26482798

  20. Global gene expression analysis of chicken caecal response to Campylobacter jejuni.

    PubMed

    Shaughnessy, Ronan G; Meade, Kieran G; McGivney, Beatrice A; Allan, Brenda; O'Farrelly, Cliona

    2011-07-15

    Campylobacter jejuni colonises the caecum of more than 90% of commercial chickens. Even though colonisation is asymptomatic, we hypothesised that it is mediated by activation of several biological pathways. We therefore used chicken-specific 20K oligonucleotide microarrays to examine global gene expression in C. jejuni-challenged birds. Microarray results demonstrate small but significant fold-changes in expression of 270 genes 20 h post-challenge, corresponding to a wide range of biological processes including cell growth, nutrient metabolism and immunological activity. Expression of NOX1 (2.3-fold) and VCAM1 (1.5-fold) was significantly increased in colonised birds (P<0.05), indicating oxidative burst and endothelial cell activation, respectively. Microarray results, supplemented by qRT-PCR analyses, demonstrated increased TOPK (1.9-fold), IL17 (3.6-fold), IL21 (2.1-fold), IL7R (4-fold) and CTLA4 (2.5-fold) gene expression (P<0.05), which was suggestive of T cell mediated activity. Combined, these results suggest that C. jejuni has nominal effects on global caecal gene expression in the chicken, but the significant changes detected are suggestive of a protective intestinal T cell response. PMID:21605915

  1. Global brain delivery of neprilysin gene by intravascular administration of AAV vector in mice

    PubMed Central

    Iwata, Nobuhisa; Sekiguchi, Misaki; Hattori, Yoshino; Takahashi, Akane; Asai, Masashi; Ji, Bin; Higuchi, Makoto; Staufenbiel, Matthias; Muramatsu, Shin-ichi; Saido, Takaomi C.

    2013-01-01

    Accumulation of amyloid-β peptide (Aβ) in the brain is closely associated with cognitive decline in Alzheimer's disease (AD). Stereotaxic infusion of neprilysin-encoding viral vectors into the hippocampus has been shown to decrease Aβ in AD-model mice, but more efficient and global delivery is necessary to treat the broadly distributed burden in AD. Here we developed an adeno-associated virus (AAV) vector capable of providing neuronal gene expression throughout the brain after peripheral administration. A single intracardiac administration of the vector carrying the neprilysin gene in AD-model mice elevated neprilysin activity broadly in the brain, and reduced Aβ oligomers, with concurrent alleviation of abnormal learning and memory function and improvement of amyloid burden. The exogenous neprilysin was localized mainly in endosomes, thereby effectively excluding Aβ oligomers from the brain. AAV vector-mediated gene transfer may provide a therapeutic strategy for neurodegenerative diseases, where global transduction of a therapeutic gene into the brain is necessary. PMID:23503602

  2. Global profiling of Shewanella oneidensis MR-1: Expression of hypothetical genes and improved functional annotations

    SciTech Connect

    Picone, Alex F.; Galperin, Michael Y.; Romine, Margaret; Higdon, Roger; Makarova, Kira S.; Kolker, Natali; Anderson, Gordon A; Qiu, Xiaoyun; Babnigg, Gyorgy; Beliaev, Alexander S; Edlefsen, Paul; Elias, Dwayne A.; Gorby, Dr. Yuri A.; Holzman, Ted; Klappenbach, Joel; Konstantinidis, Konstantinos T; Land, Miriam L; Lipton, Mary S.; McCue, Lee Ann; Monroe, Matthew; Pasa-Tolic, Ljiljana; Pinchuk, Grigoriy; Purvine, Samuel; Serres, Margrethe H.; Tsapin, Sasha; Zakrajsek, Brian A.; Zhu, Wenguang; Zhou, Jizhong; Larimer, Frank W; Lawrence, Charles E.; Riley, Monica; Collart, Frank; Yates III, John R.; Smith, Richard D.; Nealson, Kenneth H.; Fredrickson, James K; Tiedje, James M.

    2005-01-01

    The gamma-proteobacterium Shewanella oneidensis strain MR-1 is a metabolically versatile organism that can reduce a wide range of organic compounds, metal ions, and radionuclides. Similar to most other sequenced organisms, approximately 40% of the predicted ORFs in the S. oneidensis genome were annotated as uncharacterized "hypothetical" genes. We implemented an integrative approach by using experimental and computational analyses to provide more detailed insight into gene function. Global expression profiles were determined for cells after UV irradiation and under aerobic and suboxic growth conditions. Transcriptomic and proteomic analyses confidently identified 538 hypothetical genes as expressed in S. oneidensis cells both as mRNAs and proteins (33% of all predicted hypothetical proteins). Publicly available analysis tools and databases and the expression data were applied to improve the annotation of these genes. The annotation results were scored by using a seven-category schema that ranked both confidence and precision of the functional assignment. We were able to identify homologs for nearly all of these hypothetical proteins (97%), but could confidently assign exact biochemical functions for only 16 proteins (category 1; 3%). Altogether, computational and experimental evidence provided functional assignments or insights for 240 more genes (categories 2-5; 45%). These functional annotations advance our understanding of genes involved in vital cellular processes, including energy conversion, ion transport, secondary metabolism, and signal transduction. We propose that this integrative approach offers a valuable means to undertake the enormous challenge of characterizing the rapidly growing number of hypothetical proteins with each newly sequenced genome.

  3. Novel phenotypes of Escherichia coli tat mutants revealed by global gene expression and phenotypic analysis.

    PubMed

    Ize, Bérengère; Porcelli, Ida; Lucchini, Sacha; Hinton, Jay C; Berks, Ben C; Palmer, Tracy

    2004-11-12

    The Tat protein export system serves to export folded proteins harboring an N-terminal twin arginine signal peptide across the cytoplasmic membrane. In this study, we have used gene expression profiling of Escherichia coli supported by phenotypic analysis to investigate how cells respond to a defect in the Tat pathway. Previous work has demonstrated that strains mutated in genes encoding essential Tat pathway components are defective in the integrity of their cell envelope because of the mislocalization of two amidases involved in cell wall metabolism (Ize, B., Stanley, N. R., Buchanan, G., and Palmer, T. (2003) Mol. Microbiol. 48, 1183-1193). To distinguish between genes that are differentially expressed specifically because of the cell envelope defect and those that result from other effects of the tatC deletion, we also analyzed two different transposon mutants of the ΔtatC strain that have their outer membrane integrity restored. Approximately 50% of the genes that were differentially expressed in the tatC mutant are linked to the envelope defect, with the products of many of these genes involved in self-defense or protection mechanisms, including the production of exopolysaccharide. Among the changes that were not explicitly linked to envelope integrity, we characterized a role for the Tat system in iron acquisition and copper homeostasis. Finally, we have demonstrated that overproduction of the Tat substrate SufI saturates the Tat translocon and produces effects on global gene expression that are similar to those resulting from the ΔtatC mutation. PMID:15347649

  4. Global Gene Expression Profiling in R155H Knock-In Murine Model of VCP Disease

    PubMed Central

    Nalbandian, Angèle; Ghimbovschi, Svetlana; Wang, Zuyi; Knoblach, Susan; Llewellyn, Katrina J.; Vesa, Jouni; Hoffman, Eric P.; Kimonis, Virginia E.

    2014-01-01

    Dominant mutations in the valosin containing protein (VCP) gene cause inclusion body myopathy associated with Paget disease of bone and frontotemporal dementia (IBMPFD), which is characterized by progressive muscle weakness, dysfunction in bone remodeling, and frontotemporal dementia. More recently, VCP has been linked to 2% of familial amyotrophic lateral sclerosis (ALS) cases. VCP plays a significant role in a plethora of cellular functions including membrane fusion, transcription activation, nuclear envelope reconstruction, post-mitotic organelle reassembly, and cell cycle control. To elucidate the pathological mechanisms underlying VCP disease progression, we have previously generated a VCPR155H/+ mouse model with the R155H mutation. Histological analyses of mutant muscle showed vacuolization of myofibrils, centrally located nuclei, and disorganized muscle fibers. Global expression profiling of VCPR155H/+ mice using gene annotations by DAVID identified key dysregulated signaling pathways including genes involved in physiological system development and function, diseases and disorders, and molecular and cellular functions. There were a total of 212 significantly dysregulated genes, several of which are involved in the regulation of proteasomal function and the NF-κB signaling cascade. Findings of the gene expression study were validated by using quantitative reverse transcriptase polymerase chain reaction analyses to test genes involved in various signaling cascades. This investigation reveals the importance of the VCPR155H/+ mouse model in the understanding of cellular and molecular mechanisms causing VCP-associated neurodegenerative diseases and in the discovery of novel therapeutic advancements and strategies for patients suffering with these debilitating disorders. PMID:25388089

  5. Interaction between bisphenol A and dietary sugar affects global gene transcription in Drosophila melanogaster

    PubMed Central

    Branco, Alan T.; Lemos, Bernardo

    2014-01-01

    Human exposure to environmental toxins is a public health issue. The microarray data available in the Gene Expression Omnibus database under accession numbers GSE55655 and GSE55670 show the isolated and combined effects of dietary sugar and two organic compounds present in a variety of plastics [bisphenol A (BPA) and Bis(2-ethylhexyl) phthalate (DEHP)] on global gene expression in Drosophila melanogaster. The study was carried out with samples collected from flies exposed to these compounds for a limited period of time (48 h) in the adult stage, or throughout the entire development of the insect. The arrays were normalized using the limma/Bioconductor package. Differential expression was inferred using linear models in limma and BAGEL. The data show that each compound had its unique consequences to gene expression, and that the individual effect of each organic compound is maximized with the joint ingestion of dietary sugar. PMID:26484116

  6. GeoChip-Based Analysis of the Functional Gene Diversity and Metabolic Potential of Microbial Communities in Acid Mine Drainage

    PubMed Central

    Xie, Jianping; He, Zhili; Liu, Xinxing; Liu, Xueduan; Van Nostrand, Joy D.; Deng, Ye; Wu, Liyou; Zhou, Jizhong; Qiu, Guanzhou

    2011-01-01

    Acid mine drainage (AMD) is an extreme environment, usually with low pH and high concentrations of metals. Although the phylogenetic diversity of AMD microbial communities has been examined extensively, little is known about their functional gene diversity and metabolic potential. In this study, a comprehensive functional gene array (GeoChip 2.0) was used to analyze the functional diversity, composition, structure, and metabolic potential of AMD microbial communities from three copper mines in China. GeoChip data indicated that these microbial communities were functionally diverse as measured by the number of genes detected, gene overlapping, unique genes, and various diversity indices. Almost all key functional gene categories targeted by GeoChip 2.0 were detected in the AMD microbial communities, including carbon fixation, carbon degradation, methane generation, nitrogen fixation, nitrification, denitrification, ammonification, nitrogen reduction, sulfur metabolism, metal resistance, and organic contaminant degradation, which suggested that the functional gene diversity was higher than was previously thought. Mantel test results indicated that AMD microbial communities are shaped largely by surrounding environmental factors (e.g., S, Mg, and Cu). Functional genes (e.g., narG and norB) and several key functional processes (e.g., methane generation, ammonification, denitrification, sulfite reduction, and organic contaminant degradation) were significantly (P < 0.10) correlated with environmental variables. This study presents an overview of functional gene diversity and the structure of AMD microbial communities and also provides insights into our understanding of metabolic potential in AMD ecosystems. PMID:21097602

  7. Data Mining of Gene Arrays for Biomarkers of Survival in Ovarian Cancer

    PubMed Central

    Coveney, Clare; Boocock, David J.; Rees, Robert C.; Deen, Suha; Ball, Graham R.

    2015-01-01

    The expected five-year survival rate from a stage III ovarian cancer diagnosis is a mere 22%; this applies to the 7000 new cases diagnosed yearly in the UK. Stratification of patients with this heterogeneous disease, based on active molecular pathways, would aid a targeted treatment improving the prognosis for many cases. While hundreds of genes have been associated with ovarian cancer, few have yet been verified by peer research for clinical significance. Here, a meta-analysis approach was applied to two carefully selected gene expression microarray datasets. Artificial neural networks, Cox univariate survival analyses and T-tests identified genes whose expression was consistently and significantly associated with patient survival. The rigor of this experimental design increases confidence in the genes found to be of interest. A list of 56 genes was distilled from a potential 37,000 to be significantly related to survival in both datasets with a FDR of 1.39859 × 10⁻¹¹, the identities of which both verify genes already implicated with this disease and provide novel genes and pathways to pursue. Further investigation and validation of these may lead to clinical insights and have potential to predict a patient’s response to treatment or be used as a novel target for therapy.

  8. Genome-Wide Linkage Analysis of Global Gene Expression in Loin Muscle Tissue Identifies Candidate Genes in Pigs

    PubMed Central

    Steibel, Juan Pedro; Bates, Ronald O.; Rosa, Guilherme J. M.; Tempelman, Robert J.; Rilington, Valencia D.; Ragavendran, Ashok; Raney, Nancy E.; Ramos, Antonio Marcos; Cardoso, Fernando F.; Edwards, David B.; Ernst, Catherine W.

    2011-01-01

    Background Nearly 6,000 QTL have been reported for 588 different traits in pigs, more than in any other livestock species. However, this effort has translated into only a few confirmed causative variants. A powerful strategy for revealing candidate genes involves expression QTL (eQTL) mapping, where the mRNA abundance of a set of transcripts is used as the response variable for a QTL scan. Methodology/Principal Findings We utilized a whole genome expression microarray and an F2 pig resource population to conduct a global eQTL analysis in loin muscle tissue, and compared results to previously inferred phenotypic QTL (pQTL) from the same experimental cross. We found 62 unique eQTL (FDR <10%) and identified 3 gene networks enriched with genes subject to genetic control involved in lipid metabolism, DNA replication, and cell cycle regulation. We observed strong evidence of local regulation (40 out of 59 eQTL with known genomic position) and compared these eQTL to pQTL to help identify potential candidate genes. Among the interesting associations, we found aldo-keto reductase 7A2 (AKR7A2) and thioredoxin domain containing 12 (TXNDC12) eQTL that are part of a network associated with lipid metabolism and in turn overlap with pQTL regions for marbling, % intramuscular fat (% fat) and loin muscle area on Sus scrofa (SSC) chromosome 6. Additionally, we report 13 genomic regions with overlapping eQTL and pQTL involving 14 local eQTL. Conclusions/Significance Results of this analysis provide novel candidate genes for important complex pig phenotypes. PMID:21346809

  9. The population genomics of begomoviruses: global scale population structure and gene flow

    PubMed Central

    2010-01-01

    Background The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underlie observable patterns of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets that are available is that for the dicotyledonous plant infecting genus, Begomovirus, in the Family Geminiviridae. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results We focus here on using a model-based population genetic approach to identify the genetically distinct sub-populations within the global begomovirus meta-population. We demonstrate the existence of at least seven major sub-populations that can further be sub-divided into as many as thirty four significantly differentiated and genetically cohesive minor sub-populations. Using the population structure framework revealed in the present study, we further explored the extent of gene flow and recombination between genetic populations. Conclusions Although geographical barriers are apparently the most significant underlying cause of the seven major population sub-divisions, within the framework of these sub-divisions, we explore patterns of gene flow to reveal that both host range differences and genetic barriers to recombination have probably been major contributors to the minor population sub-divisions that we have identified. We believe that the global Begomovirus population structure revealed here could facilitate population genetics studies

  10. Structuring osteosarcoma knowledge: an osteosarcoma-gene association database based on literature mining and manual annotation.

    PubMed

    Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard

    2014-01-01

    Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Massive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quickly and easily accessible to researchers of the field. Therefore, we developed a publicly available database and Web interface that serves as a resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts of the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort of an OS database containing manually reviewed and annotated up-to-date OS knowledge. It might be a useful resource especially for the bone tumor research community, as specific

  11. Identification of Novel Cellular Targets in Biliary Tract Cancers Using Global Gene Expression Technology

    PubMed Central

    Hansel, Donna E.; Rahman, Ayman; Hidalgo, Manuel; Thuluvath, Paul J.; Lillemoe, Keith D.; Shulick, Richard; Ku, Ja-Lok; Park, Jae-Gahb; Miyazaki, Kohje; Ashfaq, Raheela; Wistuba, Ignacio I.; Varma, Ram; Hawthorne, Lesleyann; Geradts, Joseph; Argani, Pedram; Maitra, Anirban

    2003-01-01

    Biliary tract carcinoma carries a poor prognosis, and difficulties with clinical management in patients with advanced disease are often due to frequent late-stage diagnosis, lack of serum markers, and limited information regarding biliary tumor pathogenesis. RNA-based global analyses of gene expression have led to the identification of a large number of up-regulated genes in several cancer types. We have used the recently developed Affymetrix U133A gene expression microarrays containing nearly 22,000 unique transcripts to obtain global gene expression profiles from normal biliary epithelial scrapings (n = 5), surgically resected biliary carcinomas (n = 11), and biliary cancer cell lines (n = 9). Microarray hybridization data were normalized using dCHIP (http://www.dCHIP.org) to identify differentially up-regulated genes in primary biliary cancers and biliary cancer cell lines, and their expression profiles were compared to those of normal epithelial scrapings using the dCHIP software as well as Significance Analysis of Microarrays or SAM (http://www-stat.stanford.edu/~tibs/SAM/). Comparison of the dCHIP and SAM datasets revealed an overlapping list of 282 genes expressed at greater than threefold levels in the cancers compared to normal epithelium (t-test P <0.1 in dCHIP, and median false discovery rate <10 in SAM). Several pathways integral to tumorigenesis were up-regulated in the biliary cancers, including proliferation and cell cycle antigens (eg, cyclins D2 and E2, cdc2/p34, and geminin), transcription factors (eg, homeobox B7 and islet-1), growth factors and growth factor receptors (eg, hepatocyte growth factor, amphiregulin, and insulin-like growth factor 1 receptor), and enzymes modulating sensitivity to chemotherapeutic agents (eg, cystathionine β synthase, dCMP deaminase, and CTP synthase). In addition, we identified several “pathway” genes that are rapidly emerging as novel therapeutic targets in cancer (eg, cytosolic phospholipase A2, an upstream

  12. Allele Mining in Barley Genetic Resources Reveals Genes of Race-Non-Specific Powdery Mildew Resistance

    PubMed Central

    Spies, Annika; Korzun, Viktor; Bayles, Rosemary; Rajaraman, Jeyaraman; Himmelbach, Axel; Hedley, Pete E.; Schweizer, Patrick

    2012-01-01

    Race-non-specific, or quantitative, pathogen resistance is of high importance to plant breeders due to its expected durability. However, it is usually controlled by multiple quantitative trait loci (QTL) and therefore difficult to handle in practice. Knowing the genes that underlie race-non-specific resistance (NR) would allow its exploitation in a more targeted manner. Here, we performed an association-genetic study in a customized worldwide collection of spring barley accessions for candidate genes of race-NR to the powdery mildew fungus Blumeria graminis f. sp. hordei (Bgh) and combined data with results from QTL mapping as well as functional-genomics approaches. This led to the identification of 11 associated genes with converging evidence for an important role in race-NR in the presence of the Mlo gene for basal susceptibility. Outstanding in this respect was the gene encoding the transcription factor WRKY2. The results suggest that unlocking plant genetic resources and integrating functional-genomic with genetic approaches can accelerate the discovery of genes underlying race-NR in barley and other crop plants. PMID:22629270

  13. In Vitro Global Gene Expression Analyses Support the Ethnopharmacological Use of Achyranthes aspera

    PubMed Central

    Subbarayan, Pochi R.; Sarkar, Malancha; Lokeshwar, Balakrishna L.; Ardalan, Bach

    2013-01-01

    Achyranthes aspera (family Amaranthaceae) is known for its anticancer properties. We have systematically validated the in vitro and in vivo anticancer properties of this plant. However, we do not know its mode of action. Global gene expression analyses may help decipher its mode of action. In the absence of identified active molecules, we believe this is the best approach to discover the mode of action of natural products with known medicinal properties. We exposed human pancreatic cancer cell line MiaPaCa-2 (CRL-1420) to 34 μg/mL of LE for 24, 48, and 72 hours. Gene expression analyses were performed using whole human genome microarrays (Agilent Technologies, USA). In our analyses, 82 (54/28) genes passed the quality control parameter, set at FDR ≤ 0.01 and FC of ≥±2. LE predominantly affected pathways of immune response, metabolism, development, gene expression regulation, cell adhesion, cystic fibrosis transmembrane conductance regulation (CFTR), and chemotaxis (MetaCore tool (Thomson Reuters, NY)). Disease biomarker enrichment analysis identified LE regulated genes involved in Vasculitis—inflammation of blood vessels. Arthritis and pancreatitis are two of many etiologies for vasculitis. The outcome of disease network analysis supports the medicinal use of A. aspera, viz, to stop bleeding, as a cure for pancreatic cancer, as an antiarthritic medication, and so forth. PMID:24454496
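
    The selection criterion quoted in this abstract (FDR ≤ 0.01 together with a fold change of at least 2 in either direction) can be illustrated with a minimal Python sketch. The gene names, the `GeneResult` record and the `passes_filter` helper are hypothetical and are not taken from the study; the fold change is assumed here to be a signed value, so that −2.0 means two-fold down-regulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneResult:
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    fold_change: float  # signed: +2.0 = two-fold up, -2.0 = two-fold down
    fdr: float          # false discovery rate (adjusted p-value)


def passes_filter(r: GeneResult, max_fdr: float = 0.01, min_abs_fc: float = 2.0) -> bool:
    """Keep a gene only if it is both significant and strongly changed."""
    return r.fdr <= max_fdr and abs(r.fold_change) >= min_abs_fc


def split_by_direction(results):
    """Return (up, down) lists of gene names that pass the filter."""
    kept = [r for r in results if passes_filter(r)]
    up = [r.gene for r in kept if r.fold_change > 0]
    down = [r.gene for r in kept if r.fold_change < 0]
    return up, down


# Made-up example rows, for illustration only (not data from the study).
example = [
    GeneResult("geneA", 3.1, 0.004),
    GeneResult("geneB", -2.4, 0.009),
    GeneResult("geneC", 1.5, 0.001),   # significant, but the change is too small
    GeneResult("geneD", -4.0, 0.05),   # large change, but not significant
]

up, down = split_by_direction(example)
print(up, down)
```

    Applying both thresholds jointly is what keeps genes with large but noisy changes, or tiny but statistically significant ones, out of the final list.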

  14. Impact of Hfq on Global Gene Expression and Virulence in Klebsiella pneumoniae

    PubMed Central

    Chiang, Ming-Ko; Lu, Min-Chi; Liu, Li-Cheng; Lin, Ching-Ting; Lai, Yi-Chyi

    2011-01-01

    Klebsiella pneumoniae is responsible for a wide range of clinical symptoms. How this bacterium adapts itself to ever-changing host milieu is still a mystery. Recently, small non-coding RNAs (sRNAs) have received considerable attention for their functions in fine-tuning gene expression at a post-transcriptional level to promote bacterial adaptation. Here we demonstrate that Hfq, an RNA-binding protein, which facilitates interactions between sRNAs and their mRNA targets, is critical for K. pneumoniae virulence. A K. pneumoniae mutant lacking hfq (Δhfq) failed to disseminate into extra-intestinal organs and was attenuated on induction of a systemic infection in a mouse model. The absence of Hfq was associated with alteration in composition of envelope proteins, increased production of capsular polysaccharides, and decreased resistance to H2O2, heat shock, and UV irradiation. Microarray-based transcriptome analyses revealed that 897 genes involved in numerous cellular processes were deregulated in the Δhfq strain. Interestingly, Hfq appeared to govern expression of many genes indirectly by affecting sigma factor RpoS and RpoE, since 19.5% (175/897) and 17.3% (155/897) of Hfq-dependent genes belong to the RpoE- and RpoS-regulon, respectively. These results indicate that Hfq regulates global gene expression at multiple levels to modulate the physiological fitness and virulence potential of K. pneumoniae. PMID:21779404
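
    The regulon shares reported in this abstract follow directly from the gene counts it gives. As a quick arithmetic check, here is a short Python sketch; the `share` helper and the variable names are illustrative only, and rounding to one decimal place is an assumption made to match the precision of the quoted percentages.

```python
def share(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`, rounded to one decimal."""
    return round(100 * part / total, 1)


TOTAL_DEREGULATED = 897  # genes deregulated in the Δhfq strain (from the abstract)
RPOE_REGULON = 175       # Hfq-dependent genes in the RpoE regulon
RPOS_REGULON = 155       # Hfq-dependent genes in the RpoS regulon

print(f"RpoE regulon: {share(RPOE_REGULON, TOTAL_DEREGULATED)}%")
print(f"RpoS regulon: {share(RPOS_REGULON, TOTAL_DEREGULATED)}%")
```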

  15. Histone Modifications at Human Enhancers Reflect Global Cell Type-Specific Gene Expression

    PubMed Central

    Heintzman, Nathaniel D.; Hon, Gary C.; Hawkins, R. David; Kheradpour, Pouya; Stark, Alexander; Harp, Lindsey F.; Ye, Zhen; Lee, Leonard K.; Stuart, Rhona K.; Ching, Christina W.; Ching, Keith A.; Antosiewicz, Jessica E.; Liu, Hui; Zhang, Xinmin; Green, Roland D.; Stewart, Ron; Thomson, James A.; Crawford, Gregory E.; Kellis, Manolis; Ren, Bing

    2010-01-01

    The human body is composed of diverse cell types with distinct functions. While it is known that lineage specification depends on cell specific gene expression, which in turn is driven by promoters, enhancers, insulators and other cis-regulatory DNA sequences for each gene [1–3], the relative roles of these regulatory elements in this process are not clear. We have previously developed a chromatin immunoprecipitation-based microarray method (ChIP-chip) to locate promoters, enhancers, and insulators in the human genome [4–6]. Here, we use the same approach to identify these elements in multiple cell types and investigate their roles in cell type-specific gene expression. We observed that chromatin state at promoters and CTCF-binding at insulators are largely invariant across diverse cell types. By contrast, enhancers are marked with highly cell type-specific histone modification patterns, strongly correlate to cell type-specific gene expression programs on a global scale, and are functionally active in a cell type-specific manner. Our results defined over 55,000 potential transcriptional enhancers in the human genome, significantly expanding the current catalog of human enhancers and highlighting the role of these elements in cell type-specific gene expression. PMID:19295514

  16. Global role for polyadenylation-assisted nuclear RNA degradation in posttranscriptional gene silencing.

    PubMed

    Wang, Shao-Win; Stevenson, Abigail L; Kearsey, Stephen E; Watt, Stephen; Bähler, Jürg

    2008-01-01

    Fission yeast Cid14, a component of the TRAMP (Cid14/Trf4-Air1-Mtr4 polyadenylation) complex, polyadenylates nuclear RNA and stimulates degradation by the exosome for RNA quality control. Here, we analyze patterns of global gene expression in cells lacking the Cid14 or the Dis3/Rpr44 subunit of the nuclear exosome. We found that transcripts from many genes induced during meiosis, including key regulators, accumulated in the absence of Cid14 or Dis3. Moreover, our data suggest that additional substrates include transcripts involved in heterochromatin assembly. Mutant cells lacking Cid14 and/or Dis3 accumulate transcripts corresponding to naturally silenced repeat elements within heterochromatic domains, reflecting defects in centromeric gene silencing and derepression of subtelomeric gene expression. We also uncover roles for Cid14 and Dis3 in maintaining the genomic integrity of ribosomal DNA. Our data indicate that polyadenylation-assisted nuclear RNA turnover functions in eliminating a variety of RNA targets to control diverse processes, such as heterochromatic gene silencing, meiotic differentiation, and maintenance of genomic integrity. PMID:18025105

  17. Mining, genetic mapping and expression analysis of EST-derived resistance gene homologs (RGHs) in cotton

    PubMed Central

    2014-01-01

    Background Cotton is the dominant textile crop and also serves as an important oil crop. An estimated 15% economic loss associated with cotton production in China has been caused by diseases, and no resistance genes have been cloned in this crop. Molecular markers developed from resistance gene homologues (RGHs) might be tightly linked with target genes and could be used for marker-assisted selection (MAS) or gene cloning. Results To genetically map expressed RGHs, 100 potential pathogenesis-related proteins (PRPs) and 215 resistance gene analogs (RGAs) were identified in the cotton expressed sequence tag database, and 347 specific primers were developed. Meanwhile, 61 cotton genome-derived RGA markers and 24 resistance gene analog polymorphism (RGAP) markers from published papers were included to view their genomic distribution. As a result, 38 EST-derived and 17 genome-derived RGH markers were added to our interspecific genetic map. These 55 markers were distributed on 18 of the 26 cotton chromosomes, with 34 markers on 6 chromosomes (Chr03, Chr04, Chr11, Chr17, Chr19 and Chr26). Homologous RGHs tended to be clustered; RGH clusters appeared on 9 chromosomes, with larger clusters on Chr03, Chr04 and Chr19, which suggests that RGH clusters are widely distributed in the cotton genome. Expression analysis showed that the expression of 19 RGHs was significantly altered after inoculation with the V991 strain of Verticillium dahliae. Comparative mapping showed that four RGH markers were linked with mapped loci for Verticillium wilt resistance. Conclusions The genetic mapping of RGHs confirmed their clustering in the cotton genome. Expression analysis and comparative mapping suggest that EST-derived RGHs participate in cotton resistance. RGH markers seem to be useful tools for detecting resistance loci and identifying candidate resistance genes in cotton. PMID:25064562

  18. Circadian control of global gene expression by the cyanobacterial master regulator RpaA

    PubMed Central

    Markson, Joseph S.; Piechura, Joseph R.; Puszynska, Anna M.; O’Shea, Erin K.

    2014-01-01

    Summary The cyanobacterial circadian clock generates genome-wide transcriptional oscillations and regulates cell division, but the underlying mechanisms are not well understood. Here we show that the response regulator RpaA serves as the master regulator of these clock outputs. Deletion of rpaA abrogates gene expression rhythms globally and arrests cells in a dawn-like expression state. Although rpaA deletion causes core oscillator failure by perturbing clock gene expression, rescuing oscillator function does not restore global expression rhythms. We show that phosphorylated RpaA regulates the expression of not only clock components, generating feedback on the core oscillator, but also a small set of circadian effectors that in turn orchestrate genome-wide transcriptional rhythms. Expression of constitutively active RpaA is sufficient to switch cells from a dawn-like to a dusk-like expression state as well as to block cell division. Hence, complex global circadian phenotypes can be generated by controlling the phosphorylation of a single transcription factor. PMID:24315105

  19. Global gene expression responses to waterlogging in leaves of rape seedlings.

    PubMed

    Lee, Yong-Hwa; Kim, Kwang-Soo; Jang, Young-Seok; Hwang, Ji-Hye; Lee, Dong-Hee; Choi, In-Hu

    2014-02-01

    Soil waterlogging is a serious constraint to crop production. We investigated the physiological responses of rape (Brassica napus L.) seedlings to waterlogging stress and analyzed global gene transcription responses in the aerial leaves of waterlogged rape seedlings. Seedlings of 'Tammi' and 'Youngsan' cultivars were subjected to waterlogging for 3 and 6 days and recovery for 5 days. Waterlogging stress caused a significant decrease in leaf chlorophyll content and premature senescence of the leaves. Maximal quantum efficiency of PSII (F(v)/F(m)) decreased in the waterlogged seedlings compared with the control plants. To evaluate whether the observed physiological changes in the leaves are associated with the differential regulation of gene expression in response to waterlogging stress, we analyzed the global transcriptional profile of leaves of 'Tammi' seedlings that were exposed to waterlogging for a short period (36 and 72 h). SolexaQA RNA-seq analysis revealed that a total of 4,484 contigs (8.5 %) of all contigs assayed (52,747) showed a twofold change in expression after 36 h of the start of waterlogging and 9,659 contigs (18.3 %) showed a twofold change after 72 h. Major genes involved in leaf photosynthesis, including light reactions and carbon-fixing reactions, were downregulated, while a number of genes involved in the scavenging of reactive oxygen species, degradation (proteins, starch, and lipids), premature senescence, and abiotic stress tolerance were upregulated. Transcriptome analysis data suggested that the aerial leaves of waterlogged rape seedlings respond to hypoxia by regulating the expression of diverse genes in the leaves. PMID:24384821

  20. Global gene expression analyses of hematopoietic stem cell-like cell lines with inducible Lhx2 expression

    PubMed Central

    Richter, Karin; Wirta, Valtteri; Dahl, Lina; Bruce, Sara; Lundeberg, Joakim; Carlsson, Leif; Williams, Cecilia

    2006-01-01

    Background Expression of the LIM-homeobox gene Lhx2 in murine hematopoietic cells allows for the generation of hematopoietic stem cell (HSC)-like cell lines. To address the molecular basis of Lhx2 function, we generated HSC-like cell lines where Lhx2 expression is regulated by a tet-on system and hence dependent on the presence of doxycyclin (dox). These cell lines efficiently down-regulate Lhx2 expression upon dox withdrawal leading to a rapid differentiation into various myeloid cell types. Results Global gene expression of these cell lines cultured in dox was compared to different time points after dox withdrawal using microarray technology. We identified 267 differentially expressed genes. The majority of the genes overlapping with HSC-specific databases were those down-regulated after turning off Lhx2 expression and a majority of the genes overlapping with those defined as late progenitor-specific genes were the up-regulated genes, suggesting that these cell lines represent a relevant model system for normal HSCs also at the level of global gene expression. Moreover, in situ hybridisations of several genes down-regulated after dox withdrawal showed overlapping expression patterns with Lhx2 in various tissues during embryonic development. Conclusion Global gene expression analysis of HSC-like cell lines with inducible Lhx2 expression has identified genes putatively linked to self-renewal / differentiation of HSCs, and function of Lhx2 in organ development and stem / progenitor cells of non-hematopoietic origin. PMID:16600034

  1. Global map of physical interactions among differentially expressed genes in multiple sclerosis relapses and remissions.

    PubMed

    Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat

    2011-09-15

    Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. However, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network hub' in the case of MS patients with relapse versus remission.
The statistical approaches employed in this study enabled us

  2. Global Transcriptome Analysis Reveals That Poly(ADP-Ribose) Polymerase 1 Regulates Gene Expression through EZH2

    PubMed Central

    Martin, Kayla A.; Cesaroni, Matteo; Denny, Michael F.; Lupey, Lena N.

    2015-01-01

    Posttranslational modifications, such as poly(ADP-ribosyl)ation (PARylation), regulate chromatin-modifying enzymes, ultimately affecting gene expression. This study explores the role of poly(ADP-ribose) polymerase (PARP) on global gene expression in a lymphoblastoid B cell line. We found that inhibition of PARP catalytic activity with olaparib resulted in global gene deregulation, affecting approximately 11% of the genes expressed. Gene ontology analysis revealed that PARP could exert these effects through transcription factors and chromatin-remodeling enzymes, including the polycomb repressive complex 2 (PRC2) member EZH2. EZH2 mediates the trimethylation of histone H3 at lysine 27 (H3K27me3), a modification associated with chromatin compaction and gene silencing. Both pharmacological inhibition of PARP and knockdown of PARP1 induced the expression of EZH2, which resulted in increased global H3K27me3. Chromatin immunoprecipitation confirmed that PARP1 inhibition led to H3K27me3 deposition at EZH2 target genes, which resulted in gene silencing. Moreover, increased EZH2 expression is attributed to the loss of the occupancy of the transcription repressor E2F4 at the EZH2 promoter following PARP inhibition. Together, these data show that PARP plays an important role in global gene regulation and identify for the first time a direct role of PARP1 in regulating the expression and function of EZH2. PMID:26370511

  3. Global Transcriptome Analysis Reveals That Poly(ADP-Ribose) Polymerase 1 Regulates Gene Expression through EZH2.

    PubMed

    Martin, Kayla A; Cesaroni, Matteo; Denny, Michael F; Lupey, Lena N; Tempera, Italo

    2015-12-01

    Posttranslational modifications, such as poly(ADP-ribosyl)ation (PARylation), regulate chromatin-modifying enzymes, ultimately affecting gene expression. This study explores the role of poly(ADP-ribose) polymerase (PARP) on global gene expression in a lymphoblastoid B cell line. We found that inhibition of PARP catalytic activity with olaparib resulted in global gene deregulation, affecting approximately 11% of the genes expressed. Gene ontology analysis revealed that PARP could exert these effects through transcription factors and chromatin-remodeling enzymes, including the polycomb repressive complex 2 (PRC2) member EZH2. EZH2 mediates the trimethylation of histone H3 at lysine 27 (H3K27me3), a modification associated with chromatin compaction and gene silencing. Both pharmacological inhibition of PARP and knockdown of PARP1 induced the expression of EZH2, which resulted in increased global H3K27me3. Chromatin immunoprecipitation confirmed that PARP1 inhibition led to H3K27me3 deposition at EZH2 target genes, which resulted in gene silencing. Moreover, increased EZH2 expression is attributed to the loss of the occupancy of the transcription repressor E2F4 at the EZH2 promoter following PARP inhibition. Together, these data show that PARP plays an important role in global gene regulation and identify for the first time a direct role of PARP1 in regulating the expression and function of EZH2. PMID:26370511

  4. Mining of luxS genes from rumen microbial consortia by metagenomic and metatranscriptomic approaches.

    PubMed

    Ghali, Ines; Shinkai, Takumi; Mitsumori, Makoto

    2016-05-01

    Although rumen bacterial communities vary depending on many factors such as diet, age and physiological conditions, a core microbiota exists within the rumen. In many natural environments, some bacteria use a quorum-sensing (QS) system to regulate their physiological activities. However, very limited information is available about QS systems in rumen. To investigate the autoinducer 2 (AI-2)-mediated QS system in rumen, we detected genes (luxS) encoding the AI-2 synthase (LuxS), from three datasets embedded in metagenomics RAST server (MG-RAST) and from a metatranscriptome dataset. We collected 135 luxS genes from the metagenomic datasets, which were presumed to originate from Bacteroidetes, Firmicutes, Fusobacteria and Actinobacteria, and 34 luxS genes from the metatranscriptome dataset, which probably originated from Bacteroidetes, Firmicutes and Spirochaetes. Because the essential amino acids for LuxS activity were conserved in the LuxS homologues predicted from luxS gene sequences from both datasets, the LuxS homologues probably function in the rumen. Since the largest number of sequences of luxS genes were collected from the genera Prevotella, Ruminococcus and Eubacterium, which include many fibrolytic bacteria and constituent members of biofilm on feed particles, an AI-2-mediated QS system is likely involved in biofilm formation and fibrolytic activity in the rumen. PMID:26277986

  5. Mining 3D Patterns from Gene Expression Temporal Data: A New Tricluster Evaluation Measure

    PubMed Central

    2014-01-01

    Microarrays have revolutionized biotechnological research. The analysis of new data generated represents a computational challenge due to the characteristics of these data. Clustering techniques are applied to create groups of genes that exhibit a similar behavior. Biclustering emerges as a valuable tool for microarray data analysis since it relaxes the constraints for grouping, allowing genes to be evaluated only under a subset of the conditions. However, if a third dimension appears in the data, triclustering is the appropriate tool for the analysis. This occurs in longitudinal experiments in which the genes are evaluated under conditions at several time points. All clustering, biclustering, and triclustering techniques guide their search for solutions by a measure that evaluates the quality of clusters. We present an evaluation measure for triclusters called Mean Square Residue 3D. This measure is based on the classic biclustering measure Mean Square Residue. Mean Square Residue 3D has been applied to both synthetic and real data and it has proved to be capable of extracting groups of genes with homogeneous patterns in subsets of conditions and times, and these groups have shown a high correlation level and they are also related to their functional annotations extracted from the Gene Ontology project. PMID:25143987
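
    The abstract above names the measure but does not state its formula. As a rough illustration, the sketch below computes the classic Mean Square Residue of a bicluster (each cell compared with its row mean, column mean, and overall mean) and one plausible three-way analogue for a gene × condition × time block. The function names `msr_2d` and `msr_3d`, the toy data, and the exact form of the 3D residue are assumptions made for illustration and may differ from the authors' definition of Mean Square Residue 3D.

```python
from itertools import product


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def msr_2d(a):
    """Classic mean square residue of a bicluster a[gene][condition]."""
    genes, conds = range(len(a)), range(len(a[0]))
    row = [mean(a[g][c] for c in conds) for g in genes]
    col = [mean(a[g][c] for g in genes) for c in conds]
    total = mean(a[g][c] for g in genes for c in conds)
    return mean((a[g][c] - row[g] - col[c] + total) ** 2
                for g, c in product(genes, conds))


def msr_3d(a):
    """Three-way residue of a tricluster a[gene][condition][time].

    ASSUMPTION: the residue is the three-way interaction term left after
    removing all one-way and two-way means; this is an illustrative
    generalization, not a formula quoted from the abstract.
    """
    G, C, T = range(len(a)), range(len(a[0])), range(len(a[0][0]))
    # Two-way means: average over the one remaining dimension.
    m_ct = {(c, t): mean(a[g][c][t] for g in G) for c in C for t in T}
    m_gt = {(g, t): mean(a[g][c][t] for c in C) for g in G for t in T}
    m_gc = {(g, c): mean(a[g][c][t] for t in T) for g in G for c in C}
    # One-way means: average over the two remaining dimensions.
    m_g = {g: mean(a[g][c][t] for c in C for t in T) for g in G}
    m_c = {c: mean(a[g][c][t] for g in G for t in T) for c in C}
    m_t = {t: mean(a[g][c][t] for g in G for c in C) for t in T}
    total = mean(a[g][c][t] for g in G for c in C for t in T)
    return mean(
        (a[g][c][t] - m_ct[c, t] - m_gt[g, t] - m_gc[g, c]
         + m_g[g] + m_c[c] + m_t[t] - total) ** 2
        for g, c, t in product(G, C, T)
    )


# Toy tricluster (hypothetical values): purely additive gene, condition
# and time effects, so the residue should be numerically zero.
additive = [[[g + 2 * c + 3 * t for t in range(4)] for c in range(3)]
            for g in range(5)]
print(msr_3d(additive))
```

    Under this form, a score of zero means the block is perfectly coherent under an additive model, and larger values indicate less homogeneous expression patterns across the selected genes, conditions and times.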

  6. Mining 3D patterns from gene expression temporal data: a new tricluster evaluation measure.

    PubMed

    Gutiérrez-Avilés, David; Rubio-Escudero, Cristina

    2014-01-01

    Microarrays have revolutionized biotechnological research. The analysis of new data generated represents a computational challenge due to the characteristics of these data. Clustering techniques are applied to create groups of genes that exhibit a similar behavior. Biclustering emerges as a valuable tool for microarray data analysis since it relaxes the constraints for grouping, allowing genes to be evaluated only under a subset of the conditions. However, if a third dimension appears in the data, triclustering is the appropriate tool for the analysis. This occurs in longitudinal experiments in which the genes are evaluated under conditions at several time points. All clustering, biclustering, and triclustering techniques guide their search for solutions by a measure that evaluates the quality of clusters. We present an evaluation measure for triclusters called Mean Square Residue 3D. This measure is based on the classic biclustering measure Mean Square Residue. Mean Square Residue 3D has been applied to both synthetic and real data and it has proved to be capable of extracting groups of genes with homogeneous patterns in subsets of conditions and times, and these groups have shown a high correlation level and they are also related to their functional annotations extracted from the Gene Ontology project. PMID:25143987

  7. Morphological restriction of human coronary artery endothelial cells substantially impacts global gene expression patterns

    PubMed Central

    Stiles, Jessica M; Pham, Robert; Rowntree, Rebecca K; Amaya, Clarissa; Battiste, James; Boucheron, Laura E; Mitchell, Dianne C; Bryan, Brad A

    2013-01-01

    Alterations in cell shape have been shown to modulate chromatin condensation and cell lineage specification; however, the mechanisms controlling these processes are largely unknown. Because endothelial cells experience cyclic mechanical changes from blood flow during normal physiological processes and disrupted mechanical changes as a result of abnormal blood flow, cell shape deformation and loss of polarization during coronary artery disease, we aimed to determine how morphological restriction affects global gene expression patterns. Human coronary artery endothelial cells (HCAECs) were cultured on spatially defined adhesive micropatterns, forcing them to conform to unique cellular morphologies differing in cellular polarization and angularity. We utilized pattern recognition algorithms and statistical analysis to validate the cytoskeletal pattern reproducibility and uniqueness of each micropattern, and performed microarray analysis on normal-shaped and micropatterned HCAECs to determine how constrained cellular morphology affects gene expression patterns. Analysis of the data revealed that forcing HCAECs to conform to geometrically-defined shapes significantly affects their global transcription patterns compared to nonrestricted shapes. Interestingly, gene expression patterns were altered in response to morphological restriction in general, although they were consistent regardless of the particular shape the cells conformed to. These data suggest that the ability of HCAECs to spread, although not necessarily their particular morphology, dictates their genomics patterns. PMID:23802622

  8. Gene expression profiling--Opening the black box of plant ecosystem responses to global change

    SciTech Connect

    Leakey, A.D.B.; Ainsworth, E.A.; Bernard, S.M.; Markelz, R.J.C.; Ort, D.R.; Placella, S.A.P.; Rogers, A.; Smith, M.D.; Sudderth, E.A.; Weston, D.J.; Wullschleger, S.D.; Yuan, S.

    2009-11-01

    The use of genomic techniques to address ecological questions is emerging as the field of genomic ecology. Experimentation under environmentally realistic conditions to investigate the molecular response of plants to meaningful changes in growth conditions and ecological interactions is the defining feature of genomic ecology. Since the impact of global change factors on plant performance are mediated by direct effects at the molecular, biochemical and physiological scales, gene expression analysis promises important advances in understanding factors that have previously been consigned to the 'black box' of unknown mechanism. Various tools and approaches are available for assessing gene expression in model and non-model species as part of global change biology studies. Each approach has its own unique advantages and constraints. A first generation of genomic ecology studies in managed ecosystems and mesocosms have provided a testbed for the approach and have begun to reveal how the experimental design and data analysis of gene expression studies can be tailored for use in an ecological context.

  9. Gene Expression Profiling - Opening the Black Box of Plant Ecosystem Responses to Global Change

    SciTech Connect

    Ainsworth, Elizabeth A.; Bernard, Stephanie M.; Markelz, R.J. Cody; Ort, Donald R.; Placella, Sarah A.; Rogers, Alistair; Smith, Melinda D; Sudderth, Erika A.; Weston, David; Wullschleger, Stan D; Yuan, Shenghua

    2009-01-01

    The use of genomic techniques to address ecological questions is emerging as the field of genomic ecology. Experimentation under environmentally realistic conditions to investigate the molecular response of plants to meaningful changes in growth conditions and ecological interactions is the defining feature of genomic ecology. Since the impact of global change factors on plant performance are mediated by direct effects at the molecular, biochemical and physiological scales, gene expression analysis promises important advances in understanding factors that have previously been consigned to the black box of unknown mechanism. Various tools and approaches are available for assessing gene expression in model and non-model species as part of global change biology studies. Each approach has its own unique advantages and constraints. A first generation of genomic ecology studies in managed ecosystems and mesocosms have provided a testbed for the approach and have begun to reveal how the experimental design and data analysis of gene expression studies can be tailored for use in an ecological context.

  10. Global sensitivity analysis of a dynamic model for gene expression in Drosophila embryos

    PubMed Central

    McCarthy, Gregory D.; Drewell, Robert A.

    2015-01-01

    It is well known that gene regulation is a tightly controlled process in early organismal development. However, the roles of key processes involved in this regulation, such as transcription and translation, are less well understood, and mathematical modeling approaches in this field are still in their infancy. In recent studies, biologists have taken precise measurements of protein and mRNA abundance to determine the relative contributions of key factors involved in regulating protein levels in mammalian cells. We now approach this question from a mathematical modeling perspective. In this study, we use a simple dynamic mathematical model that incorporates terms representing transcription, translation, mRNA and protein decay, and diffusion in an early Drosophila embryo. We perform global sensitivity analyses on this model using various initial conditions and spatial and temporal outputs. Our results indicate that transcription and translation are often the key parameters to determine protein abundance. This observation is in close agreement with the experimental results from mammalian cells for various initial conditions at particular time points, suggesting that a simple dynamic model can capture the qualitative behavior of a gene. Additionally, we find that parameter sensitivities are temporally dynamic, illustrating the importance of conducting a thorough global sensitivity analysis across multiple time points when analyzing mathematical models of gene regulation. PMID:26157608
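
    To make the idea of time-dependent parameter sensitivity concrete, the sketch below integrates a two-variable model (mRNA and protein, with transcription, translation and two decay rates) and reports the relative sensitivity of protein abundance to each parameter at several time points. It is a deliberately simplified stand-in: diffusion is omitted, the parameter values and names (`k_tx`, `k_tl`, `d_m`, `d_p`) are hypothetical, and the analysis is a local one-at-a-time finite-difference estimate rather than the global sensitivity analysis the authors performed.

```python
# Hypothetical rate constants (arbitrary units), chosen only for illustration.
BASE = {"k_tx": 2.0, "k_tl": 5.0, "d_m": 1.0, "d_p": 0.5}


def protein_at(params, t_end, dt=0.01):
    """Forward-Euler integration of
        dm/dt = k_tx - d_m * m
        dp/dt = k_tl * m - d_p * p
    starting from m = p = 0; returns protein abundance at t_end."""
    m = p = 0.0
    for _ in range(int(round(t_end / dt))):
        dm = params["k_tx"] - params["d_m"] * m
        dp = params["k_tl"] * m - params["d_p"] * p
        m, p = m + dt * dm, p + dt * dp
    return p


def relative_sensitivity(name, t_end, h=0.01):
    """Central-difference estimate of (dp/dtheta) * (theta / p) for one
    parameter, i.e. the percent change in protein per percent change in theta."""
    up = dict(BASE, **{name: BASE[name] * (1 + h)})
    down = dict(BASE, **{name: BASE[name] * (1 - h)})
    delta = protein_at(up, t_end) - protein_at(down, t_end)
    return delta / (2 * h * protein_at(BASE, t_end))


# Sensitivities at an early, intermediate and near-steady-state time point.
for t in (0.1, 1.0, 50.0):
    row = {name: round(relative_sensitivity(name, t), 3) for name in BASE}
    print(t, row)
```

    In this toy model the synthesis rates have a relative sensitivity of one at every time point, while the influence of the decay rates starts near zero and grows as the system approaches steady state, which is one simple way sensitivities can be temporally dynamic. A variance-based or screening method evaluated over parameter ranges would be needed for a genuinely global analysis.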

  11. Extremely Acidophilic Protists from Acid Mine Drainage Host Rickettsiales-Lineage Endosymbionts That Have Intervening Sequences in Their 16S rRNA Genes

    PubMed Central

    Baker, Brett J.; Hugenholtz, Philip; Dawson, Scott C.; Banfield, Jillian F.

    2003-01-01

    During a molecular phylogenetic survey of extremely acidic (pH < 1), metal-rich acid mine drainage habitats in the Richmond Mine at Iron Mountain, Calif., we detected 16S rRNA gene sequences of a novel bacterial group belonging to the order Rickettsiales in the Alphaproteobacteria. The closest known relatives of this group (92% 16S rRNA gene sequence identity) are endosymbionts of the protist Acanthamoeba. Oligonucleotide 16S rRNA probes were designed and used to observe members of this group within acidophilic protists. To improve visualization of eukaryotic populations in the acid mine drainage samples, broad-specificity probes for eukaryotes were redesigned and combined to highlight this component of the acid mine drainage community. Approximately 4% of protists in the acid mine drainage samples contained endosymbionts. Measurements of internal pH of the protists showed that their cytosol is close to neutral, indicating that the endosymbionts may be neutrophilic. The endosymbionts had a conserved 273-nucleotide intervening sequence (IVS) in variable region V1 of their 16S rRNA genes. The IVS does not match any sequence in current databases, but the predicted secondary structure forms well-defined stem loops. IVSs are uncommon in rRNA genes and appear to be confined to bacteria living in close association with eukaryotes. Based on the phylogenetic novelty of the endosymbiont sequences and initial culture-independent characterization, we propose the name “Candidatus Captivus acidiprotistae.” To our knowledge, this is the first report of an endosymbiotic relationship in an extremely acidic habitat. PMID:12957940

  12. Genomic location of the major ribosomal protein gene locus determines Vibrio cholerae global growth and infectivity.

    PubMed

    Soler-Bistué, Alfonso; Mondotte, Juan A; Bland, Michael Jason; Val, Marie-Eve; Saleh, María-Carla; Mazel, Didier

    2015-04-01

    The effects on cell physiology of gene order within the bacterial chromosome are poorly understood. In silico approaches have shown that genes involved in transcription and translation processes, in particular ribosomal protein (RP) genes, localize near the replication origin (oriC) in fast-growing bacteria suggesting that such a positional bias is an evolutionarily conserved growth-optimization strategy. Such genomic localization could either provide a higher dosage of these genes during fast growth or facilitate the assembly of ribosomes and transcription foci by keeping physically close the many components of these macromolecular machines. To explore this, we used novel recombineering tools to create a set of Vibrio cholerae strains in which S10-spec-α (S10), a locus bearing half of the ribosomal protein genes, was systematically relocated to alternative genomic positions. We show that the relative distance of S10 to the origin of replication tightly correlated with a reduction of S10 dosage, mRNA abundance and growth rate within these otherwise isogenic strains. Furthermore, this was accompanied by a significant reduction in the host-invasion capacity in Drosophila melanogaster. Both phenotypes were rescued in strains bearing two S10 copies highly distal to oriC, demonstrating that replication-dependent gene dosage reduction is the main mechanism behind these alterations. Hence, S10 positioning connects genome structure to cell physiology in Vibrio cholerae. Our results show experimentally for the first time that genomic positioning of genes involved in the flux of genetic information conditions global growth control and hence bacterial physiology and potentially its evolution. PMID:25875621

  13. Global gene expression analysis of the shoot apical meristem of maize (Zea mays L.)

    PubMed Central

    Ohtsu, Kazuhiro; Smith, Marianne B; Emrich, Scott J; Borsuk, Lisa A; Zhou, Ruilian; Chen, Tianle; Zhang, Xiaolan; Timmermans, Marja C P; Beck, Jon; Buckner, Brent; Janick-Buckner, Diane; Nettleton, Dan; Scanlon, Michael J; Schnable, Patrick S

    2007-01-01

    All above-ground plant organs are derived from shoot apical meristems (SAMs). Global analyses of gene expression were conducted on maize (Zea mays L.) SAMs to identify genes preferentially expressed in the SAM. The SAMs were collected from 14-day-old B73 seedlings via laser capture microdissection (LCM). The RNA samples extracted from LCM-collected SAMs and from seedlings were hybridized to microarrays spotted with 37 660 maize cDNAs. Approximately 30% (10 816) of these cDNAs were prepared as part of this study from manually dissected B73 maize apices. Over 5000 expressed sequence tags (ESTs) (about 13% of the total) were differentially expressed (P<0.0001) between SAMs and seedlings. Of these, 2783 and 2248 ESTs were up- and down-regulated in the SAM, respectively. The expression in the SAM of several of the differentially expressed ESTs was validated via quantitative RT-PCR and/or in situ hybridization. The up-regulated ESTs included many regulatory genes including transcription factors, chromatin remodeling factors and components of the gene-silencing machinery, as well as about 900 genes with unknown functions. Surprisingly, transcripts that hybridized to 62 retrotransposon-related cDNAs were also substantially up-regulated in the SAM. Complementary DNAs derived from the LCM-collected SAMs were sequenced to identify additional genes that are expressed in the SAM. This generated around 550 000 ESTs (454-SAM ESTs) from two genotypes. Consistent with the microarray results, approximately 14% of the 454-SAM ESTs from B73 were retrotransposon-related. Possible roles of genes that are preferentially expressed in the SAM are discussed. PMID:17764504

  14. Global characterization of interferon regulatory factor (IRF) genes in vertebrates: Glimpse of the diversification in evolution

    PubMed Central

    2010-01-01

    Background Interferon regulatory factors (IRFs), which can be identified based on a unique helix-turn-helix DNA-binding domain (DBD), are a large family of transcription factors involved in host immune response, haematopoietic differentiation and immunomodulation. Despite the identification of ten IRF family members in mammals, and some recent effort to identify these members in fish, relatively little is known about the composition of these members in other classes of vertebrates, and the evolution and probable origin of the IRF family have not been investigated in vertebrates. Results Genome data mining has been performed to identify any possible IRF family members in human, mouse, dog, chicken, anole lizard, frog, and some teleost fish, mainly zebrafish and stickleback, and also in non-vertebrate deuterostomes including the hemichordate, cephalochordate, urochordate and echinoderm. In vertebrates, all ten IRF family members, i.e. IRF-1 to IRF-10, were identified, with two genes of IRF-4 and IRF-6 identified in fish and frog, respectively, except that three IRF-4 genes exist in zebrafish. Surprisingly, an additional member of the IRF family, IRF-11, was found in teleost fish. A range of two to ten IRF-like genes were detected in the non-vertebrate deuterostomes, and they had little similarity to those IRF family members in vertebrates as revealed in genomic structure and in phylogenetic analysis. However, the ten IRF family members, IRF-1 to IRF-10, showed certain degrees of conservation in terms of genomic structure and gene synteny. In particular, IRF-1, IRF-2, IRF-6 and IRF-8 are quite conserved in their genomic structure in all vertebrates, and to a lesser degree, some IRF family members, such as IRF-5 and IRF-9, are comparable in structure. Synteny analysis revealed that the gene loci for the ten IRF family members in vertebrates were also quite conserved, but in zebrafish conserved genes were distributed over much longer distances on chromosomes.
Furthermore

  15. Mining from transcriptomes: 315 single-copy orthologous genes concatenated for the phylogenetic analyses of Orchidaceae.

    PubMed

    Deng, Hua; Zhang, Guo-Qiang; Lin, Min; Wang, Yan; Liu, Zhong-Jian

    2015-09-01

    Phylogenetic relationships are hotspots for orchid studies with controversial standpoints. Traditionally, the phylogenies of orchids are based on morphology and subjective factors. Although more reliable than classic phylogenic analyses, the current methods are based on a few gene markers and PCR amplification, which are labor intensive and cannot identify the placement of some species with degenerated plastid genomes. Therefore, a more efficient, labor-saving and reliable method is needed for phylogenic analysis. Here, we present a method of orchid phylogeny construction using transcriptomes. Ten representative species covering five subfamilies of Orchidaceae were selected, and 315 single-copy orthologous genes extracted from the transcriptomes of these organisms were applied to reconstruct a more robust phylogeny of orchids. This approach provided a rapid and reliable method of phylogeny construction for Orchidaceae, one of the most diversified families of angiosperms. We also showed the rigorous systematic position of holomycotrophic species, which has previously been difficult to determine because of the degenerated plastid genome. We concluded that the method presented in this study is more efficient and reliable than methods based on a few gene markers for phylogenic analyses, especially for the holomycotrophic species or those whose DNA sequences have been difficult to amplify. Meanwhile, a total of 315 single-copy orthologous genes of orchids are offered, and more informative loci could be used in future orchid phylogenetic studies. PMID:26380706
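
    The concatenation step named in the title can be pictured with a minimal sketch. It assumes each gene has already been aligned separately and is held as a mapping from species name to aligned sequence; the function name, the toy data and the gap-padding of missing species are illustrative assumptions, not the authors' pipeline.

```python
def concatenate_alignments(gene_alignments, gap="-"):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: list of dicts mapping species name -> aligned sequence.
    All sequences within one gene must have the same length. A species
    missing from a gene is padded with gap characters (an assumption of
    this sketch).
    """
    species = sorted({sp for aln in gene_alignments for sp in aln})
    parts = {sp: [] for sp in species}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene must share one length")
        width = lengths.pop()
        for sp in species:
            parts[sp].append(aln.get(sp, gap * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Hypothetical toy input: two genes, two species, one missing entry.
toy_genes = [
    {"species_a": "AC", "species_b": "AG"},
    {"species_a": "TTT"},
]
supermatrix = concatenate_alignments(toy_genes)
```

    In this toy case species_b receives three gap characters for the second gene, so both rows of the supermatrix end up five columns wide.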

  16. Mining from transcriptomes: 315 single-copy orthologous genes concatenated for the phylogenetic analyses of Orchidaceae

    PubMed Central

    Deng, Hua; Zhang, Guo-Qiang; Lin, Min; Wang, Yan; Liu, Zhong-Jian

    2015-01-01

    Phylogenetic relationships are hotspots for orchid studies with controversial standpoints. Traditionally, the phylogenies of orchids are based on morphology and subjective factors. Although more reliable than classic phylogenic analyses, the current methods are based on a few gene markers and PCR amplification, which are labor intensive and cannot identify the placement of some species with degenerated plastid genomes. Therefore, a more efficient, labor-saving and reliable method is needed for phylogenic analysis. Here, we present a method of orchid phylogeny construction using transcriptomes. Ten representative species covering five subfamilies of Orchidaceae were selected, and 315 single-copy orthologous genes extracted from the transcriptomes of these organisms were applied to reconstruct a more robust phylogeny of orchids. This approach provided a rapid and reliable method of phylogeny construction for Orchidaceae, one of the most diversified families of angiosperms. We also showed the rigorous systematic position of holomycotrophic species, which has previously been difficult to determine because of the degenerated plastid genome. We concluded that the method presented in this study is more efficient and reliable than methods based on a few gene markers for phylogenic analyses, especially for the holomycotrophic species or those whose DNA sequences have been difficult to amplify. Meanwhile, a total of 315 single-copy orthologous genes of orchids are offered, and more informative loci could be used in future orchid phylogenetic studies. PMID:26380706

  17. Heme Signaling Impacts Global Gene Expression, Immunity and Dengue Virus Infectivity in Aedes aegypti

    PubMed Central

    Bottino-Rojas, Vanessa; Talyuli, Octávio A. C.; Jupatanakul, Natapong; Sim, Shuzhen; Dimopoulos, George; Venancio, Thiago M.; Bahia, Ana C.; Sorgine, Marcos H.; Oliveira, Pedro L.; Paiva-Silva, Gabriela O.

    2015-01-01

    Blood-feeding mosquitoes are exposed to high levels of heme, the product of hemoglobin degradation. Heme is a pro-oxidant that influences a variety of cellular processes. We performed a global analysis of heme-regulated Aedes aegypti (yellow fever mosquito) transcriptional changes to better understand heme's influence on mosquito physiology at the molecular level. We observed an iron- and reactive oxygen species (ROS)-independent signaling induced by heme that comprised genes related to redox metabolism. By modulating the abundance of these transcripts, heme possibly acts as a danger signaling molecule. Furthermore, heme triggered critical changes in the expression of energy metabolism and immune response genes, altering the susceptibility towards bacteria and dengue virus. These findings seem to have implications for the adaptation of mosquitoes to hematophagy and consequently for their ability to transmit diseases. Altogether, these results may also contribute to the understanding of heme cell biology in eukaryotic cells. PMID:26275150

  18. LeuO is a global regulator of gene expression in Salmonella enterica serovar Typhimurium.

    PubMed

    Dillon, Shane C; Espinosa, Elena; Hokamp, Karsten; Ussery, David W; Casadesús, Josep; Dorman, Charles J

    2012-09-01

    We report the first investigation of the binding of the Salmonella enterica LeuO LysR-type transcription regulator to its genomic targets in vivo. Chromatin-immunoprecipitation-on-chip identified 178 LeuO binding sites on the chromosome of S. enterica serovar Typhimurium strain SL1344. These sites were distributed across both the core and the horizontally acquired genome, and included housekeeping genes and genes known to contribute to virulence. Sixty-eight LeuO targets were co-bound by the global repressor protein, H-NS. Thus, while LeuO may function as an H-NS antagonist, these functions are unlikely to involve displacement of H-NS. RNA polymerase bound 173 of the 178 LeuO targets, consistent with LeuO being a transcription regulator. Thus, LeuO targets two classes of genes, those that are bound by H-NS and those that are not bound by H-NS. LeuO binding site analysis revealed a logo conforming to the T-N11-A motif common to LysR-type transcription factors. It differed in some details from a motif that we composed for Escherichia coli LeuO binding sites; 1263 and 1094 LeuO binding site locations were predicted in the S. Typhimurium SL1344 and E. coli MG1655 genomes, respectively. Despite differences in motif composition, many LeuO target genes were common to both species. Thus, LeuO is likely to be a more important global regulator than previously suspected. PMID:22804842
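
    The T-N11-A pattern mentioned above (a T, any eleven bases, then an A) can be illustrated with a small scan. This is only a sketch of the bare consensus: the study derived a sequence logo with position-specific preferences, which a plain regular expression does not capture. The function name and the toy sequence are hypothetical.

```python
import re

# A lookahead is used so that overlapping occurrences are all reported.
T_N11_A = re.compile(r"(?=(T[ACGT]{11}A))")


def find_t_n11_a(sequence):
    """Return (0-based start, matched 13-mer) for each T-N11-A occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in T_N11_A.finditer(seq)]


# Hypothetical toy sequence, not taken from any genome.
toy_sequence = "GGTACGTACGTACGAAGG"
hits = find_t_n11_a(toy_sequence)
print(hits)  # [(2, 'TACGTACGTACGA')]
```

    Because the consensus is so loose, a scan like this matches very often in real genomes; predictions of the kind reported in the abstract would need a scored motif model rather than this pattern alone.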

  19. Global Gene Expression Profiling through the Complete Life Cycle of Trypanosoma vivax

    PubMed Central

    Jackson, Andrew P.; Goyard, Sophie; Xia, Dong; Foth, Bernardo J.; Sanders, Mandy; Wastling, Jonathan M.; Minoprio, Paola; Berriman, Matthew

    2015-01-01

    The parasitic flagellate Trypanosoma vivax is a cause of animal trypanosomiasis across Africa and South America. The parasite has a digenetic life cycle, passing between mammalian hosts and insect vectors, and a series of developmental forms adapted to each life cycle stage. Each point in the life cycle presents radically different challenges to parasite metabolism and physiology and distinct host interactions requiring remodeling of the parasite cell surface. Transcriptomic and proteomic studies of the related parasites T. brucei and T. congolense have shown how gene expression is regulated during their development. New methods for in vitro culture of the T. vivax insect stages have allowed us to describe global gene expression throughout the complete T. vivax life cycle for the first time. We combined transcriptomic and proteomic analysis of each life stage using RNA-seq and mass spectrometry respectively, to identify genes with patterns of preferential transcription or expression. While T. vivax conforms to a pattern of highly conserved gene expression found in other African trypanosomes, (e.g. developmental regulation of energy metabolism, restricted expression of a dominant variant antigen, and expression of ‘Fam50’ proteins in the insect mouthparts), we identified significant differences in gene expression affecting metabolism in the fly and a suite of T. vivax-specific genes with predicted cell-surface expression that are preferentially expressed in the mammal (‘Fam29, 30, 42’) or the vector (‘Fam34, 35, 43’). T. vivax differs significantly from other African trypanosomes in the developmentally-regulated proteins likely to be expressed on its cell surface and thus, in the structure of the host-parasite interface. These unique features may yet explain the species differences in life cycle and could, in the form of bloodstream-stage proteins that do not undergo antigenic variation, provide targets for therapy. PMID:26266535

  20. Role of Global and Local Topology in the Regulation of Gene Expression in Streptococcus pneumoniae

    PubMed Central

    Ferrándiz, María-José; Arnanz, Cristina; Martín-Galiano, Antonio J.; Rodríguez-Martín, Carlos; de la Campa, Adela G.

    2014-01-01

    The most basic level of transcription regulation in Streptococcus pneumoniae is the organization of its chromosome in topological domains. In response to drugs that caused DNA-relaxation, a global transcriptional response was observed. Several chromosomal domains were identified based on the transcriptional response of their genes: up-regulated (U), down-regulated (D), non-regulated (N), and flanking (F). We show that these distinct domains have different expression and conservation characteristics. Microarray fluorescence units under non-relaxation conditions were used as a measure of gene transcriptional level. Fluorescence units were significantly lower in F genes than in the other domains with a similar AT content. Ranked by transcriptional level, the domains followed the order D>U>F. In addition, a comparison of 12 S. pneumoniae genome sequences showed a conservation of gene composition within U and D domains, and an extensive gene interchange in F domains. We tested the organization of chromosomal domains by measuring the relaxation-mediated transcription of eight insertions of a heterologous Ptccat cassette, two in each type of domain, showing that transcription depended on their chromosomal location. Moreover, transcription from the four promoters directing the five genes involved in supercoiling homeostasis, located either in U (gyrB), D (topA), or N (gyrA and parEC) domains was analyzed both in their chromosomal locations and in a replicating plasmid. Although expression from the chromosomal PgyrB and PtopA showed the expected domain regulation, their expression was down-regulated in the plasmid, which behaved as a D domain. However, both PparE and PgyrA carried their own regulatory signals, their topology-dependent expression being equivalent in the plasmid or in the chromosome. In PgyrA a DNA bend acted as a DNA supercoiling sensor. These results revealed that DNA topology functions as a general transcriptional regulator, superimposed upon other more

  1. Global Analysis of Serine-Threonine Protein Kinase Genes in Neurospora crassa

    PubMed Central

    Park, Gyungsoon; Servin, Jacqueline A.; Turner, Gloria E.; Altamirano, Lorena; Colot, Hildur V.; Collopy, Patrick; Litvinkova, Liubov; Li, Liande; Jones, Carol A.; Diala, Fitz-Gerald; Dunlap, Jay C.; Borkovich, Katherine A.

    2011-01-01

    Serine/threonine (S/T) protein kinases are crucial components of diverse signaling pathways in eukaryotes, including the model filamentous fungus Neurospora crassa. In order to assess the importance of S/T kinases to Neurospora biology, we embarked on a global analysis of 86 S/T kinase genes in Neurospora. We were able to isolate viable mutants for 77 of the 86 kinase genes. Of these, 57% exhibited at least one growth or developmental phenotype, with a relatively large fraction (40%) possessing a defect in more than one trait. S/T kinase knockouts were subjected to chemical screening using a panel of eight chemical treatments, with 25 mutants exhibiting sensitivity or resistance to at least one chemical. This brought the total percentage of S/T mutants with phenotypes in our study to 71%. Mutants lacking apg-1, an S/T kinase required for autophagy in other organisms, possessed the greatest number of phenotypes, with defects in asexual and sexual growth and development and in altered sensitivity to five chemical treatments. We showed that NCU02245/stk-19 is required for chemotropic interactions between female and male cells during mating. Finally, we demonstrated allelism between the S/T kinase gene NCU00406 and velvet (vel), encoding a p21-activated protein kinase (PAK) gene important for asexual and sexual growth and development in Neurospora. PMID:21965514

  2. Global analysis of serine-threonine protein kinase genes in Neurospora crassa.

    PubMed

    Park, Gyungsoon; Servin, Jacqueline A; Turner, Gloria E; Altamirano, Lorena; Colot, Hildur V; Collopy, Patrick; Litvinkova, Liubov; Li, Liande; Jones, Carol A; Diala, Fitz-Gerald; Dunlap, Jay C; Borkovich, Katherine A

    2011-11-01

    Serine/threonine (S/T) protein kinases are crucial components of diverse signaling pathways in eukaryotes, including the model filamentous fungus Neurospora crassa. In order to assess the importance of S/T kinases to Neurospora biology, we embarked on a global analysis of 86 S/T kinase genes in Neurospora. We were able to isolate viable mutants for 77 of the 86 kinase genes. Of these, 57% exhibited at least one growth or developmental phenotype, with a relatively large fraction (40%) possessing a defect in more than one trait. S/T kinase knockouts were subjected to chemical screening using a panel of eight chemical treatments, with 25 mutants exhibiting sensitivity or resistance to at least one chemical. This brought the total percentage of S/T mutants with phenotypes in our study to 71%. Mutants lacking apg-1, an S/T kinase required for autophagy in other organisms, possessed the greatest number of phenotypes, with defects in asexual and sexual growth and development and in altered sensitivity to five chemical treatments. We showed that NCU02245/stk-19 is required for chemotropic interactions between female and male cells during mating. Finally, we demonstrated allelism between the S/T kinase gene NCU00406 and velvet (vel), encoding a p21-activated protein kinase (PAK) gene important for asexual and sexual growth and development in Neurospora. PMID:21965514

  3. Gene expression during the first 28 days of axolotl limb regeneration I: Experimental design and global analysis of gene expression

    PubMed Central

    Palumbo, Alex; Nagarajan, Radha; Gardiner, David M.; Muneoka, Ken; Stromberg, Arnold J.; Athippozhy, Antony T.

    2015-01-01

    While it is appreciated that global gene expression analyses can provide novel insights about complex biological processes, experiments are generally insufficiently powered to achieve this goal. Here we report the results of a robust microarray experiment of axolotl forelimb regeneration. At each of 20 post‐amputation time points, we estimated gene expression for 10 replicate RNA samples that were isolated from 1 mm of heterogeneous tissue collected from the distal limb tip. We show that the limb transcription program diverges progressively with time from the non‐injured state, and divergence among time adjacent samples is mostly gradual. However, punctuated episodes of transcription were identified for five intervals of time, with four of these coinciding with well‐described stages of limb regeneration—amputation, early bud, late bud, and pallet. The results suggest that regeneration is highly temporally structured and regulated by mechanisms that function within narrow windows of time to coordinate transcription within and across cell types of the regenerating limb. Our results provide an integrative framework for hypothesis generation using this complex and highly informative data set. PMID:27168937

  4. Facts from text: can text mining help to scale-up high-quality manual curation of gene products with ontologies?

    PubMed

    Winnenburg, Rainer; Wächter, Thomas; Plake, Conrad; Doms, Andreas; Schroeder, Michael

    2008-11-01

    The biomedical literature can be seen as a large integrated, but unstructured data repository. Extracting facts from literature and making them accessible is approached from two directions: manual curation efforts develop ontologies and vocabularies to annotate gene products based on statements in papers. Text mining aims to automatically identify entities and their relationships in text using information retrieval and natural language processing techniques. Manual curation is highly accurate but time consuming, and does not scale with the ever increasing growth of literature. Text mining as a high-throughput computational technique scales well, but is error-prone due to the complexity of natural language. How can both be married to combine scalability and accuracy? Here, we review the state-of-the-art text mining approaches that are relevant to annotation and discuss available online services analysing biomedical literature by means of text mining techniques, which could also be utilised by annotation projects. We then examine how far text mining has already been utilised in existing annotation projects and conclude how these techniques could be tightly integrated into the manual annotation process through novel authoring systems to scale-up high-quality manual curation. PMID:19060303

  5. Mining Predicted Essential Genes of Brugia malayi for Nematode Drug Targets

    PubMed Central

    Kumar, Sanjay; Chaudhary, Kshitiz; Foster, Jeremy M.; Novelli, Jacopo F.; Zhang, Yinhua; Wang, Shiliang; Spiro, David; Ghedin, Elodie; Carlow, Clotilde K. S.

    2007-01-01

    We report results from the first genome-wide application of a rational drug target selection methodology to a metazoan pathogen genome, the completed draft sequence of Brugia malayi, a parasitic nematode responsible for human lymphatic filariasis. More than 1.5 billion people worldwide are at risk of contracting lymphatic filariasis and onchocerciasis, a related filarial disease. Drug treatments for filariasis have not changed significantly in over 20 years, and with the risk of resistance rising, there is an urgent need for the development of new anti-filarial drug therapies. The recent publication of the draft genomic sequence for B. malayi enables a genome-wide search for new drug targets. However, there is no functional genomics data in B. malayi to guide the selection of potential drug targets. To circumvent this problem, we have utilized the free-living model nematode Caenorhabditis elegans as a surrogate for B. malayi. Sequence comparisons between the two genomes allow us to map C. elegans orthologs to B. malayi genes. Using these orthology mappings and by incorporating the extensive genomic and functional genomic data, including genome-wide RNAi screens, that already exist for C. elegans, we identify potentially essential genes in B. malayi. Further incorporation of human host genome sequence data and a custom algorithm for prioritization enables us to collect and rank nearly 600 drug target candidates. Previously identified potential drug targets cluster near the top of our prioritized list, lending credibility to our methodology. Over-represented Gene Ontology terms, predicted InterPro domains, and RNAi phenotypes of C. elegans orthologs associated with the potential target pool are identified. By virtue of the selection procedure, the potential B. malayi drug targets highlight components of key processes in nematode biology such as central metabolism, molting and regulation of gene expression. PMID:18000556

  6. Regulation of Global Gene Expression in Human Loa loa Infection Is a Function of Chronicity

    PubMed Central

    Steel, Cathy; Varma, Sudhir; Nutman, Thomas B.

    2012-01-01

    Background Human filarial infection is characterized by downregulated parasite-antigen specific T cell responses but distinct differences exist between patients with longstanding infection (endemics) and those who acquired infection through temporary residency or visits to filarial-endemic regions (expatriates). Methods and Findings To characterize mechanisms underlying differences in T cells, analysis of global gene expression using human spotted microarrays was conducted on CD4+ and CD8+ T cells from microfilaremic Loa loa-infected endemic and expatriate patients. Assessment of unstimulated cells showed overexpression of genes linked to inflammation and caspase-associated cell death, particularly in endemics, and enrichment of the Th1/Th2 canonical pathway in endemic CD4+ cells. However, pathways within CD8+ unstimulated cells were most significantly enriched in both patient groups. Antigen (Ag)-driven gene expression was assessed to microfilarial Ag (MfAg) and to the nonparasite Ag streptolysin O (SLO). For MfAg-driven cells, the number of genes differing significantly from unstimulated cells was greater in endemics compared to expatriates (p<0.0001). Functional analysis showed a differential increase in genes associated with NFkB (both groups) and caspase activation (endemics). While the expatriate response to MfAg was primarily a CD4+ pro-inflammatory one, the endemic response included CD4+ and CD8+ cells and was linked to insulin signaling, histone complexes, and ubiquitination. Unlike the enrichment of canonical pathways in CD8+ unstimulated cells, both groups showed pathway enrichment in CD4+ cells to MfAg. Contrasting with the divergent responses to MfAg seen between endemics and expatriates, the CD4+ response to SLO was similar; however, CD8+ cells differed strongly in the nature and numbers (156 [endemics] vs 36 [expatriates]) of genes with differential expression. Conclusions These data suggest several important pathways are responsible for the

  7. Global Occurrence of Archaeal amoA Genes in Terrestrial Hot Springs

    PubMed Central

    Zhang, Chuanlun L.; Ye, Qi; Huang, Zhiyong; Li, WenJun; Chen, Jinquan; Song, Zhaoqi; Zhao, Weidong; Bagwell, Christopher; Inskeep, William P.; Ross, Christian; Gao, Lei; Wiegel, Juergen; Romanek, Christopher S.; Shock, Everett L.; Hedlund, Brian P.

    2008-01-01

    transcribed in situ in one spring and the transcripts were closely related to the amoA genes amplified from the same spring. Our study demonstrates the global occurrence of putative archaeal amoA genes in a wide variety of terrestrial hot springs and suggests that geography may play an important role in selecting different assemblages of AOA. PMID:18676703

  8. Global gene expression profiles of ischemic preconditioning in deceased donor liver transplantation.

    PubMed

    Raza, Ali; Dikdan, George; Desai, Kunj K; Shareef, Asif; Fernandes, Helen; Aris, Virginie; de la Torre, Andrew N; Wilson, Dorian; Fisher, Adrian; Soteropoulos, Patricia; Koneru, Baburao

    2010-05-01

    The benefits of ischemic preconditioning (IPC) in reducing ischemia/reperfusion injury (IRI) remain indistinct in human liver transplantation (LT). To further understand mechanistic aspects of IPC, we performed microarray analyses as a nested substudy in a randomized trial of 10-minute IPC in 101 deceased donor LTs. Liver biopsies were performed after cold storage and at 90 minutes postreperfusion in 40 of 101 subjects. Global gene expression profiles in 6 biopsy pairs in IPC and work standard organ recovery groups at both time points were compared using the Affymetrix GeneChip Human Gene 1.0 ST array. Transcripts with >1.5-fold change and P < 0.05 were considered significant. IPC altered expression of 82 transcripts in antioxidant, immunological, lipid biosynthesis, cell development and growth, and other groups. Real-time polymerase chain reaction and immunoblotting validated our microarray data. IPC-induced overexpression of glutathione S-transferase mu transcripts (GSTM1, GSTM3, GSTM4, and GSTM5) was accompanied by increased protein expression and may contribute to a decrease in oxidative stress. However, the increased expression of fatty acid synthase may increase oxidative stress, and tumor necrosis factor ligand superfamily member 10 may promote apoptosis. These changes, in combination with decreased expression of heparin-binding epidermal growth factor-like growth factor and insulin-like growth factor binding protein-1, both of which inhibit apoptosis, may increase IRI. In our study of deceased donor LT, IPC induces changes in gene expression, some of which are potentially beneficial but some which are potentially injurious. Thus, our findings of changes in gene expression mirror the outcomes in our clinical trial. PMID:20440768
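
    The significance rule quoted above (>1.5-fold change and P < 0.05) can be written as a simple filter. The sketch below assumes that fold change is expressed as a ratio and that a change in either direction counts, which the abstract does not state; the function name and the example values are hypothetical, not data from the study.

```python
def is_significant(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Apply a fold-change and P-value threshold to one transcript.

    fold_change is a ratio (e.g. IPC / control) and must be positive.
    Assumption of this sketch: a ratio below 1 is treated as a decrease
    of magnitude 1 / ratio, so both directions are tested symmetrically.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude > fc_cutoff and p_value < p_cutoff


# Hypothetical example values: (transcript label, ratio, P value).
examples = [
    ("tx_up", 2.0, 0.01),
    ("tx_down", 0.5, 0.01),
    ("tx_small", 1.2, 0.001),
    ("tx_noisy", 3.0, 0.20),
]
selected = [name for name, fc, p in examples if is_significant(fc, p)]
```

    Both cutoffs are strict inequalities here, matching the wording of the abstract, so a transcript at exactly 1.5-fold is not selected.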

  9. Global Analysis of Gene Expression Profiles in Developing Physic Nut (Jatropha curcas L.) Seeds

    PubMed Central

    Jiang, Huawu; Wu, Pingzhi; Zhang, Sheng; Song, Chi; Chen, Yaping; Li, Meiru; Jia, Yongxia; Fang, Xiaohua; Chen, Fan; Wu, Guojiang

    2012-01-01

    Background Physic nut (Jatropha curcas L.) is an oilseed plant species with high potential utility as a biofuel. Furthermore, following recent sequencing of its genome and the availability of expressed sequence tag (EST) libraries, it is a valuable model plant for studying carbon assimilation in endosperms of oilseed plants. There have been several transcriptomic analyses of developing physic nut seeds using ESTs, but they have provided limited information on the accumulation of stored resources in the seeds. Methodology/Principal Findings We applied next-generation Illumina sequencing technology to analyze global gene expression profiles of developing physic nut seeds 14, 19, 25, 29, 35, 41, and 45 days after pollination (DAP). The acquired profiles reveal the key genes, and their expression timeframes, involved in major metabolic processes including: carbon flow, starch metabolism, and synthesis of storage lipids and proteins in the developing seeds. The main period of storage reserves synthesis in the seeds appears to be 29–41 DAP, and the fatty acid composition of the developing seeds is consistent with relative expression levels of different isoforms of acyl-ACP thioesterase and fatty acid desaturase genes. Several transcription factor genes whose expression coincides with storage reserve deposition correspond to those known to regulate the process in Arabidopsis. Conclusions/Significance The results will facilitate searches for genes that influence de novo lipid synthesis, accumulation and their regulatory networks in developing physic nut seeds, and other oil seeds. Thus, they will be helpful in attempts to modify these plants for efficient biofuel production. PMID:22574177

  10. Triterpenoid Saponin Biosynthetic Pathway Profiling and Candidate Gene Mining of the Ilex asprella Root Using RNA-Seq

    PubMed Central

    Zheng, Xiasheng; Xu, Hui; Ma, Xinye; Zhan, Ruoting; Chen, Weiwen

    2014-01-01

    Ilex asprella, which contains abundant α-amyrin type triterpenoid saponins, is an anti-influenza herbal drug widely used in south China. In this work, we first analysed the transcriptome of the I. asprella root using RNA-Seq, which provided a dataset for functional gene mining. mRNA was isolated from the total RNA of the I. asprella root and reverse-transcribed into cDNA. Then, the cDNA library was sequenced using an Illumina HiSeq™ 2000, which generated 55,028,452 clean reads. De novo assembly of these reads generated 51,865 unigenes, in which 39,269 unigenes were annotated (75.71% yield). According to the structures of the triterpenoid saponins of I. asprella, a putative biosynthetic pathway downstream of 2,3-oxidosqualene was proposed and candidate unigenes in the transcriptome data that were potentially involved in the pathway were screened using homology-based BLAST and phylogenetic analysis. Further amplification and functional analysis of these putative unigenes will provide insight into the biosynthesis of Ilex triterpenoid saponins. PMID:24722569
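
    The annotation yield quoted above follows directly from the two unigene counts. A short check, using only the numbers given in the abstract (the variable names are illustrative):

```python
total_unigenes = 51_865
annotated_unigenes = 39_269

# Share of assembled unigenes that received an annotation, in percent.
annotation_rate = 100 * annotated_unigenes / total_unigenes
unannotated_unigenes = total_unigenes - annotated_unigenes

print(f"{annotation_rate:.2f}%")  # 75.71%
print(unannotated_unigenes)       # 12596
```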