Science.gov

Sample records for addition bioinformatics analysis

  1. Bioinformatic analysis of proteomics data

    PubMed Central

    2014-01-01

    Most biochemical reactions in a cell are regulated by highly specialized proteins, which are the prime mediators of the cellular phenotype. Therefore, the identification, quantitation and characterization of all proteins in a cell are of utmost importance to understand the molecular processes that mediate cellular physiology. With the advent of robust and reliable mass spectrometers that are able to analyze complex protein mixtures within a reasonable timeframe, the systematic analysis of all proteins in a cell becomes feasible. Besides the ongoing improvements of analytical hardware, standardized methods to analyze and study all proteins have to be developed that allow the generation of testable new hypotheses based on the enormous amount of pre-existing biological information. Here we discuss current strategies on how to gather, filter and analyze proteomic data sets using available software packages. PMID:25033288

  2. Bioinformatics strategies for the analysis of lipids.

    PubMed

    Wheelock, Craig E; Goto, Susumu; Yetukuri, Laxman; D'Alexandri, Fabio Luiz; Klukas, Christian; Schreiber, Falk; Oresic, Matej

    2009-01-01

    Owing to their importance in cellular physiology and pathology as well as to recent technological advances, the study of lipids has reemerged as a major research target. However, the structural diversity of lipids presents a number of analytical and informatics challenges. The field of lipidomics is a new postgenome discipline that aims to develop comprehensive methods for lipid analysis, necessitating concomitant developments in bioinformatics. The evolving research paradigm requires that new bioinformatics approaches accommodate genomic as well as high-level perspectives, integrating genome, protein, chemical and network information. The incorporation of lipidomics information into these data structures will provide mechanistic understanding of lipid functions and interactions in the context of cellular and organismal physiology. Accordingly, it is vital that specific bioinformatics methods be developed to analyze the wealth of lipid data being acquired. Herein, we present an overview of the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and application of its tools to the analysis of lipid data. We also describe a series of software tools and databases (KGML-ED, VANTED, MZmine, and LipidDB) that can be used for the processing of lipidomics data and biochemical pathway reconstruction, an important next step in the development of the lipidomics field.

  3. Bioinformatics for analysis of poxvirus genomes.

    PubMed

    Da Silva, Melissa; Upton, Chris

    2012-01-01

    In recent years, there have been numerous unprecedented technological advances in the field of molecular biology; these include DNA sequencing, mass spectrometry of proteins, and microarray analysis of mRNA transcripts. Perhaps, however, it is the area of genomics, which has now generated the complete genome sequences of more than 100 poxviruses, that has had the greatest impact on the average virology researcher because the DNA sequence data is in constant use in many different ways by almost all molecular virologists. As this data resource grows, so does the importance of the availability of databases and software tools to enable the bench virologist to work with and make use of this (valuable/expensive) DNA sequence information. Thus, providing researchers with intuitive software to first select and reformat genomics data from large databases, second, to compare/analyze genomics data, and third, to view and interpret large and complex sets of results has become pivotal in enabling progress to be made in modern virology. This chapter is directed at the bench virologist and describes the software required for a number of common bioinformatics techniques that are useful for comparing and analyzing poxvirus genomes. In a number of examples, we also highlight the Viral Orthologous Clusters database system and integrated tools that we developed for the management and analysis of complete viral genomes.

  4. Bioinformatics Pipeline for Transcriptome Sequencing Analysis.

    PubMed

    Djebali, Sarah; Wucher, Valentin; Foissac, Sylvain; Hitte, Christophe; Corre, Evan; Derrien, Thomas

    2017-01-01

    The development of High Throughput Sequencing (HTS) for RNA profiling (RNA-seq) has shed light on the diversity of transcriptomes. While RNA-seq is becoming a de facto standard for monitoring the population of expressed transcripts in a given condition at a specific time, processing the huge amount of data it generates requires dedicated bioinformatics programs. Here, we describe a standard bioinformatics protocol using state-of-the-art tools, the STAR mapper to align reads onto a reference genome, Cufflinks to reconstruct the transcriptome, and RSEM to quantify expression levels of genes and transcripts. We present the workflow using human transcriptome sequencing data from two biological replicates of the K562 cell line produced as part of the ENCODE3 project. PMID:27662878

  5. Bioinformatics analysis of circulating cell-free DNA sequencing data.

    PubMed

    Chan, Landon L; Jiang, Peiyong

    2015-10-01

    The discovery of cell-free DNA molecules in plasma has opened up numerous opportunities in noninvasive diagnosis. Cell-free DNA molecules have become increasingly recognized as promising biomarkers for detection and management of many diseases. The advent of next generation sequencing has provided unprecedented opportunities to scrutinize the characteristics of cell-free DNA molecules in plasma in a genome-wide fashion and at single-base resolution. Consequently, clinical applications of circulating cell-free DNA analysis have not only revolutionized noninvasive prenatal diagnosis but also facilitated cancer detection and monitoring toward an era of blood-based personalized medicine. With the remarkably increasing throughput and lowering cost of next generation sequencing, bioinformatics analysis becomes increasingly demanding to understand the large amount of data generated by these sequencing platforms. In this Review, we highlight the major bioinformatics algorithms involved in the analysis of cell-free DNA sequencing data. Firstly, we briefly describe the biological properties of these molecules and provide an overview of the general bioinformatics approach for the analysis of cell-free DNA. Then, we discuss the specific upstream bioinformatics considerations concerning the analysis of sequencing data of circulating cell-free DNA, followed by further detailed elaboration on each key clinical situation in noninvasive prenatal diagnosis and cancer management where downstream bioinformatics analysis is heavily involved. We also discuss bioinformatics analysis as well as clinical applications of the newly developed massively parallel bisulfite sequencing of cell-free DNA. Finally, we offer our perspectives on the future development of bioinformatics in noninvasive diagnosis.

  6. Bioinformatics Analysis of Estrogen-Responsive Genes.

    PubMed

    Handel, Adam E

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125
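    The distinction drawn above between direct and indirect target genes can be sketched as a set comparison. This is a minimal illustration, assuming one list of estrogen-responsive genes from a transcriptomic experiment and one list of genes with a nearby receptor binding site from ChIP-seq; the function name and the placeholder gene identifiers are hypothetical and are not taken from the chapter.

```python
def classify_targets(responsive_genes, bound_genes):
    """Split estrogen-responsive genes into direct and indirect targets.

    A responsive gene counts as 'direct' here if it also appears in the
    ChIP-seq-derived list of receptor-bound genes (an assumed, simplified rule).
    """
    responsive = set(responsive_genes)
    bound = set(bound_genes)
    direct = sorted(responsive & bound)    # responsive and receptor-bound
    indirect = sorted(responsive - bound)  # responsive, no binding detected
    return direct, indirect


# Hypothetical placeholder identifiers, not real results.
responsive = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
bound = ["GENE_A", "GENE_B", "GENE_E"]

direct, indirect = classify_targets(responsive, bound)
print("direct:", direct)
print("indirect:", indirect)
```

    In practice, assigning a binding site to a gene depends on a distance cutoff or a similar rule, which this sketch leaves outside the function.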

  7. Design and bioinformatics analysis of genome-wide CLIP experiments

    PubMed Central

    Wang, Tao; Xiao, Guanghua; Chu, Yongjun; Zhang, Michael Q.; Corey, David R.; Xie, Yang

    2015-01-01

    The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP–RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses. PMID:25958398

  8. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massively scalable computing and storage, data sharing, and on-demand, anytime and anywhere access to resources and applications, and thus may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data. PMID:25863787

  10. Biochip microsystem for bioinformatics recognition and analysis

    NASA Technical Reports Server (NTRS)

    Lue, Jaw-Chyng (Inventor); Fang, Wai-Chi (Inventor)

    2011-01-01

    A system with applications in pattern recognition, or classification, of DNA assay samples. Because DNA reference and sample material in wells of an assay may be caused to fluoresce depending upon dye added to the material, the resulting light may be imaged onto an embodiment comprising an array of photodetectors and an adaptive neural network, with applications to DNA analysis. Other embodiments are described and claimed.

  11. The MPI Bioinformatics Toolkit for protein sequence analysis

    PubMed Central

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N.

    2006-01-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at http://toolkit.tuebingen.mpg.de. PMID:16845021

  12. The MPI Bioinformatics Toolkit for protein sequence analysis.

    PubMed

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N

    2006-07-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at http://toolkit.tuebingen.mpg.de.

  13. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN

    PubMed Central

    2010-01-01

    Background: Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. Results: VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Conclusions: Bioinformatics curation and ontological

  14. Bioinformatic Analysis of HIV-1 Entry and Pathogenesis

    PubMed Central

    Aiamkitsumrit, Benjamas; Dampier, Will; Antell, Gregory; Rivera, Nina; Martin-Garcia, Julio; Pirrone, Vanessa; Nonnemacher, Michael R.; Wigdahl, Brian

    2015-01-01

    The evolution of human immunodeficiency virus type 1 (HIV-1) with respect to co-receptor utilization has been shown to be relevant to HIV-1 pathogenesis and disease. The CCR5-utilizing (R5) virus has been shown to be important in the very early stages of transmission and highly prevalent during asymptomatic infection and chronic disease. In addition, the R5 virus has been proposed to be involved in neuroinvasion and central nervous system (CNS) disease. In contrast, the CXCR4-utilizing (X4) virus is more prevalent during the course of disease progression and concurrent with the loss of CD4+ T cells. The dual-tropic virus is able to utilize both co-receptors (CXCR4 and CCR5) and has been thought to represent an intermediate transitional virus that possesses properties of both X4 and R5 viruses that can be encountered at many stages of disease. The use of computational tools and bioinformatic approaches in the prediction of HIV-1 co-receptor usage has been growing in importance with respect to understanding HIV-1 pathogenesis and disease, developing diagnostic tools, and improving the efficacy of therapeutic strategies focused on blocking viral entry. Current strategies have enhanced the sensitivity, specificity, and reproducibility relative to the prediction of co-receptor use; however, these technologies need to be improved with respect to their efficient and accurate use across the HIV-1 subtypes. The most effective approach may center on the combined use of different algorithms involving sequences within and outside of the env-V3 loop. This review focuses on the HIV-1 entry process and on co-receptor utilization, including bioinformatic tools utilized in the prediction of co-receptor usage. It also provides novel preliminary analyses for enabling identification of linkages between amino acids in V3 with other components of the HIV-1 genome and demonstrates that these linkages are different between X4 and R5 viruses. PMID:24862329

  15. Bioinformatics analysis of the epitope regions for norovirus capsid protein

    PubMed Central

    2013-01-01

    Background: Norovirus is the major cause of nonbacterial epidemic gastroenteritis, being highly prevalent in both developing and developed countries. Despite the available monoclonal antibodies (MAbs) for different sub-genogroups, a comprehensive epitope analysis based on various bioinformatics technologies is highly desired for future potential antibody development in clinical diagnosis and treatment. Methods: A total of 18 full-length human norovirus capsid protein sequences were downloaded from GenBank. Protein modeling was performed with the program Modeller 9.9. The modeled 3D structures of the norovirus capsid protein were submitted to the protein antigen spatial epitope prediction webserver (SEPPA) to predict the possible spatial epitopes with the default threshold. The results were processed using the Biosoftware. Results: Compared with GI, we found that the GII genogroup had four deletions and two special insertions in the VP1 region. The predicted conformational epitope regions were mainly concentrated in the N-terminal (1~96), Middle Part (298~305, 355~375) and C-terminal (560~570) regions. We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. Conclusions: The predicted conformational epitope regions of norovirus VP1 were mainly concentrated in the N-terminal, Middle Part and C-terminal regions. We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. The overlap with experimental epitopes indicates the important role of the latest computational technologies. With the fast development of computational immunology tools, the bioinformatics pipeline will become more and more critical to vaccine design. PMID:23514273
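    The residue ranges reported in this abstract can be used to check whether another residue range, for example an experimentally mapped epitope, overlaps a predicted region. The following is a minimal sketch: the region boundaries are the ones quoted in the abstract, while the function name and the query ranges are hypothetical and are not data from the study.

```python
# Predicted conformational epitope regions on VP1, as quoted in the abstract
# (inclusive residue numbers).
PREDICTED_REGIONS = {
    "N-terminal": [(1, 96)],
    "Middle Part": [(298, 305), (355, 375)],
    "C-terminal": [(560, 570)],
}


def overlapping_regions(start, end):
    """Return names of predicted regions that overlap residues start..end (inclusive)."""
    if start > end:
        raise ValueError("start must not exceed end")
    hits = []
    for name, spans in PREDICTED_REGIONS.items():
        # Two inclusive intervals overlap when each starts before the other ends.
        if any(start <= span_end and span_start <= end for span_start, span_end in spans):
            hits.append(name)
    return hits


# Hypothetical query ranges, for illustration only.
print(overlapping_regions(300, 360))
print(overlapping_regions(100, 200))
```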

  16. Edge Bioinformatics

    SciTech Connect

    Lo, Chien-Chi

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  18. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula.

    PubMed

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-04-04

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  19. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    PubMed Central

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-01-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula. PMID:27049397

  1. Predictive Bioinformatic Assignment of Methyl-Bearing Stereocenters, Total Synthesis, and an Additional Molecular Target of Ajudazol B.

    PubMed

    Essig, Sebastian; Schmalzbauer, Björn; Bretzke, Sebastian; Scherer, Olga; Koeberle, Andreas; Werz, Oliver; Müller, Rolf; Menche, Dirk

    2016-02-19

    Full details on the evaluation and application of an easily feasible and generally useful method for configurational assignments of isolated methyl-bearing stereocenters are reported. The analytical tool relies on a bioinformatic gene cluster analysis and utilizes a predictive enoylreductase alignment, and its feasibility was demonstrated by the full stereochemical determination of the ajudazols, highly potent inhibitors of the mitochondrial respiratory chain. Furthermore, a full account of our strategies and tactics that culminated in the total synthesis of ajudazol B, the most potent and least abundant of this structurally unique class of myxobacterial natural products, is presented. Key features include an application of an asymmetric ortholithiation strategy for synthesis of the characteristic anti-configured hydroxyisochromanone core bearing three contiguous stereocenters, a modular oxazole formation, a flexible cross-metathesis approach for terminal allyl amide synthesis, and a late-stage Z,Z-selective Suzuki coupling. This total synthesis unambiguously proves the correct stereochemistry, which was further corroborated by comparison with reisolated natural material. Finally, 5-lipoxygenase was discovered as an additional molecular target of ajudazol B. Activities against this clinically validated key enzyme of the biosynthesis of proinflammatory leukotrienes were in the range of the approved drug zileuton, which further underlines the biological importance of this unique natural product. PMID:26796481

  2. Proteomic and bioinformatic analysis of membrane proteome in type 2 diabetic mouse liver.

    PubMed

    Kim, Gun-Hwa; Park, Edmond Changkyun; Yun, Sung-Ho; Hong, Yeonhee; Lee, Dong-Gyu; Shin, Eun-Young; Jung, Jongsun; Kim, Young Hwan; Lee, Kyung-Bok; Jang, Ik-Soon; Lee, Zee-Won; Chung, Young-Ho; Choi, Jong-Soon; Cheong, Chaejoon; Kim, Soohyun; Kim, Seung Il

    2013-04-01

    Type 2 diabetes mellitus (T2DM) is the most prevalent and serious metabolic disease affecting people worldwide. T2DM results from insulin resistance of the liver, muscle, and adipose tissue. In this study, we used proteomic and bioinformatic methodologies to identify novel hepatic membrane proteins that are related to the development of hepatic insulin resistance, steatosis, and T2DM. Using FT-ICR MS, we identified 95 significantly differentially expressed proteins in the membrane fraction of normal and T2DM db/db mouse liver. These proteins are primarily involved in energy metabolism pathways, molecular transport, and cellular signaling, and many of them have not previously been reported in diabetic studies. Bioinformatic analysis revealed that 16 proteins may be related to the regulation of insulin signaling in the liver. In addition, six proteins are associated with energy stress-induced, nine proteins with inflammatory stress-induced, and 14 proteins with endoplasmic reticulum stress-induced hepatic insulin resistance. Moreover, we identified 19 proteins that may regulate hepatic insulin resistance in a c-Jun amino-terminal kinase-dependent manner. In addition, comprehensive bioinformatic analysis uncovered three proteins, 14-3-3 protein beta (YWHAB), Slc2a4 (GLUT4), and Dlg4 (PSD-95), that correlate with several proteins identified by the proteomics approach. The newly identified proteins in T2DM should provide additional insight into the development and pathophysiology of hepatic steatosis and insulin resistance, and they may serve as useful diagnostic markers and/or therapeutic targets for these diseases.

  3. Whale song analyses using bioinformatics sequence analysis approaches

    NASA Astrophysics Data System (ADS)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools (DNA/protein sequence alignment and alignment-free methods) are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, master's thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure, the standardized Euclidean distance, and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree qualitatively with earlier classifications derived by human observation. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission by quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.]
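
    As a rough illustration of the alignment-based metric mentioned in the abstract above, the sketch below computes a Smith-Waterman local alignment score between two themes written as lists of unit labels. The scoring values (match 2, mismatch -1, linear gap -1), the function name and the unit labels are assumptions made for this example; the abstract does not give the authors' parameters, and the "repeated procedure" is not implemented here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the study.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical themes, each a sequence of discrete unit labels.
theme_x = ["u1", "u2", "u3", "u4"]
theme_y = ["u9", "u2", "u3", "u4", "u7"]
print(smith_waterman_score(theme_x, theme_y))
```

    Because the function only compares elements for equality, it works on any sequence of hashable labels, including plain strings of DNA or protein letters.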

  4. [Bioinformatics analysis of the expansin gene family in rice].

    PubMed

    Shi, Yang; Xu, Xiao; Li, Haoyang; Xu, Qian; Xu, Jichen

    2014-08-01

    Expansin refers to a family of nonenzymatic proteins found in the plant cell wall with important roles in plant cell growth, developmental processes, and resistance to stress. Whole rice genome sequencing revealed that it contains 58 expansin genes, which belong to 4 subfamilies (A (34), B (19), LA (4) and LB (1)). All the genes were located on 10 of 12 rice chromosomes where several subfamily members clustered. The expansin genes ranged from 687 bp to 1128 bp in size. Sequence alignment showed that all expansins had three structural domains with two conserved amino acids, cysteine at the N-terminus and tryptophan at the C-terminus. The amino acid identity between members of different subfamilies was less than 35%, while that within the same subfamily was more than 35%. Most genes of the A subfamily had 1 or 2 introns, while genes of the B, LA and LB subfamilies had 3, 4 and 4 introns, respectively. Statistical analysis of codon usage showed that expansins in rice have 26 high-frequency codons, which are more biased than those in other species. These bioinformatics findings will be helpful for the further study of the function and evolution of expansin genes.

  5. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    PubMed

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  6. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis

    PubMed Central

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475

  7. Credibility Analysis of Putative Disease-Causing Genes Using Bioinformatics

    PubMed Central

    Abel, Olubunmi; Powell, John F.; Andersen, Peter M.; Al-Chalabi, Ammar

    2013-01-01

    Background: Genetic studies are challenging in many complex diseases, particularly those with limited diagnostic certainty, low prevalence or of old age. The result is that genes may be reported as disease-causing with varying levels of evidence, and in some cases, the data may be so limited as to be indistinguishable from chance findings. When there are large numbers of such genes, an objective method for ranking the evidence is useful. Using the neurodegenerative and complex disease amyotrophic lateral sclerosis (ALS) as a model, and the disease-specific database ALSoD, the objective is to develop a method using publicly available data to generate a credibility score for putative disease-causing genes. Methods: Genes with at least one publication suggesting involvement in adult onset familial ALS were collated following an exhaustive literature search. SQL was used to generate a score by extracting information from the publications and combined with a pathogenicity analysis using bioinformatics tools. The resulting score allowed us to rank genes in order of credibility. To validate the method, we compared the objective ranking with a rank generated by ALS genetics experts. Spearman's Rho was used to compare rankings generated by the different methods. Results: The automated method ranked ALS genes in the following order: SOD1, TARDBP, FUS, ANG, SPG11, NEFH, OPTN, ALS2, SETX, FIG4, VAPB, DCTN1, TAF15, VCP, DAO. This compared very well to the ranking of ALS genetics experts, with Spearman's Rho of 0.69 (P = 0.009). Conclusion: We have presented an automated method for scoring the level of evidence for a gene being disease-causing. In developing the method we have used the model disease ALS, but it could equally be applied to any disease in which there is genotypic uncertainty. PMID:23755159
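
    The rank comparison described in the abstract above can be sketched with the textbook Spearman formula for two rankings without ties, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference in position of each item. The function name and the five placeholder labels are hypothetical; the expert ranking from the study is not reproduced in this listing, so no attempt is made to recompute the reported 0.69.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two orderings of the same items (no ties assumed)."""
    if sorted(rank_a) != sorted(rank_b):
        raise ValueError("both rankings must contain the same items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items")
    position_b = {item: i for i, item in enumerate(rank_b)}
    # Sum of squared differences between each item's position in the two lists.
    d_squared = sum((i - position_b[item]) ** 2 for i, item in enumerate(rank_a))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Placeholder labels standing in for genes; not data from the study.
automated = ["g1", "g2", "g3", "g4", "g5"]
expert = ["g2", "g1", "g3", "g5", "g4"]
print(spearman_rho(automated, expert))
```

    With tied ranks the simple formula no longer holds, and a library routine that handles ties would be the safer choice.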

  8. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    PubMed

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R; Domozych, David S; Popper, Zoë A; Showalter, Allan M

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
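
    The two sequence features named in the abstract above (Ser followed by three to five Pro residues, and Tyr-X-Tyr) can be searched for with regular expressions over one-letter amino acid codes. This is only a sketch under assumptions: the function name and the toy sequence are invented for illustration, hydroxylation and glycosylation are not visible in a plain sequence, and the study's actual identification and classification criteria are not described in this listing.

```python
import re

# Ser followed by three to five Pro residues (one-letter codes S and P).
SER_PRO_REPEAT = re.compile(r"SP{3,5}")
# Tyr-X-Tyr, with a lookahead so that overlapping motifs are all reported.
TYR_X_TYR = re.compile(r"(?=(Y[A-Z]Y))")


def ext_motifs(protein):
    """Return the SP(3-5) repeats and Tyr-X-Tyr motifs found in a protein."""
    protein = protein.upper()
    return {
        "ser_pro_repeats": SER_PRO_REPEAT.findall(protein),
        "tyr_x_tyr": TYR_X_TYR.findall(protein),
    }


# Toy sequence made up for this example, not a real extensin.
toy = "MSPPPPKYVYSPPPAYYY"
print(ext_motifs(toy))
```

    A threshold on the number of repeats per protein would be needed to turn these counts into a candidate list; any such cutoff would be a further assumption.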

  9. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    PubMed Central

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  11. Review of Current Methods, Applications, and Data Management for the Bioinformatics Analysis of Whole Exome Sequencing

    PubMed Central

    Bao, Riyue; Huang, Lei; Andrade, Jorge; Tan, Wei; Kibbe, Warren A; Jiang, Hongmei; Feng, Gang

    2014-01-01

    The advent of next-generation sequencing technologies has greatly promoted advances in the study of human diseases at the genomic, transcriptomic, and epigenetic levels. Exome sequencing, where the coding region of the genome is captured and sequenced at a deep level, has proven to be a cost-effective method to detect disease-causing variants and discover gene targets. In this review, we outline the general framework of whole exome sequence data analysis. We focus on established bioinformatics tools and applications that support five analytical steps: raw data quality assessment, pre-processing, alignment, post-processing, and variant analysis (detection, annotation, and prioritization). We evaluate the performance of open-source alignment programs and variant calling tools using simulated and benchmark datasets, and highlight the challenges posed by the lack of concordance among variant detection tools. Based on these results, we recommend adopting multiple tools and resources to reduce false positives and increase the sensitivity of variant calling. In addition, we briefly discuss the current status and solutions for big data management, analysis, and summarization in the field of bioinformatics. PMID:25288881

  12. Buying in to bioinformatics: an introduction to commercial sequence analysis software.

    PubMed

    Smith, David Roy

    2015-07-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics.

  13. Buying in to bioinformatics: an introduction to commercial sequence analysis software

    PubMed Central

    2015-01-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. PMID:25183247

  14. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    NASA Astrophysics Data System (ADS)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Bioinformatics has recently become a field of science in its own right, indispensable for the analysis of the millions of nucleic acid sequences currently deposited in international databases (public or private); these databases contain information on genes, RNA, ORFs, proteins and intergenic regions, including entire genomes of some species. Analyzing this information requires computer programs, which have been renewed through the use of new mathematical methods and the introduction of artificial intelligence, in addition to the constant creation of supercomputing units designed to withstand the heavy workload of sequence analysis. However, innovation is still needed in platforms that allow faster and more effective genomic analyses, with a technological understanding of all biological processes.

  15. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    PubMed Central

    Thiel, William H

    2016-01-01

    Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment). High-throughput sequencing (HTS) revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.
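
    The idea of compiling unique aptamer sequences and filtering them by abundance and persistence, as described in the abstract above, can be sketched outside Galaxy in a few lines. The definitions used here are assumptions for illustration: abundance is taken as the total read count across rounds and persistence as the number of selection rounds in which a sequence appears. The function name and the reads are hypothetical, and this is not the published Workflow.

```python
from collections import Counter


def summarize_rounds(rounds):
    """Summarize reads from several selection rounds.

    rounds: list of lists of read sequences, one inner list per round.
    Returns {sequence: {"abundance": total reads, "persistence": rounds seen}}.
    """
    counts = [Counter(reads) for reads in rounds]
    unique = sorted(set().union(*counts))  # database of unique sequences
    summary = {}
    for seq in unique:
        per_round = [c[seq] for c in counts]
        summary[seq] = {
            "abundance": sum(per_round),
            "persistence": sum(1 for n in per_round if n > 0),
        }
    return summary


# Made-up reads for three rounds of selection.
rounds = [
    ["ACGU", "ACGU", "GGCA"],
    ["ACGU", "UUAG"],
    ["ACGU", "ACGU", "UUAG"],
]
summary = summarize_rounds(rounds)
# Keep sequences seen in at least two rounds (the cutoff is an assumption).
persistent = [s for s, v in summary.items() if v["persistence"] >= 2]
print(persistent)
```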

  16. Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis and beyond.

    PubMed

    Lee, Sungmin; Min, Hyeyoung; Yoon, Sungroh

    2016-07-01

    A wide variety of large-scale data have been produced in bioinformatics. In response, the need for efficient handling of biomedical big data has been partly met by parallel computing. However, the time demand of many bioinformatics programs still remains high for large-scale practical uses because of factors that hinder acceleration by parallelization. Recently, new generations of storage devices have emerged, such as NAND flash-based solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly becoming acceleration methods that can accompany parallel processing. In certain cases, a simple drop-in replacement of hard disk drives by SSDs results in dramatic speedup. Despite the various advantages and continuous cost reduction of SSDs, there has been little review of SSD-based profiling and performance exploration of important but time-consuming bioinformatics programs. For an informative review, we perform in-depth profiling and analysis of 23 key bioinformatics programs using multiple types of devices. Based on the insight we obtain from this research, we further discuss issues related to designing and optimizing bioinformatics algorithms and pipelines to fully exploit SSDs. The programs we profile cover traditional and emerging areas of importance, such as alignment, assembly, mapping, expression analysis, variant calling and metagenomics. We explain how acceleration by parallelization can be combined with SSDs for improved performance and also how using SSDs can expedite important bioinformatics pipelines, such as variant calling by the Genome Analysis Toolkit and transcriptome analysis using RNA sequencing. We hope that this review can provide useful directions and tips to accompany future bioinformatics algorithm design procedures that properly consider new generations of powerful storage devices. PMID:26330577

  17. [Bioinformatics analysis of DNA demethylase genes in Lonicera japonica Thunb].

    PubMed

    Qi, Lin-jie; Yuan, Yuan; Wu, Chong; Huang, Lu-qi; Chen, Ping

    2015-03-01

    DNA demethylase genes are widespread in plants. Four DNA demethylase genes (LJDME1, LJDME2, LJDME3 and LJDME4) were obtained from a transcriptome dataset of Lonicera japonica Thunb by using bioinformatics methods, and the physicochemical properties of the proteins they encode were predicted. The phylogenetic tree showed that the four DNA demethylase genes were closely related to Arabidopsis thaliana DME. The gene expression model showed that expression of the four DNA demethylase genes differed between species. The expression levels of LJDME1 and LJDME2 were higher in Lonicera japonica var. chinensis than in L. japonica. LJDME1 and LJDME2 may regulate the active compounds of L. japonica. This study aims to lay a foundation for further understanding of the function of DNA demethylase genes in L. japonica.

  18. Bioinformatics analysis of gene expression profiles of dermatomyositis.

    PubMed

    Chen, Liang-Yuan; Cui, Zhao-Lei; Hua, Fan-Cui; Yang, Weng-Jing; Bai, Ye; Lan, Feng-Hua

    2016-10-01

    Dermatomyositis (DM) is a type of autoimmune inflammatory myopathy, which primarily affects the skin and muscle. The underlying mechanisms of DM remain poorly understood. The present study aimed to explore gene expression profile alterations, investigate the underlying mechanisms, and identify novel targets for DM. The GSE48280 dataset, which includes data from five DM and five normal muscle tissue samples, was obtained from the Gene Expression Omnibus. Firstly, differentially expressed genes (DEGs) were screened by limma package in R. Subsequently, functional and pathway enrichment analyses were performed using ClueGO from Cytoscape. Finally, protein‑protein interaction (PPI) networks were constructed using STRING and Cytoscape, in order to identify hub genes. As a result, 180 upregulated and 21 downregulated genes were identified in the DM samples. The Gene Ontology enrichment analysis revealed that the type I interferon (IFN) signaling pathway was the most significantly enriched term within the DEGs. The Kyoto Encyclopedia of Genes and Genomes pathway analysis identified 27 significant pathways, the majority of which can be divided into the infectious diseases and immune system categories. Following construction of PPI networks, 24 hub genes were selected, all of which were associated with the type I IFN signaling pathway in DM. The findings of the present study indicated that type I IFNs may have a central role in the induction of DM. In addition, other DEGs, including chemokine (C‑C motif) ligand 5, C‑X‑C motif chemokine 10, Toll‑like receptor 3, DEXD/H‑Box helicase 58, interferon induced with helicase C domain 1, interferon‑stimulated gene 15 and MX dynamin‑like GTPase 1, may be potential targets for DM diagnosis and treatment. PMID:27599581
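
    The last step of the analysis described in the abstract above, picking hub genes from a protein-protein interaction network, is often done by ranking nodes by their number of interaction partners. The sketch below assumes that degree-based definition and a user-chosen cutoff; the abstract does not state which criterion the authors applied in Cytoscape, and the function name and node labels are placeholders rather than genes from the study.

```python
from collections import defaultdict


def hub_genes(edges, min_degree):
    """Return genes with at least min_degree distinct partners, highest first."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    degree = {gene: len(partners) for gene, partners in neighbors.items()}
    hubs = [gene for gene, d in degree.items() if d >= min_degree]
    # Sort by descending degree, then alphabetically for a stable order.
    return sorted(hubs, key=lambda gene: (-degree[gene], gene))


# Placeholder interaction list; not taken from STRING or the study.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
print(hub_genes(edges, min_degree=2))
```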

  19. [BIOINFORMATIC SEARCH AND PHYLOGENETIC ANALYSIS OF THE CELLULOSE SYNTHASE GENES OF FLAX (LINUM USITATISSIMUM)].

    PubMed

    Pydiura, N A; Bayer, G Ya; Galinousky, D V; Yemets, A I; Pirko, Ya V; Podvitski, T A; Anisimova, N V; Khotyleva, L V; Kilchevsky, A V; Blume, Ya B

    2015-01-01

    A bioinformatic search of sequences encoding cellulose synthase genes in the flax genome, and their comparison to dicot orthologs, was carried out. The analysis revealed 32 cellulose synthase gene candidates, 16 of which are highly likely to encode cellulose synthases, and the remaining 16 to encode cellulose synthase-like proteins (Csl). Phylogenetic analysis of gene products of cellulose synthase genes allowed distinguishing 6 groups of cellulose synthase genes of different classes: CesA1/10, CesA3, CesA4, CesA5/6/2/9, CesA7 and CesA8. Paralogous sequences within classes CesA1/10 and CesA5/6/2/9, which are associated with primary cell wall formation, are characterized by a greater similarity within these classes than orthologous sequences, whereas the genes controlling the biosynthesis of secondary cell wall cellulose form distinct clades: CesA4, CesA7, and CesA8. The analysis of the 16 identified flax cellulose synthase gene candidates shows the presence of at least 12 different cellulose synthase gene variants in the flax genome, which are represented in all six clades of cellulose synthase genes. Thus, at this point genes of all ten known cellulose synthase classes are identified in the flax genome, but their correct classification requires additional research. PMID:26638491

  1. Bioinformatics Analysis of Small RNAs in Pima (Gossypium barbadense L.)

    PubMed Central

    Hu, Hongtao; Yu, Dazhao; Liu, Hong

    2015-01-01

    Small RNAs (sRNAs) are ~20 to 24 nucleotide single-stranded RNAs that play crucial roles in regulation of gene expression. In plants, sRNAs are classified into microRNAs (miRNAs), repeat-associated siRNAs (ra-siRNAs), phased siRNAs (pha-siRNAs), cis and trans natural antisense transcript siRNAs (cis- and trans-nat siRNAs). Pima (Gossypium barbadense L.) is one of the most economically important fiber crops, producing the best and longest spinnable fiber. Although some miRNAs are profiled in Pima, little is known about siRNAs, the largest subclass of plant sRNAs. In order to profile these gene regulators in Pima, a comprehensive analysis of sRNAs was conducted by mining publicly available sRNA data, leading to identification of 678 miRNAs, 3,559,126 ra-siRNAs, 627 pha-siRNAs, 136,600 cis-nat siRNAs and 79,994 trans-nat siRNAs. The 678 miRNAs, belonging to 98 conserved and 402 lineage-specific families, were produced from 2,138 precursors, of which 297 arose from introns, exons, or intron/UTR-exon junctions of protein-coding genes. Ra-siRNAs were produced from various repeat loci, while most (97%) were yielded from retrotransposons, especially LTRs (long terminal repeats). The genes encoding auxin-signaling-related proteins, NBS-LRRs and transcription factors were major sources of pha-siRNAs, while two conserved TAS3 homologs were found as well. Most cis-NATs in Pima overlapped in enclosed and convergent orientations, while a few hybridized in divergent and coincided orientations. Most cis- and trans-nat siRNAs were produced from overlapping regions. Additionally, characteristics of length and the 5’-first nucleotide of each sRNA class were analyzed as well. Results in this study created a valuable molecular resource that would facilitate studies on mechanism of controlling gene expression. PMID:25679373

  2. Bioinformatic analysis of functional proteins involved in obesity associated with diabetes.

    PubMed

    Rao, Allam Appa; Tayaru, N Manga; Thota, Hanuman; Changalasetty, Suresh Babu; Thota, Lalitha Saroja; Gedela, Srinubabu

    2008-03-01

    The twin epidemics of diabetes and obesity pose daunting challenges worldwide. The dramatic rise in obesity-associated diabetes resulted in an alarming increase in the incidence and prevalence of obesity, an important complication of diabetes. Differences among individuals in their susceptibility to both these conditions probably reflect their genetic constitutions. The dramatic improvements in genomic and bioinformatic resources are accelerating the pace of gene discovery. It is tempting to speculate about the key susceptibility genes/proteins that bridge diabetes mellitus and obesity. In this regard, we evaluated the role of several genes/proteins that are believed to be involved in the evolution of obesity-associated diabetes by performing multiple sequence alignment with the ClustalW tool and constructing a phylogram from functional protein sequences extracted from NCBI. The phylogram was constructed using the neighbor-joining algorithm. Our bioinformatic analysis identifies the resistin gene as an ominous link with obesity-associated diabetes. This bioinformatic study will be useful for future studies towards therapeutic inventions for obesity-associated type 2 diabetes. PMID:23675069
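
    For readers unfamiliar with neighbor-joining, the pair-selection step of the standard algorithm can be sketched as follows. This illustrates the textbook Q-criterion only, not the ClustalW implementation used in the study; the function name `nj_pair_to_join` and the distance matrix are hypothetical.

```python
def nj_pair_to_join(dist):
    """Return (q, i, j) for the pair of taxa that neighbor-joining joins first.

    `dist` is a symmetric distance matrix (list of lists) with zeros on
    the diagonal. The criterion is the standard
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k),
    and the pair with the smallest Q is joined.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    return best

# Illustrative distances between five sequences (made-up values)
d = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_pair_to_join(d))
```

    A full tree is obtained by repeating this step: the chosen pair is replaced by a new internal node, distances to that node are recomputed, and the process continues until three nodes remain.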

  3. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    PubMed

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  5. Mudi, a web tool for identifying mutations by bioinformatics analysis of whole-genome sequence.

    PubMed

    Iida, Naoko; Yamao, Fumiaki; Nakamura, Yasukazu; Iida, Tetsushi

    2014-06-01

    In forward genetics, identification of mutations is a time-consuming and laborious process. Modern whole-genome sequencing, coupled with bioinformatics analysis, has enabled fast and cost-effective mutation identification. However, for many experimental researchers, bioinformatics analysis is still a difficult aspect of whole-genome sequencing. To address this issue, we developed a browser-accessible and easy-to-use bioinformatics tool called Mutation discovery (Mudi; http://naoii.nig.ac.jp/mudi_top.html), which enables 'one-click' identification of causative mutations from whole-genome sequence data. In this study, we optimized Mudi for pooled-linkage analysis aimed at identifying mutants in yeast model systems. After raw sequencing data are uploaded, Mudi performs sequential analysis, including mapping, detection of variant alleles, filtering and removal of background polymorphisms, prioritization, and annotation. In an example study of suppressor mutants of ptr1-1 in the fission yeast Schizosaccharomyces pombe, pooled-linkage analysis with Mudi identified mip1(+), a component of Target of Rapamycin Complex 1 (TORC1), as a novel component involved in RNA interference (RNAi)-related cell-cycle control. The accessibility of Mudi will accelerate systematic mutation analysis in forward genetics.
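
    The "filtering and removal of background polymorphisms" and "prioritization" steps named above can be pictured with a small sketch. It is not Mudi's code; the function `candidate_mutations`, the variant tuples, and the 0.9 allele-fraction threshold are all assumptions made for illustration.

```python
def candidate_mutations(pool_variants, background_variants, min_alt_fraction=0.9):
    """Filter and rank variants from a pooled mutant sample.

    pool_variants: dict mapping (chrom, pos, ref, alt) to the alternate-allele
        fraction observed in the mutant pool.
    background_variants: iterable of (chrom, pos, ref, alt) tuples already
        present in the parental strain.
    min_alt_fraction: hypothetical cutoff; in a pooled-linkage design the
        causative allele is expected to be near fixation in the pool.
    """
    background = set(background_variants)
    kept = [
        (variant, fraction)
        for variant, fraction in pool_variants.items()
        if variant not in background and fraction >= min_alt_fraction
    ]
    # Highest allele fraction first
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Made-up variants for illustration
pool = {
    ("I", 100, "A", "G"): 1.0,
    ("I", 200, "C", "T"): 0.5,
    ("II", 50, "G", "A"): 0.95,
    ("II", 70, "T", "C"): 1.0,
}
parental = [("II", 70, "T", "C")]
for variant, fraction in candidate_mutations(pool, parental):
    print(variant, fraction)
```

    In a real analysis the surviving candidates would then be annotated against gene models to decide which ones change a coding sequence.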

  6. The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis

    PubMed Central

    Rampp, Markus; Soddemann, Thomas; Lederer, Hermann

    2006-01-01

    We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’). PMID:16844980

  7. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Xiang, Fang; Ningqiu, Li; Xiaozhe, Fu; Kaibin, Li; Qiang, Lin; Lihui, Liu; Cunbin, Shi; Shuqin, Wu

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the need for high-performance computers rather than common personal computers to construct a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via BLAST searches and GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes in system temperature, total energy, root mean square deviation and loop conformation during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects. PMID:26351170

  9. Bioinformatics analysis of the gene expression profile of hepatocellular carcinoma: preliminary results

    PubMed Central

    Li, Jia

    2016-01-01

    Aim of the study: To analyse the expression profile of hepatocellular carcinoma compared with normal liver by using bioinformatics methods. Material and methods: In this study, we analysed the microarray expression data of HCC and adjacent normal liver samples from the Gene Expression Omnibus (GEO) database to screen for differentially expressed genes. Then, functional analyses were performed using GenCLiP analysis, Gene Ontology categories, and aberrant pathway identification. In addition, we used the CMap database to identify small molecules that can induce HCC. Results: Overall, 2721 differentially expressed genes (DEGs) were identified. We found 180 metastasis-related genes and constructed co-occurrence networks. Several significant pathways, including the transforming growth factor β (TGF-β) signalling pathway, were identified as closely related to these DEGs. Some candidate small molecules (such as betahistine) were identified that might provide a basis for developing HCC treatments in the future. Conclusions: Although we functionally analysed the differences in the gene expression profiles of HCC and normal liver tissues, our study is essentially preliminary, and it may be premature to apply our results to clinical trials. Further research and experimental testing are required in future studies. PMID:27095935

  10. Deep Sequencing Analysis of Nucleolar Small RNAs: Bioinformatics.

    PubMed

    Bai, Baoyan; Laiho, Marikki

    2016-01-01

    Small RNAs (size 20-30 nt) of various types have been actively investigated in recent years, and their subcellular compartmentalization and relative concentrations are likely to be of importance to their cellular and physiological functions. Comprehensive data on this subset of the transcriptome can only be obtained by application of high-throughput sequencing, which yields data that are inherently complex and multidimensional, as sequence composition, length, and abundance all inform small RNA function. Subsequent data analysis, hypothesis testing, and presentation/visualization of the results are correspondingly challenging. We have constructed small RNA libraries derived from different cellular compartments, including the nucleolus, and asked whether small RNAs exist in the nucleolus and whether they are distinct from cytoplasmic and nuclear small RNAs, the miRNAs. Here, we present a workflow for analysis of small RNA sequencing data generated by the Ion Torrent PGM sequencer from samples derived from different cellular compartments. PMID:27576724

  11. SeqCalc: A portable bioinformatics software for sequence analysis

    PubMed Central

    Vignesh, Dhandapani; Parameswari, Paul; Jin, Kim Hae; Pyo, Lim Yong

    2010-01-01

    Rapid genome sequencing has enriched biological databases with enormous amounts of sequence data, yet unraveling this information remains a daunting task. Experimental and computational researchers each follow their own approaches to analyzing sequence information. Here we introduce a standalone portable tool named “SeqCalc” that assists researchers in computational sequence analysis and automated experimental calculations. Although several tools are available online for sequence analysis, each serves only one or two purposes. SeqCalc is a package of offline programs, developed using Perl and Tcl/Tk scripts, that serves ten different applications. The tool is intended to support both experimental and computational researchers in their routine research. SeqCalc runs on all Windows operating systems. Availability: SeqCalc can be freely downloaded at http://code.google.com/p/seqcalc. PMID:21364786

  12. A bioinformatics perspective on proteomics: data storage, analysis, and integration.

    PubMed

    Kremer, Andreas; Schneider, Reinhard; Terstappen, Georg C

    2005-01-01

    The field of proteomics is advancing rapidly as a result of powerful new technologies and proteomics experiments yield a vast and increasing amount of information. Data regarding protein occurrence, abundance, identity, sequence, structure, properties, and interactions need to be stored. Currently, a common standard has not yet been established and open access to results is needed for further development of robust analysis algorithms. Databases for proteomics will evolve from pure storage into knowledge resources, providing a repository for information (meta-data) which is mainly not stored in simple flat files. This review will shed light on recent steps towards the generation of a common standard in proteomics data storage and integration, but is not meant to be a comprehensive overview of all available databases and tools in the proteomics community.

  13. STRUCTURELAB: a heterogeneous bioinformatics system for RNA structure analysis.

    PubMed

    Shapiro, B A; Kasprzak, W

    1996-08-01

    STRUCTURELAB is a computational system that has been developed to permit the use of a broad array of approaches for the analysis of the structure of RNA. The goal of the development is to provide a large set of tools that can be well integrated with experimental biology to aid in the process of the determination of the underlying structure of RNA sequences. The approach taken views the structure determination problem as one of dealing with a database of many computationally generated structures and provides the capability to analyze this data set from different perspectives. Many algorithms are integrated into one system that also utilizes a heterogeneous computing approach permitting the use of several computer architectures to help solve the posed problems. These different computational platforms make it relatively easy to incorporate currently existing programs as well as newly developed algorithms and to best match these algorithms to the appropriate hardware. The system has been written in Common Lisp running on SUN or SGI Unix workstations, and it utilizes a network of participating machines defined in reconfigurable tables. A window-based interface makes this heterogeneous environment as transparent to the user as possible. PMID:9076633

  14. Integration and bioinformatics analysis of DNA-methylated genes associated with drug resistance in ovarian cancer

    PubMed Central

    YAN, BINGBING; YIN, FUQIANG; WANG, QI; ZHANG, WEI; LI, LI

    2016-01-01

    The main obstacle to the successful treatment of ovarian cancer is the development of drug resistance to combined chemotherapy. Among all the factors associated with drug resistance, DNA methylation apparently plays a critical role. In this study, we performed an integrative analysis of the 26 DNA-methylated genes associated with drug resistance in ovarian cancer, and the genes were further evaluated by comprehensive bioinformatics analysis including gene/protein interaction, biological process enrichment and annotation. The results from the protein interaction analyses revealed that at least 20 of these 26 methylated genes are present in the protein interaction network, indicating that they interact with each other, have a correlation in function, and may participate as a whole in the regulation of ovarian cancer drug resistance. There is a direct interaction between the phosphatase and tensin homolog (PTEN) gene and at least half of the other genes, indicating that PTEN may possess core regulatory functions among these genes. Biological process enrichment and annotation demonstrated that most of these methylated genes were significantly associated with apoptosis, which is possibly an essential way for these genes to be involved in the regulation of multidrug resistance in ovarian cancer. In addition, a comprehensive analysis of clinical factors revealed that the methylation level of genes that are associated with the regulation of drug resistance in ovarian cancer was significantly correlated with the prognosis of ovarian cancer. Overall, this study preliminarily explains the potential correlation between the genes with DNA methylation and drug resistance in ovarian cancer. This finding has significance for our understanding of the regulation of resistant ovarian cancer by methylated genes, the treatment of ovarian cancer, and improvement of the prognosis of ovarian cancer. PMID:27347118

  15. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology.

    PubMed

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-12-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  16. Bioinformatics: promises and progress.

    PubMed

    Gupta, Shipra; Misra, Gauri; Khurana, S M Paul

    2015-01-01

    Bioinformatics is a multidisciplinary science that analyzes and solves biological problems. With the explosion of biomedical data, the demand for bioinformatics has increased gradually. The present paper provides an overview of the various ways in which biologists and biological researchers in neurology, structural and functional biology, evolutionary biology, clinical science, etc., use bioinformatics applications for data analysis to summarise their research. A new perspective is used to classify the knowledge available in the field, which will help a general audience understand the applications of bioinformatics.

  17. [Cloning and bioinformatic analysis and expression analysis of beta-glucuronidase in Scutellaria baicalensis].

    PubMed

    Guo, Shuang-shuang; Cheng, Lin; Yang, Li-min; Han, Mei

    2015-11-01

    The β-glucuronidase gene (sbGUS) cDNA was cloned for the first time from Scutellaria baicalensis leaf by RT-PCR (GenBank accession number KR364726). The full-length sbGUS cDNA was 1,584 bp, with an open reading frame (ORF) encoding an unstable protein of 527 amino acids. Bioinformatic analysis showed that the encoded protein had an isoelectric point (pI) of 5.55 and a calculated molecular weight of about 58.7248 kDa, with a transmembrane region and a signal peptide, and contained conserved domains of the glycoside hydrolase superfamily and an unintegrated trans-glycosidase catalytic structure. In the secondary structure, the percentages of alpha helix, extended strand, β-extended and random coil were 25.62%, 28.84%, 13.28% and 32.26%, respectively. Homology analysis indicated 98.93% nucleotide sequence similarity and 98.29% amino acid sequence similarity with S. baicalensis (BAA97804.1), with differences at nine positions. Real-time PCR analysis showed that the expression level of sbGUS was the highest in root, followed by flower and stem, and the lowest was in stem. The results provide a foundation for exploring the molecular function of sbGUS involved in baicalcin biosynthesis using a synthetic biology approach in S. baicalensis plants. PMID:27097409

  18. Bioinformatics analysis of differentially expressed proteins in prostate cancer based on proteomics data

    PubMed Central

    Chen, Chen; Zhang, Li-Guo; Liu, Jian; Han, Hui; Chen, Ning; Yao, An-Liang; Kang, Shao-San; Gao, Wei-Xing; Shen, Hong; Zhang, Long-Jun; Li, Ya-Peng; Cao, Feng-Hong; Li, Zhi-Guo

    2016-01-01

    We mined the literature for proteomics data to examine the occurrence and metastasis of prostate cancer (PCa) through a bioinformatics analysis. We divided the differentially expressed proteins (DEPs) into two groups: the group consisting of PCa and benign tissues (P&b) and the group presenting both high and low PCa metastatic tendencies (H&L). In the P&b group, we found 320 DEPs, 20 of which were reported more than three times, and DES was the most commonly reported. Among these DEPs, the expression levels of FGG, GSN, SERPINC1, TPM1, and TUBB4B have not yet been correlated with PCa. In the H&L group, we identified 353 DEPs, 13 of which were reported more than three times. Among these DEPs, MDH2 and MYH9 have not yet been correlated with PCa metastasis. We further confirmed that DES was differentially expressed between 30 cancer and 30 benign tissues. In addition, DEPs associated with protein transport, regulation of actin cytoskeleton, and the extracellular matrix (ECM)–receptor interaction pathway were prevalent in the H&L group and have not yet been studied in detail in this context. Proteins related to homeostasis, the wound-healing response, focal adhesions, and the complement and coagulation pathways were overrepresented in both groups. Our findings suggest that the repeatedly reported DEPs in the two groups may function as potential biomarkers for detecting PCa and predicting its aggressiveness. Furthermore, the implicated biological processes and signaling pathways may help elucidate the molecular mechanisms of PCa carcinogenesis and metastasis and provide new targets for clinical treatment. PMID:27051295

  20. Secretome Analysis of Lipid-Induced Insulin Resistance in Skeletal Muscle Cells by a Combined Experimental and Bioinformatics Workflow.

    PubMed

    Deshmukh, Atul S; Cox, Juergen; Jensen, Lars Juhl; Meissner, Felix; Mann, Matthias

    2015-11-01

    Skeletal muscle has emerged as an important secretory organ that produces so-called myokines, regulating energy metabolism via autocrine, paracrine, and endocrine actions; however, the nature and extent of the muscle secretome has not been fully elucidated. Mass spectrometry (MS)-based proteomics, in principle, allows an unbiased and comprehensive analysis of cellular secretomes; however, the distinction of bona fide secreted proteins from proteins released upon lysis of a small fraction of dying cells remains challenging. Here we applied highly sensitive MS and streamlined bioinformatics to analyze the secretome of lipid-induced insulin-resistant skeletal muscle cells. Our workflow identified 1073 putative secreted proteins including 32 growth factors, 25 cytokines, and 29 metalloproteinases. In addition to previously reported proteins, we report hundreds of novel ones. Intriguingly, ∼40% of the secreted proteins were regulated under insulin-resistant conditions, including a protein family with signal peptide and EGF-like domain structure that had not yet been associated with insulin resistance. Finally, we report that secretion of IGF and IGF-binding proteins was down-regulated under insulin-resistant conditions. Our study demonstrates an efficient combined experimental and bioinformatics workflow to identify putative secreted proteins from insulin-resistant skeletal muscle cells, which could easily be adapted to other cellular models.

  1. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    ERIC Educational Resources Information Center

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  2. Genome-wide variant analysis of simplex autism families with an integrative clinical-bioinformatics pipeline

    PubMed Central

    Jiménez-Barrón, Laura T.; O'Rawe, Jason A.; Wu, Yiyang; Yoon, Margaret; Fang, Han; Iossifov, Ivan; Lyon, Gholson J.

    2015-01-01

    Autism spectrum disorders (ASDs) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASDs, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and widely phenotyped. Genotyping arrays and whole-genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that were found to be segregating according to de novo, autosomal recessive, X-linked, mitochondrial, and compound heterozygote transmission models. The resulting candidate variants included three small heterozygous copy-number variations (CNVs), a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole-genome sequencing data can generate reliable results for use in downstream investigations. PMID:27148569

  3. Zebra: a web server for bioinformatic analysis of diverse protein families.

    PubMed

    Suplatov, Dmitry; Kirilin, Evgeny; Takhaveev, Vakil; Svedas, Vytas

    2014-01-01

    During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary, leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology: understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in common laboratory practice to study the structure-function relationship in proteins and develop novel protein engineering strategies. This paper describes the Zebra web server, a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing the complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and to be analyzed when studying the structure-function relationship. Zebra results are provided in two ways: (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. The Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra.
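
    The definition of subfamily-specific positions given above (conserved within subfamilies, different between them) can be made concrete with a deliberately strict toy criterion. This is not the Zebra algorithm, which applies statistical scoring; the function `subfamily_specific_positions`, the alignment, and the subfamily labels are hypothetical.

```python
def subfamily_specific_positions(alignment, labels):
    """Return 0-based alignment columns that look subfamily-specific.

    A column qualifies here only if every subfamily shows a single,
    non-gap residue at that column and at least two subfamilies differ.
    This strict rule is an assumption made for illustration.
    """
    groups = {}
    for seq, label in zip(alignment, labels):
        groups.setdefault(label, []).append(seq)

    positions = []
    for col in range(len(alignment[0])):
        residues = []
        for seqs in groups.values():
            column = {s[col] for s in seqs}
            if len(column) != 1 or "-" in column:
                break  # not conserved within this subfamily
            residues.append(column.pop())
        else:
            # Conserved in every subfamily; keep only if subfamilies differ
            if len(set(residues)) > 1:
                positions.append(col)
    return positions

# Made-up aligned sequences: two subfamilies of two members each
aln = ["ACDK", "ACDR", "AGEK", "AGER"]
fam = ["s1", "s1", "s2", "s2"]
print(subfamily_specific_positions(aln, fam))
```

    Real protein families rarely show perfect within-subfamily conservation, which is why a statistical ranking of positions is more practical than this all-or-nothing rule.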

  4. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’: short workflows involving a few tools and steps that guide investigators through high-utility analysis tasks. PMID:26780094

  5. SweetNET: A Bioinformatics Workflow for Glycopeptide MS/MS Spectral Analysis.

    PubMed

    Nasir, Waqas; Toledo, Alejandro Gomez; Noborn, Fredrik; Nilsson, Jonas; Wang, Mingxun; Bandeira, Nuno; Larson, Göran

    2016-08-01

    Glycoproteomics has rapidly become an independent analytical platform bridging the fields of glycomics and proteomics to address site-specific protein glycosylation and its impact in biology. Current glycopeptide characterization relies on time-consuming manual interpretations and demands high levels of personal expertise. Efficient data interpretation constitutes one of the major challenges to be overcome before true high-throughput glycopeptide analysis can be achieved. The development of new glyco-related bioinformatics tools is thus of crucial importance to fulfill this goal. Here we present SweetNET: a data-oriented bioinformatics workflow for efficient analysis of hundreds of thousands of glycopeptide MS/MS-spectra. We have analyzed MS data sets from two separate glycopeptide enrichment protocols targeting sialylated glycopeptides and chondroitin sulfate linkage region glycopeptides, respectively. Molecular networking was performed to organize the glycopeptide MS/MS data based on spectral similarities. The combination of spectral clustering, oxonium ion intensity profiles, and precursor ion m/z shift distributions provided typical signatures for the initial assignment of different N-, O- and CS-glycopeptide classes and their respective glycoforms. These signatures were further used to guide database searches leading to the identification and validation of a large number of glycopeptide variants including novel deoxyhexose (fucose) modifications in the linkage region of chondroitin sulfate proteoglycans. PMID:27399812
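
    Organizing spectra by spectral similarity can be sketched in a few lines of Python. The example below uses a plain cosine score on binned peaks, which is a simplification and not necessarily the scoring used by SweetNET or by molecular networking software; the function names, the 1.0 m/z bin width, and the peak intensities are hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks: iterable of (mz, intensity) pairs.
    """
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(spec_a, spec_b, bin_width=1.0):
    """Cosine of the angle between two binned intensity vectors (0 to 1)."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy spectra; the intensities are made up for illustration.
spectrum_x = [(204.09, 100.0), (366.14, 50.0)]
spectrum_y = [(204.09, 80.0), (292.10, 40.0)]
score = cosine_similarity(spectrum_x, spectrum_y)
print(round(score, 3))
```

    Pairs of spectra whose score exceeds a chosen cutoff would be linked as neighbours in a network; the cutoff itself is a tuning decision that this sketch does not address.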

  6. proBAMsuite, a Bioinformatics Framework for Genome-Based Representation and Analysis of Proteomics Data

    PubMed Central

    Wang, Xiaojing; Slebos, Robbert J. C.; Chambers, Matthew C.; Tabb, David L.; Liebler, Daniel C.; Zhang, Bing

    2016-01-01

    To facilitate genome-based representation and analysis of proteomics data, we developed a new bioinformatics framework, proBAMsuite, in which a central component is the protein BAM (proBAM) file format for organizing peptide spectrum matches (PSMs) within the context of the genome. proBAMsuite also includes two R packages, proBAMr and proBAMtools, for generating and analyzing proBAM files, respectively. Applying proBAMsuite to three recently published proteomics datasets, we demonstrated its utility in facilitating efficient genome-based sharing, interpretation, and integration of proteomics data. First, the interpretation of proteomics data is significantly enhanced with the rich genomic annotation information. Second, PSMs can be easily reannotated using user-specified gene annotation schemes and assembled into both protein and gene identifications. Third, using the genome as a common reference, proBAMsuite facilitates seamless proteomics and proteogenomics data integration. Finally, proBAM files can be readily visualized in genome browsers and thus bring proteomics data analysis to a general audience beyond the proteomics community. Results from this study establish proBAMsuite as a useful bioinformatics framework for proteomics and proteogenomics research. PMID:26657539
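
    Placing a peptide within the context of the genome amounts to a coordinate conversion, which the following Python sketch illustrates. It is not the proBAMr implementation: the function name `peptide_to_genome`, the restriction to the forward strand, and the exon coordinates are all assumptions chosen to keep the example short.

```python
def peptide_to_genome(cds_exons, pep_start, pep_len):
    """Map a peptide to genomic blocks on the forward strand.

    cds_exons: list of (start, end) 0-based half-open genomic intervals
               covering the coding sequence, in transcript order.
    pep_start: 0-based index of the peptide's first residue in the protein.
    pep_len:   peptide length in residues.
    Returns a list of (start, end) genomic intervals; a peptide that
    spans an exon junction yields more than one block.
    """
    cds_start = pep_start * 3           # first coding nucleotide of the peptide
    cds_end = cds_start + pep_len * 3   # one past its last coding nucleotide
    blocks = []
    offset = 0                          # CDS offset at the start of the current exon
    for ex_start, ex_end in cds_exons:
        ex_len = ex_end - ex_start
        lo = max(cds_start, offset)
        hi = min(cds_end, offset + ex_len)
        if lo < hi:                     # the peptide overlaps this exon
            blocks.append((ex_start + lo - offset, ex_start + hi - offset))
        offset += ex_len
    return blocks


# Hypothetical two-exon coding sequence (12 + 30 nucleotides = 14 codons).
toy_exons = [(100, 112), (200, 230)]
print(peptide_to_genome(toy_exons, pep_start=2, pep_len=4))
```

    The four-residue toy peptide starts in the first exon and ends in the second, so it is reported as two blocks, much as a spliced read alignment would be.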

  8. [Bioinformatic analysis of adenoma-normal mucosa SSH library of colon].

    PubMed

    Lü, Bing-Jian; Cui, Jing; Xu, Jing; Zhang, Hao; Luo, Min-Jie; Zhu, Yi-Min; Lai, Mao-De

    2006-04-01

    We established a colonic adenoma-normal mucosa suppression subtractive hybridization (SSH) library in 1999. In this study, we wanted to explore the expression profile of all candidate genes in this library. We developed an EST pipeline which contained two in-house software packages, nucleic acid analytical software and GetUni. The nucleic acid analytical software, an integrator of the universal bioinformatics tools including phred, phd2fasta, cross_match, repeatmasker and blast2.0, can blast sequences of differential clones against the downloaded non-redundant nucleotide (NR) database. GetUni can cluster these NR sequences into UniGene clusters by matching them with the downloaded Homo sapiens UniGene database. Sixty-two candidate genes in the A-N library were obtained via the high-throughput automatic gene expression bioinformatics pipeline. Gene Ontology online analysis revealed that ribosome genes and immunity-regulating genes were the two most common categories in the KEGG or Biocarta Pathway. We also measured the expression of the 2 genes with the highest hits, Reg4 and FAM46A, by semi-quantitative RT-PCR. Reg4 and FAM46A were up-regulated in 10 and 9 of 10 adenomas, respectively, in comparison with the paired normal mucosa. The candidate genes in the A-N library may be of great significance in disclosing the molecular mechanism underlying colonic adenoma initiation and progression.

  9. Nautilus: a bioinformatics package for the analysis of HIV type 1 targeted deep sequencing data.

    PubMed

    Kijak, Gustavo H; Pham, Phuc; Sanders-Buell, Eric; Harbolick, Elizabeth A; Eller, Leigh Anne; Robb, Merlin L; Michael, Nelson L; Kim, Jerome H; Tovanabutra, Sodsai

    2013-10-01

    The advent of next generation sequencing technologies is providing new insight into HIV-1 diversity and evolution, which has created the need for bioinformatics tools that could be applied to the characterization of viral quasispecies. Here we present Nautilus, a bioinformatics package for the analysis of HIV-1 targeted deep sequencing data. The DeepHaplo module determines the nucleotide base frequency and read depth at each position and computes the haplotype frequencies based on the linkage among polymorphisms in the same next generation sequence read. The Motifs module computes the frequency of the variants in the setting of their sequence context and mapping orientation, which allows for the validation of polymorphisms and haplotypes when strand bias is suspected. Both modules are accessed through a user-friendly GUI, which runs on Mac OS X (version 10.7.4 or later), and are based on Python, JAVA, and R scripts. Nautilus is available from www.hivresearch.org/research.php?ServiceID=5&SubServiceID=6 . PMID:23809062
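
    The quantities attributed to the DeepHaplo module (per-position base frequency, read depth, and haplotype frequencies from polymorphisms linked on the same read) can be sketched in Python as follows. This is not Nautilus code: the function names, the gapless read representation, and the toy reads are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def base_frequencies(reads):
    """Per-position read depth and base frequencies.

    reads: list of (start, sequence) pairs, where start is the 0-based
           reference position of the first base and the alignment is gapless.
    Returns {position: (depth, {base: frequency})}.
    """
    counts = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            counts[start + offset][base] += 1
    result = {}
    for pos, counter in counts.items():
        depth = sum(counter.values())
        result[pos] = (depth, {b: n / depth for b, n in counter.items()})
    return result


def haplotype_frequencies(reads, positions):
    """Frequency of base combinations at `positions`, counting only
    reads that span every listed position (linkage within one read)."""
    haplotypes = Counter()
    for start, seq in reads:
        end = start + len(seq)
        if all(start <= p < end for p in positions):
            haplotypes["".join(seq[p - start] for p in positions)] += 1
    total = sum(haplotypes.values())
    return {h: n / total for h, n in haplotypes.items()} if total else {}


# Hypothetical toy reads aligned to a four-base reference window.
toy_reads = [(0, "ACGT"), (0, "ACGA"), (1, "TGT"), (2, "GT")]
profile = base_frequencies(toy_reads)
print(profile[3])
print(haplotype_frequencies(toy_reads, [1, 3]))
```

    Only three of the four toy reads cover both position 1 and position 3, so the haplotype frequencies are computed over those three reads, whereas the depth at position 3 counts all four.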

  10. Adaptation of a Bioinformatics Microarray Analysis Workflow for a Toxicogenomic Study in Rainbow Trout

    PubMed Central

    Depiereux, Sophie; De Meulder, Bertrand; Bareke, Eric; Berger, Fabrice; Le Gac, Florence; Depiereux, Eric; Kestemont, Patrick

    2015-01-01

    Sex steroids play a key role in triggering sex differentiation in fish, and exogenous hormone treatment can lead to partial or complete sex reversal. This phenomenon has attracted attention since the discovery that even low environmental doses of exogenous steroids can adversely affect gonad morphology (ovotestis development) and induce reproductive failure. Modern genomic-based technologies have enhanced opportunities to elucidate mechanisms of action (MOA) and identify biomarkers related to the toxic action of a compound. However, high-throughput data interpretation relies on statistical analysis, species genomic resources, and bioinformatics tools. The goals of this study are to improve the knowledge of feminisation in fish, by analyzing molecular responses in the gonads of rainbow trout fry after chronic exposure to several doses (0.01, 0.1, 1 and 10 μg/L) of ethynylestradiol (EE2), and to offer target genes as potential biomarkers of ovotestis development. We successfully adapted a bioinformatics microarray analysis workflow elaborated on human data to a toxicogenomic study using rainbow trout, a fish species lacking accurate functional annotation and genomic resources. The workflow yielded lists of genes expected to be enriched in true-positive differentially expressed genes (DEGs), which were subjected to over-representation analysis (ORA) methods. Several pathways and ontologies, mostly related to cell division and metabolism, sexual reproduction and steroid production, were found significantly enriched in our analyses. Moreover, two sets of potential ovotestis biomarkers were selected using several criteria. The first group displayed specific potential biomarkers belonging to pathways/ontologies highlighted in the experiment. Among them, the early ovarian differentiation gene foxl2a was overexpressed. The second group, which was highly sensitive but not specific, included the DEGs presenting the highest fold change and lowest p

  11. Genomic expression profiling and bioinformatics analysis on diabetic nephrology with ginsenoside Rg3

    PubMed Central

    Wang, Juan; Cui, Chunli; Fu, Li; Xiao, Zili; Xie, Nanzi; Liu, Yang; Yu, Lu; Wang, Haifeng; Luo, Bangzhen

    2016-01-01

    Diabetic nephropathy (DN), a common diabetes-related complication, is the leading cause of progressive chronic kidney disease (CKD) and end-stage renal disease. Despite the rapid development in the treatment of DN, currently available therapies used in early DN cannot prevent progressive CKD. The exact pathogenic mechanisms and the molecular events underlying DN development remain unclear. Ginsenoside Rg3 is a herbal medicine with numerous pharmacological effects. To gain a greater understanding of the molecular mechanism and signaling pathway underlying the effect of ginsenoside Rg3 in DN therapy, an RNA sequencing approach was performed to screen differential gene expression in a rat model of DN treated with ginsenoside Rg3. A combined bioinformatics analysis was then conducted to obtain insights into the underlying molecular mechanisms of the disease development, in order to identify potential novel targets for the treatment of DN. Six Sprague-Dawley male rats were randomly divided into three groups: normal control group, DN group and ginsenoside-Rg3 treatment group, with two rats in each group. RNA sequencing was adopted for transcriptome profiling of cells from the renal cortex of the DN rat model. Differentially expressed genes were screened out. Cluster analysis, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis were used to analyze the differentially expressed genes. In total, 78 differentially expressed genes in the DN control group were identified when compared with the normal control group, of which 52 genes were upregulated and 26 genes were downregulated. Differential expression of 43 genes was observed in the ginsenoside-Rg3 treatment group when compared with the DN control group, consisting of 10 upregulated genes and 33 downregulated genes. Notably, 21 genes that were downregulated in the DN control group compared with the control were then shown to be upregulated in the ginsenoside-Rg3 treatment group compared with the DN

  12. Shared Genetic Etiology between Type 2 Diabetes and Alzheimer's Disease Identified by Bioinformatics Analysis.

    PubMed

    Gao, Lei; Cui, Zhen; Shen, Liang; Ji, Hong-Fang

    2015-01-01

    Type 2 diabetes (T2D) and Alzheimer's disease (AD) are two major health issues, and increasing evidence in recent years supports the close connection between these two diseases. The present study aimed to explore the shared genetic etiology underlying T2D and AD based on the available genome wide association studies (GWAS) data collected through August 2014. We performed bioinformatics analyses based on GWAS data of T2D and AD on single nucleotide polymorphisms (SNPs), gene, and pathway levels, respectively. Six SNPs (rs111789331, rs12721046, rs12721051, rs4420638, rs56131196, and rs66626994) were identified for the first time to be shared genetic factors between T2D and AD. Further functional enrichment analysis found lipid metabolism related pathways to be common between these two disorders. The findings may have important implications for future mechanistic and interventional studies for T2D and AD. PMID:26639962
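
    At the SNP level, looking for shared genetic factors reduces to intersecting the sets of variants associated with each trait. The Python sketch below shows that step only; the 5e-8 cutoff is the conventional genome-wide significance threshold and is an assumption rather than a value taken from the study, and the SNP identifiers and p-values are placeholders, not GWAS results.

```python
GENOME_WIDE_P = 5e-8  # conventional threshold; an assumption, not from the study


def significant_snps(summary, threshold=GENOME_WIDE_P):
    """Return the set of SNP identifiers with p-value below the threshold.

    summary: dict mapping SNP identifier -> association p-value.
    """
    return {snp for snp, p in summary.items() if p < threshold}


def shared_snps(summary_a, summary_b, threshold=GENOME_WIDE_P):
    """SNPs that pass the threshold for both traits, sorted by identifier."""
    return sorted(
        significant_snps(summary_a, threshold)
        & significant_snps(summary_b, threshold)
    )


# Placeholder summary statistics for two hypothetical traits.
trait_a = {"snp_1": 1e-12, "snp_2": 3e-9, "snp_3": 0.04}
trait_b = {"snp_1": 2e-10, "snp_2": 0.2, "snp_4": 1e-9}
print(shared_snps(trait_a, trait_b))
```

    Gene-level and pathway-level comparisons would first map each SNP to a gene or pathway and then apply the same kind of intersection.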

  13. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    PubMed

    Ju, Feng; Zhang, Tong

    2015-11-01

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation.

  14. Bioinformatic analysis of non-VP1 capsid protein of coxsackievirus A6.

    PubMed

    Liu, Hong-Bo; Yang, Guang-Fei; Liang, Si-Jia; Lin, Jun

    2016-08-01

    This study bioinformatically analyzed the non-VP1 capsid proteins (VP2-VP4) of Coxsackievirus A6 (CVA6), with an attempt to predict their basic physicochemical properties, structural/functional features and linear B cell epitopes. The online tools SubLoc, TargetP and the others from ExPASy Bioinformatics Resource Portal, and SWISS-MODEL (an online protein structure modeling server), were utilized to analyze the amino acid (AA) sequences of VP2-VP4 proteins of CVA6. Our results showed that the VP proteins of CVA6 were all of hydrophilic nature, contained phosphorylation and glycosylation sites and harbored no signal peptide sequences and acetylation sites. Except VP3, the other proteins did not have transmembrane helix structure and nuclear localization signal sequences. Random coils were the major conformation of the secondary structure of the capsid proteins. Analysis of the linear B cell epitopes by employing Bepipred showed that the average antigenic indices (AI) of individual VP proteins were all greater than 0 and the average AI of VP4 was substantially higher than that of VP2 and VP3. The VP proteins all contained a number of potential B cell epitopes and some epitopes were located at the internal side of the viral capsid or were buried. We successfully predicted the fundamental physicochemical properties, structural/functional features and the linear B cell epitopes and found that different VP proteins share some common features and each has its unique attributes. These findings will help us understand the pathogenicity of CVA6 and develop related vaccines and immunodiagnostic reagents. PMID:27465341
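
    One common way to summarize whether a protein is hydrophilic is the grand average of hydropathy (GRAVY), computed from the Kyte-Doolittle scale; a negative value indicates a hydrophilic sequence. The abstract does not state which measure was used, so the Python sketch below is an assumption about the kind of calculation involved, and the peptides are toy sequences, not CVA6 proteins.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - KYTE_DOOLITTLE.keys()
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Toy peptides: one made of charged residues, one of aliphatic residues.
for peptide in ("DEKR", "AILV"):
    label = "hydrophilic" if gravy(peptide) < 0 else "hydrophobic"
    print(peptide, round(gravy(peptide), 3), label)
```

    A whole-protein average can hide local hydrophobic stretches, which is why transmembrane helices are predicted separately with window-based methods.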

  15. Comparative metagenomic analysis of human gut microbiome composition using two different bioinformatic pipelines.

    PubMed

    D'Argenio, Valeria; Casaburi, Giorgio; Precone, Vincenza; Salvatore, Francesco

    2014-01-01

    Technological advances in next-generation sequencing-based approaches have greatly impacted the analysis of microbial community composition. In particular, 16S rRNA-based methods have been widely used to analyze the whole set of bacteria present in a target environment. As a consequence, several specific bioinformatic pipelines have been developed to manage these data. MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) and Quantitative Insights Into Microbial Ecology (QIIME) are two freely available tools for metagenomic analyses that have been used in a wide range of studies. Here, we report the comparative analysis of the same dataset with both QIIME and MG-RAST in order to evaluate their accuracy in taxonomic assignment and in diversity analysis. We found that taxonomic assignment was more accurate with QIIME, which, at the family level, assigned a significantly higher number of reads. Thus, QIIME generated a more accurate BIOM file, which in turn improved the diversity analysis output. Finally, although informatics skills are needed to install QIIME, it offers a wide range of metrics that are useful for downstream applications and, no less important, it is not dependent on server times. PMID:24719854
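
    Diversity analysis starts from a table of read counts per taxon, which is why the number of reads assigned at a given rank affects the result. As a hedged illustration, the Python sketch below computes the Shannon index, one widely used alpha-diversity metric; the abstract does not say which metrics were compared, and the counts are invented toy values.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts.

    counts: iterable of read counts, one per taxon.
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts)


# Toy communities with the same number of reads but different evenness.
even_community = [10, 10, 10, 10]
skewed_community = [37, 1, 1, 1]
print(round(shannon_index(even_community), 3))
print(round(shannon_index(skewed_community), 3))
```

    The evenly distributed toy community reaches the maximum for four taxa, ln 4, while the skewed one scores much lower even though the total read count is the same.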

  16. Human, vector and parasite Hsp90 proteins: A comparative bioinformatics analysis

    PubMed Central

    Faya, Ngonidzashe; Penkler, David L.; Tastan Bishop, Özlem

    2015-01-01

    The treatment of protozoan parasitic diseases is challenging, and thus identification and analysis of new drug targets is important. Parasites survive within host organisms, and some need intermediate hosts to complete their life cycle. Changing host environment puts stress on parasites, and often adaptation is accompanied by the expression of large amounts of heat shock proteins (Hsps). Among Hsps, Hsp90 proteins play an important role in stress environments. Yet, there has been little computational research on Hsp90 proteins to analyze them comparatively as potential parasitic drug targets. Here, an attempt was made to gain detailed insights into the differences between host, vector and parasitic Hsp90 proteins by large-scale bioinformatics analysis. A total of 104 Hsp90 sequences were divided into three groups based on their cellular localizations; namely cytosolic, mitochondrial and endoplasmic reticulum (ER). Further, the parasitic proteins were divided according to the type of parasite (protozoa, helminth and ectoparasite). Primary sequence analysis, phylogenetic tree calculations, motif analysis and physicochemical properties of Hsp90 proteins suggested that despite the overall structural conservation of these proteins, parasitic Hsp90 proteins have unique features which differentiate them from human ones, thus encouraging the idea that protozoan Hsp90 proteins should be further analyzed as potential drug targets. PMID:26793431

  17. Getting personalized cancer genome analysis into the clinic: the challenges in bioinformatics

    PubMed Central

    2012-01-01

    Progress in genomics has raised expectations in many fields, and particularly in personalized cancer research. The new technologies available make it possible to combine information about potential disease markers, altered function and accessible drug targets, which, coupled with pathological and medical information, will help produce more appropriate clinical decisions. The accessibility of such experimental techniques makes it all the more necessary to improve and adapt computational strategies to the new challenges. This review focuses on the critical issues associated with the standard pipeline, which includes: DNA sequencing analysis; analysis of mutations in coding regions; the study of genome rearrangements; extrapolating information on mutations to the functional and signaling level; and predicting the effects of therapies using mouse tumor models. We describe the possibilities, limitations and future challenges of current bioinformatics strategies for each of these issues. Furthermore, we emphasize the need for the collaboration between the bioinformaticians who implement the software and use the data resources, the computational biologists who develop the analytical methods, and the clinicians, the systems' end users and those ultimately responsible for taking medical decisions. Finally, the different steps in cancer genome analysis are illustrated through examples of applications in cancer genome analysis. PMID:22839973

  19. Structural and Phylogenetic Analysis of Laccases from Trichoderma: A Bioinformatic Approach

    PubMed Central

    Cázares-García, Saila Viridiana; Vázquez-Garcidueñas, Ma. Soledad; Vázquez-Marrufo, Gerardo

    2013-01-01

    The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species whose genomes are publicly available was analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the amino acid sequences of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential. PMID:23383142

  20. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    NASA Astrophysics Data System (ADS)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.

  1. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    PubMed Central

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-01-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm. PMID:26879404

  2. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine.

    PubMed

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-16

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM's diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients' target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ's cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the "multi-component, multi-target and multi-pathway" combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM's molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.

  3. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine.

    PubMed

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-01-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. TCM-based new drug development, especially for the treatment of complex diseases, is promising. However, owing to TCM's diverse ingredients and their complex interactions with the human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients' target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined it with subsequent experimental validation to reveal, for the first time, the role of the renin-angiotensin system in QSYQ's cardioprotective effects. BATMAN-TCM will contribute to the understanding of the "multi-component, multi-target and multi-pathway" combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM's molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm. PMID:26879404

  4. Bioinformatics Analysis of Transcriptome Dynamics During Growth in Angus Cattle Longissimus Muscle

    PubMed Central

    Moisá, Sonia J.; Shike, Daniel W.; Graugnard, Daniel E.; Rodriguez-Zas, Sandra L.; Everts, Robin E.; Lewin, Harris A.; Faulkner, Dan B.; Berger, Larry L.; Loor, Juan J.

    2013-01-01

    Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 and 120 days (growing phase) most of the highly impacted pathways (eg, ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and retinol metabolism) were inhibited. The phase between 120 and 220 days (finishing phase) was characterized by the most striking differences, with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promote proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth. PMID:23943656

  5. Bioinformatic tools for DNA/protein sequence analysis, functional assignment of genes and protein classification.

    PubMed

    Rehm, B H

    2001-12-01

    The development of efficient DNA sequencing methods has led to the achievement of the DNA sequence of entire genomes from (to date) 55 prokaryotes, 5 eukaryotic organisms and 10 eukaryotic chromosomes. Thus, an enormous amount of DNA sequence data is available and even more will be forthcoming in the near future. Analysis of this overwhelming amount of data requires bioinformatic tools in order to identify genes that encode functional proteins or RNA. This is an important task, considering that even in the well-studied Escherichia coli more than 30% of the identified open reading frames are hypothetical genes. Future challenges of genome sequence analysis will include the understanding of gene regulation and metabolic pathway reconstruction including DNA chip technology, which holds tremendous potential for biomedicine and the biotechnological production of valuable compounds. The overwhelming volume of information often confuses scientists. This review intends to provide a guide to choosing the most efficient way to analyze a new sequence or to collect information on a gene or protein of interest by applying current publicly available databases and Web services. Recently developed tools that allow functional assignment of genes, mainly based on sequence similarity of the deduced amino acid sequence, using the currently available and increasing biological databases will be discussed.

  6. Bioinformatic identification and expression analysis of Nelumbo nucifera microRNA and their targets

    PubMed Central

    Pan, Lei; Wang, Xiaolei; Jin, Jing; Yu, Xiaolu; Hu, Jihong

    2015-01-01

    Premise of the study: Sacred lotus (Nelumbo nucifera) is a perennial aquatic herbaceous plant of ecological, ornamental, and economic importance. MicroRNAs (miRNAs) play an important role in plant development. However, reports of miRNAs and their role in sacred lotus have been limited. Methods: Using the homology search of known miRNAs with genome and transcriptome contig sequences, we employed a pipeline to identify miRNAs in N. nucifera. We also predicted the targets of these miRNAs. Results: We found 106 conserved miRNAs in N. nucifera, and 456 of their miRNA targets were annotated. Quantitative real-time PCR (qRT-PCR) analysis revealed the different expression levels of the 10 selected conserved miRNAs in tissues of young leaves, stems, and flowers of N. nucifera. Negative correlation of expression level between five miRNAs and their target genes was also revealed. Discussion: Combining bioinformatics and experiment analysis, we identified the miRNAs in N. nucifera. The results can be used as a workbench for further investigation of the roles of miRNAs in N. nucifera. PMID:26421251

  7. Bioinformatics in the information age

    SciTech Connect

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  8. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications

    PubMed Central

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-01-01

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron–Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. PMID:27529225

  9. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications.

    PubMed

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-08-11

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure-Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron-Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods.

  10. Bioinformatics Analysis of the Effects of Tobacco Smoke on Gene Expression.

    PubMed

    Cao, Chunhua; Chen, Jianhua; Lyu, Chengqi; Yu, Jia; Zhao, Wei; Wang, Yi; Zou, Derong

    2015-01-01

    This study was designed to explore the effects of tobacco smoke on gene expression through bioinformatics analyses. Gene expression profile GSE17913 was downloaded from the Gene Expression Omnibus database. The differentially expressed genes (DEGs) in buccal mucosa tissues between 39 active smokers and 40 never smokers were identified. Gene Ontology (GO) and pathway enrichment analyses of DEGs were performed, followed by protein-protein interaction (PPI) network, transcriptional regulatory network as well as miRNA-target regulatory network construction. In total, 88 up-regulated DEGs and 106 down-regulated DEGs were identified. Among these DEGs, cytochrome P450, family 1, subfamily A, polypeptide 1 (CYP1A1) and CYP1B1 were enriched in the Metabolism of xenobiotics by cytochrome P450 pathway. In the PPI network, tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta (YWHAZ), and CYP1A1 were hub genes. In the transcriptional regulatory network, transcription factors of MYC associated factor X (MAX) and upstream transcription factor 1 (USF1) regulated many overlapped DEGs. In addition, protein tyrosine phosphatase, receptor type, D (PTPRD) was regulated by multiple miRNAs in the miRNA-DEG regulatory network. CYP1A1, CYP1B1, YWHAZ and PTPRD, and TF of MAX and USF1 may have the potential to be used as biomarkers and therapeutic targets in tobacco smoke-related pathological changes.

  11. Bioinformatics Analysis of the Effects of Tobacco Smoke on Gene Expression

    PubMed Central

    Cao, Chunhua; Chen, Jianhua; Lyu, Chengqi; Yu, Jia; Zhao, Wei; Wang, Yi; Zou, Derong

    2015-01-01

    This study was designed to explore the effects of tobacco smoke on gene expression through bioinformatics analyses. Gene expression profile GSE17913 was downloaded from the Gene Expression Omnibus database. The differentially expressed genes (DEGs) in buccal mucosa tissues between 39 active smokers and 40 never smokers were identified. Gene Ontology (GO) and pathway enrichment analyses of DEGs were performed, followed by protein-protein interaction (PPI) network, transcriptional regulatory network as well as miRNA-target regulatory network construction. In total, 88 up-regulated DEGs and 106 down-regulated DEGs were identified. Among these DEGs, cytochrome P450, family 1, subfamily A, polypeptide 1 (CYP1A1) and CYP1B1 were enriched in the Metabolism of xenobiotics by cytochrome P450 pathway. In the PPI network, tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta (YWHAZ), and CYP1A1 were hub genes. In the transcriptional regulatory network, transcription factors of MYC associated factor X (MAX) and upstream transcription factor 1 (USF1) regulated many overlapped DEGs. In addition, protein tyrosine phosphatase, receptor type, D (PTPRD) was regulated by multiple miRNAs in the miRNA-DEG regulatory network. CYP1A1, CYP1B1, YWHAZ and PTPRD, and TF of MAX and USF1 may have the potential to be used as biomarkers and therapeutic targets in tobacco smoke-related pathological changes. PMID:26629988
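
    The abstract above does not say which statistic was used for the pathway enrichment step. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for over-representation analysis, not confirmed by the record), the p-value for seeing at least k pathway genes among n DEGs can be computed as follows. The function name and the toy numbers are illustrative assumptions only.

```python
from math import comb

def enrichment_p_value(population, pathway_size, n_degs, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population   -- number of genes in the background (assumed input)
    pathway_size -- background genes annotated to the pathway
    n_degs       -- number of differentially expressed genes drawn
    overlap      -- DEGs that fall in the pathway
    """
    total = comb(population, n_degs)
    upper = min(pathway_size, n_degs)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, n_degs - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Toy example (hypothetical numbers): 10 background genes, 5 in the pathway,
# 5 DEGs, all 5 of them in the pathway.
p = enrichment_p_value(10, 5, 5, 5)
```

    In practice the resulting p-values would normally be corrected for multiple testing across all pathways examined.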

  12. Deep Artificial Neural Networks and Neuromorphic Chips for Big Data Analysis: Pharmaceutical and Bioinformatics Applications.

    PubMed

    Pastur-Romay, Lucas Antón; Cedrón, Francisco; Pazos, Alejandro; Porto-Pazos, Ana Belén

    2016-01-01

    Over the past decade, Deep Artificial Neural Networks (DNNs) have become the state-of-the-art algorithms in Machine Learning (ML), speech recognition, computer vision, natural language processing and many other tasks. This was made possible by the advancement in Big Data, Deep Learning (DL) and drastically increased chip processing abilities, especially general-purpose graphical processing units (GPGPUs). All this has created a growing interest in making the most of the potential offered by DNNs in almost every field. An overview of the main architectures of DNNs, and their usefulness in Pharmacology and Bioinformatics are presented in this work. The featured applications are: drug design, virtual screening (VS), Quantitative Structure-Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining. The future need of neuromorphic hardware for DNNs is also discussed, and the two most advanced chips are reviewed: IBM TrueNorth and SpiNNaker. In addition, this review points out the importance of considering not only neurons, as DNNs and neuromorphic chips should also include glial cells, given the proven importance of astrocytes, a type of glial cell which contributes to information processing in the brain. The Deep Artificial Neuron-Astrocyte Networks (DANAN) could overcome the difficulties in architecture design, learning process and scalability of the current ML methods. PMID:27529225

  13. Cloning, expression and bioinformatics analysis of ATP sulfurylase from Acidithiobacillus ferrooxidans ATCC 23270 in Escherichia coli

    PubMed Central

    Jaramillo, Michael L; Abanto, Michel; Quispe, Ruth L; Calderón, Julio; del Valle, Luís J; Talledo, Miguel; Ramírez, Pablo

    2012-01-01

    Molecular studies of the enzymes involved in sulfite oxidation in Acidithiobacillus ferrooxidans are still lacking, especially for the ATP sulfurylase (ATPS) of these acidophilic thiobacilli, which are important in biomining. This enzyme synthesizes ATP and sulfate from adenosine phosphosulfate (APS) and pyrophosphate (PPi), the final stage of sulfite oxidation by which these organisms obtain energy. The atpS gene (1674 bp) encoding the ATPS from Acidithiobacillus ferrooxidans ATCC 23270 was amplified using PCR, cloned in the pET101-TOPO plasmid, sequenced and expressed in Escherichia coli, yielding a 63.5 kDa recombinant ATPS protein according to SDS-PAGE analysis. The bioinformatics and phylogenetic analyses determined that the ATPS from A. ferrooxidans presents ATP sulfurylase (ATS) and APS kinase (ASK) domains similar to the ATPS of Aquifex aeolicus, probably of a more ancestral origin. Enzyme activity towards ATP formation was determined by quantification of ATP formed from E. coli cell extracts, using a bioluminescence assay based on light emission by the luciferase enzyme. Our results demonstrate that the recombinant ATP sulfurylase from A. ferrooxidans presents an enzymatic activity for the formation of ATP and sulfate, and is possibly a bifunctional enzyme due to its high homology to the ASK domain from A. aeolicus and true kinases. PMID:23055613

  14. [Cloning and bioinformatic analysis of FatB genes in Lonicera japonica Thunb and its substitutes].

    PubMed

    Wang, Zhou-yong; Jiang, Chao; Chen, Min; Chen, Ping; Yuan, Yuan; Lin, Shu-fang; Wu, Zhi-gang

    2012-10-01

    A FatB unigene was obtained from the transcriptome dataset of Lonicera japonica Thunb. Full-length FatB cDNA was cloned from buds of Lonicera japonica Thunb., Lonicera japonica Thunb. var. chinensis (Wats.) Bak., Lonicera hypoglauca Miq. and Lonicera dasystyla Rehd. using RT-PCR technology, and named LJFatB, LJCFatB, LHFatB and LDFatB, respectively. The results of bioinformatic analysis showed that LJFatB, LJCFatB, LHFatB and LDFatB were closely related to Arabidopsis thaliana AtFatB. The nucleotide sequences and protein secondary structures of LJFatB, LJCFatB, LHFatB and LDFatB differ, but their proteins have conserved FatB substrate binding sites and catalytic activity sites. Transcript levels of LJFatB, LJCFatB, LHFatB and LDFatB in buds were not significantly different. Therefore, LJFatB, LJCFatB, LHFatB and LDFatB could have the same biological function as AtFatB.

  15. Bioinformatics approaches for structural and functional analysis of proteins in secondary metabolism in Withania somnifera.

    PubMed

    Sanchita; Singh, Swati; Sharma, Ashok

    2014-11-01

    Withania somnifera (Ashwagandha) is a rich storehouse of a large number of pharmacologically active secondary metabolites known as withanolides. These secondary metabolites are produced by the withanolide biosynthetic pathway. Very little information is available on structural and functional aspects of the enzymes involved in the withanolide biosynthetic pathways of Withania somnifera. We therefore performed a bioinformatics analysis to look at functional and structural properties of these important enzymes. The pathway enzymes taken for this study were 3-Hydroxy-3-methylglutaryl coenzyme A reductase, 1-Deoxy-D-xylulose-5-phosphate synthase, 1-Deoxy-D-xylulose-5-phosphate reductase, farnesyl pyrophosphate synthase, squalene synthase, squalene epoxidase, and cycloartenol synthase. The prediction of secondary structure was performed for basic structural information. Three-dimensional structures for these enzymes were predicted. The physico-chemical properties such as pI, AI, GRAVY and instability index were also studied. The current information will provide a platform to know the structural attributes responsible for the function of these proteins until experimental structures become available.
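
    Of the physico-chemical properties listed above, GRAVY (grand average of hydropathy) is the simplest to illustrate: it is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The sketch below is not the tool used in the study, which the abstract does not name; the function name and the example peptides are assumptions for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(sequence):
    """Return the mean hydropathy of a protein sequence.

    Raises KeyError on non-standard residues and ValueError on an
    empty sequence, rather than silently skipping them.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)

# Hypothetical peptides: positive values indicate a hydrophobic sequence,
# negative values a hydrophilic one.
hydrophobic = gravy("AILV")
hydrophilic = gravy("RKDE")
```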

  16. Bioinformatics Tools for Mass Spectroscopy-Based Metabolomic Data Processing and Analysis

    PubMed Central

    Sugimoto, Masahiro; Kawakami, Masato; Robert, Martin; Soga, Tomoyoshi; Tomita, Masaru

    2012-01-01

    Biological systems are increasingly being studied in a holistic manner, using omics approaches, to provide quantitative and qualitative descriptions of the diverse collection of cellular components. Among the omics approaches, metabolomics, which deals with the quantitative global profiling of small molecules or metabolites, is being used extensively to explore the dynamic response of living systems, such as organelles, cells, tissues, organs and whole organisms, under diverse physiological and pathological conditions. This technology is now used routinely in a number of applications, including basic and clinical research, agriculture, microbiology, food science, nutrition, pharmaceutical research, environmental science and the development of biofuels. Of the multiple analytical platforms available to perform such analyses, nuclear magnetic resonance and mass spectrometry have come to dominate, owing to the high resolution and large datasets that can be generated with these techniques. The large multidimensional datasets that result from such studies must be processed and analyzed to render this data meaningful. Thus, bioinformatics tools are essential for the efficient processing of huge datasets, the characterization of the detected signals, and to align multiple datasets and their features. This paper provides a state-of-the-art overview of the data processing tools available, and reviews a collection of recent reports on the topic. Data conversion, pre-processing, alignment, normalization and statistical analysis are introduced, with their advantages and disadvantages, and comparisons are made to guide the reader. PMID:22438836

  17. Is there room for ethics within bioinformatics education?

    PubMed

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  18. Basics of Genome Sequence Analysis in Bioinformatics -- its Fundamental Ideas and Problems

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2009-02-01

    Genome sequences are among the most fundamental data in the various omics analyses, and basic bioinformatics tools have been developed to handle them. The first step of genome sequence analysis is to predict or assign "genes" on genome sequences. In the case of Eukaryotes, we can identify genes by use of full length cDNA sequences with local alignment tools such as search, BLAST and FASTA. However, it is difficult to capture mRNAs (transcripts) in Prokaryotes. Therefore, computational gene prediction is the first choice for starting genome sequence analysis. In this review, we first cover methods for computational gene prediction. Once genes are predicted, the next step is to assign functions to the proteins or RNAs encoded by a gene. How we define the distance between gene sequences is then very important for further analysis, so we describe the basic mathematical concepts for gene comparison. We also introduce our novel concept for biological sequence comparison from the viewpoint of information theory. In the post-genome era, many researchers are interested not only in gene functions but also in gene regulation, the information for which is also in genome sequences. Cis-regulatory elements, however, are too short for mathematical rules to be found. Therefore, computationally predicted cis-elements tend to include many false positives. To reduce the ratio of false positives, we need a reliable database of sets of cis-regulatory elements, called cis-regulatory modules, for a gene. We are therefore developing the Cis-Regulatory Elements Module Reference Database. In the third section, we introduce the procedure used to construct the Cis-Regulatory Elements Module Reference Database and its user interfaces.

  19. Survey of MapReduce frame operation in bioinformatics.

    PubMed

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics.

  20. Survey of MapReduce frame operation in bioinformatics.

    PubMed

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756
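
    The MapReduce pattern referred to above can be sketched in a few lines without Hadoop. The example below counts k-mers in sequencing reads using explicit map, shuffle and reduce steps in a single process. It is an illustration of the programming model only, not code from the surveyed applications, and all function names are assumptions.

```python
from collections import defaultdict
from itertools import chain

def map_kmers(read, k):
    """Map step: emit one (k-mer, 1) pair per k-mer in a read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}

def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))

# Two toy reads; every 3-mer occurs once in each read.
counts = count_kmers(["ACGTAC", "GTACGT"], 3)
```

    On a cluster, the map calls would run in parallel on separate chunks of reads and the framework would perform the shuffle across nodes; the per-key logic stays the same.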

  1. Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes

    PubMed Central

    Rashid, Mamunur; Robles-Espinoza, Carla Daniela; Rust, Alistair G.; Adams, David J.

    2013-01-01

    Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application. Availability: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23803469
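
    The abstract does not describe how Cake combines the output of its four callers. One common way to integrate several variant callers is simple voting: keep a variant only if at least a minimum number of callers report it. The sketch below shows that idea under stated assumptions; the function name, the tuple layout (chromosome, position, ref, alt) and the threshold are hypothetical and are not Cake's actual interface.

```python
from collections import Counter

def consensus_variants(call_sets, min_callers=2):
    """Return variants reported by at least `min_callers` call sets.

    call_sets   -- iterable of collections of (chrom, pos, ref, alt) tuples,
                   one collection per variant caller (assumed layout)
    min_callers -- voting threshold (hypothetical default)
    """
    votes = Counter()
    for calls in call_sets:
        # set() ensures each caller votes at most once per variant.
        votes.update(set(calls))
    return sorted(v for v, n in votes.items() if n >= min_callers)

# Toy call sets from four hypothetical callers.
caller_a = [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")]
caller_b = [("chr1", 100, "A", "T")]
caller_c = [("chr2", 200, "G", "C"), ("chr3", 300, "T", "G")]
caller_d = [("chr1", 100, "A", "T")]

kept = consensus_variants([caller_a, caller_b, caller_c, caller_d])
```

    Raising the threshold trades sensitivity for specificity: requiring more callers to agree removes more false positives but also drops true variants that only some callers detect.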

  2. Functional analysis of the mRNA profile of neutrophil gelatinase-associated lipocalin overexpression in esophageal squamous cell carcinoma using multiple bioinformatic tools

    PubMed Central

    WU, BING-LI; LI, CHUN-QUAN; DU, ZE-PENG; ZHOU, FEI; XIE, JIAN-JUN; LUO, LIE-WEI; WU, JIAN-YI; ZHANG, PI-XIAN; XU, LI-YAN; LI, EN-MIN

    2014-01-01

    Neutrophil gelatinase-associated lipocalin (NGAL) is a member of the lipocalin superfamily; dysregulated expression of NGAL has been observed in several benign and malignant diseases. In the present study, differentially expressed genes, in comparison with those of control cells, in the mRNA expression profile of EC109 esophageal squamous cell carcinoma (ESCC) cells following NGAL overexpression were analyzed by multiple bioinformatic tools for a comprehensive understanding. A total of 29 gene ontology (GO) terms associated with immune function, chromatin structure and gene transcription were identified among the differentially expressed genes (DEGs) in NGAL overexpressing cells. In addition to the detected GO categories, the results from the functional annotation chart revealed that the differentially expressed genes were also associated with 101 functional annotation category terms. A total of 59 subpathways associated locally with the differentially expressed genes were identified by subpathway analysis, a markedly greater total than that detected by traditional pathway enrichment analysis alone. Promoter analysis indicated that the potential transcription factors Snail, deltaEF1, Mycn, Arnt, MNB1A, PBF, E74A, Ubx, SPI1 and GATA2 were unique to the downregulated DEG promoters, while bZIP910, ZNF42 and SOX9 were unique to the upregulated DEG promoters. In conclusion, the understanding of the role of NGAL overexpression in ESCC has been improved through the present bioinformatic analysis. PMID:25109818

  3. Neuroinformatics: from bioinformatics to databasing the brain.

    PubMed

    Morse, Thomas M

    2008-01-01

    Neuroinformatics seeks to create and maintain web-accessible databases of experimental and computational data, together with innovative software tools, essential for understanding the nervous system in its normal function and in neurological disorders. Neuroinformatics includes traditional bioinformatics of gene and protein sequences in the brain; atlases of brain anatomy and localization of genes and proteins; imaging of brain cells; brain imaging by positron emission tomography (PET), functional magnetic resonance imaging (fMRI), electroencephalography (EEG), magnetoencephalography (MEG) and other methods; many electrophysiological recording methods; and clinical neurological data, among others. Building neuroinformatics databases and tools presents difficult challenges because they span a wide range of spatial scales and types of data stored and analyzed. Traditional bioinformatics, by comparison, focuses primarily on genomic and proteomic data (which of course also presents difficult challenges). Much of bioinformatics analysis focuses on sequences (DNA, RNA, and protein molecules) as the type of data that are stored, compared, and sometimes modeled. Bioinformatics is undergoing explosive growth with the addition, for example, of databases that catalog interactions between proteins, of databases that track the evolution of genes, and of systems biology databases which contain models of all aspects of organisms. This commentary briefly reviews neuroinformatics with clarification of its relationship to traditional and modern bioinformatics.

  4. [Phylogenetic and Bioinformatics Analysis of Replicase Gene Sequence of Cucumber Green Mottle Mosaic Virus].

    PubMed

    Liang, Chaoqiong; Meng, Yan; Luo, Laixin; Liu, Pengfei; Li, Jianqiang

    2015-11-01

    kD proteins of tested CGMMV isolates. The current results showed no significant difference among the replicase gene sequences, which were stable and conserved within the species and clearly different between species. CGMMV-No. 1, CGMMV-No. 3, CGMMV-No. 4 and CGMMV-No. 5 had a close genetic relationship with the Shandong and Liangning isolates (Accession No. KJ754195 and EF611826) and potentially originate from the same source. CGMMV-No. 2 was closer to the Korea isolate. Tested samples with high sequence similarity clustered together in the phylogenetic tree. The bioinformatics analysis results for the 129 kD and 57 kD proteins of the tested CGMMV isolates showed no regular pattern. There was no correspondence between the molecular phylogeny and the bioinformatics analysis of the tested CGMMV isolates. PMID:26951006

  5. Analysis of Metagenomics Next Generation Sequence Data for Fungal ITS Barcoding: Do You Need Advance Bioinformatics Experience?

    PubMed Central

    Ahmed, Abdalla

    2016-01-01

    During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a big deal, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are always lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi. PMID:27507959

  6. Identification of MicroRNAs from Eugenia uniflora by High-Throughput Sequencing and Bioinformatics Analysis

    PubMed Central

    Guzman, Frank; Almerão, Mauricio P.; Körbes, Ana P.; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    Background: microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Results: Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. Conclusions: This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs. PMID:23166775

  7. Bioinformatic and metabolomic analysis reveals miR-155 regulates thiamine level in breast cancer.

    PubMed

    Kim, Sinae; Rhee, Je-keun; Yoo, Hyun Ju; Lee, Hee Jin; Lee, Eun Ji; Lee, Jong Won; Yu, Jong Han; Son, Byung Ho; Gong, Gyungyup; Kim, Sung Bae; Singh, Shree Ram; Ahn, Sei Hyun; Chang, Suhwan

    2015-02-28

    microRNA-155 (miR-155) is one of the well-known oncogenic miRNAs implicated in various types of tumors. Thiamine, commonly known as vitamin B1, is one of the critical cofactors for energy metabolic enzymes including pyruvate dehydrogenase, alpha ketoglutarate dehydrogenase, and transketolase. Here we report a novel role of miR-155 in cancer metabolism through the up-regulation of thiamine in breast cancer cells. A bioinformatic analysis of miRNA array and metabolite-profiling data from the NCI-60 cancer cell panel revealed thiamine as a metabolite positively correlated with the miR-155 expression level. We confirmed it in MCF7, MDA-MB-436 and two human primary breast cancer cells by showing reduced thiamine levels upon a knock-down of miR-155. To understand how miR-155 controls thiamine levels, a set of key molecules for thiamine homeostasis were further analyzed after the knockdown of miR-155. The results showed that the expression of two thiamine transporter genes (SLC19A2, SLC25A19) as well as thiamine pyrophosphokinase-1 (TPK1) was decreased at both the RNA and protein levels in a miR-155-dependent manner. Finally, we confirm the finding by showing a positive correlation between miR-155 and thiamine level in 71 triple negative breast tumors. Taken together, our study demonstrates a role of miR-155 in thiamine homeostasis and suggests a function of this oncogenic miRNA in breast cancer metabolism.

  8. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    ERIC Educational Resources Information Center

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  9. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    PubMed Central

    Nielsen, Sofie V.; Lindorff-Larsen, Kresten; Hartmann-Petersen, Rasmus

    2016-01-01

    The ubiquitin-proteasome system targets misfolded proteins for degradation. Since the accumulation of such proteins is potentially harmful for the cell, their prompt removal is important. E3 ubiquitin-protein ligases mediate substrate ubiquitination by bringing together the substrate with an E2 ubiquitin-conjugating enzyme, which transfers ubiquitin to the substrate. For misfolded proteins, substrate recognition is generally delegated to molecular chaperones that subsequently interact with specific E3 ligases. An important exception is San1, a yeast E3 ligase. San1 harbors extensive regions of intrinsic disorder, which provide both conformational flexibility and sites for direct recognition of misfolded targets of vastly different conformations. So far, no mammalian ortholog of San1 is known, nor is it clear whether other E3 ligases utilize disordered regions for substrate recognition. Here, we conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology of their ordered regions, and did not capture the unique disorder patterns that encode the functional mechanism of San1. However, by searching specifically for key features of the San1 sequence, such as long regions of intrinsic disorder embedded with short stretches predicted to be suitable for substrate interaction, we identified several E3 ligases with these characteristics. Our initial analysis revealed that another remarkable trait of San1 is shared with several candidate E3 ligases: long stretches of complete lysine suppression, which in San1 limits auto-ubiquitination. We encode these characteristic features into a San1 similarity-score, and present a set of proteins that are plausible candidates as San1 counterparts in humans. 
In conclusion, our work indicates that San1 is
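    One of the sequence features named in this abstract, long stretches with no lysine residues, can be illustrated with a minimal sketch. The function name and the toy sequence below are hypothetical, and this is not the authors' San1 similarity score; it only shows how the longest lysine-free run in a one-letter protein sequence might be located.

```python
import re


def longest_lysine_free_stretch(sequence: str) -> tuple[int, int]:
    """Return (start_index, length) of the longest run containing no lysine (K).

    Illustrative helper only; it is not the published San1 similarity score.
    Returns (0, 0) when the sequence is empty or consists only of lysines.
    """
    best_start, best_len = 0, 0
    # Every maximal run of non-K characters is one candidate stretch.
    for match in re.finditer(r"[^K]+", sequence.upper()):
        length = match.end() - match.start()
        if length > best_len:
            best_start, best_len = match.start(), length
    return best_start, best_len


# Toy sequence (made up for illustration, not a real protein).
toy_sequence = "MKAAGSTKDDEEPPLLGGSSTTAAKRK"
start, length = longest_lysine_free_stretch(toy_sequence)
print(f"longest lysine-free stretch starts at {start}, length {length}")
```

    A real screen would combine such a measure with disorder predictions, which this sketch does not attempt.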

  10. No-boundary thinking in bioinformatics research.

    PubMed

    Huang, Xiuzhen; Bruce, Barry; Buchan, Alison; Congdon, Clare Bates; Cramer, Carole L; Jennings, Steven F; Jiang, Hongmei; Li, Zenglu; McClure, Gail; McMullen, Rick; Moore, Jason H; Nanduri, Bindu; Peckham, Joan; Perkins, Andy; Polson, Shawn W; Rekepalli, Bhanu; Salem, Saeed; Specker, Jennifer; Wunsch, Donald; Xiong, Donghai; Zhang, Shuzhong; Zhao, Zhongming

    2013-11-06

    Currently there are definitions from many agencies and research societies defining "bioinformatics" as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT).

  11. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

    PubMed

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

    2015-04-01

    This study aimed to explore the features of clustered regularly interspaced short palindromic repeats (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and the flanking sequences of upstream CRISPRs could be classified into the same group as those of the downstream CRISPRs. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same group as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
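    The palindromic motifs mentioned in this abstract can be illustrated with a short sketch, assuming "palindromic" is meant in the usual DNA sense of a sequence equal to its own reverse complement. The function names and the example sequence are hypothetical and are not taken from the study.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_palindromic_motifs(seq: str, length: int = 6) -> list[tuple[int, str]]:
    """List (position, motif) for every window equal to its reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        if window == reverse_complement(window):
            hits.append((i, window))
    return hits


# Made-up leader fragment containing one 6-base palindrome (GAATTC).
print(find_palindromic_motifs("TTGAATTCAA", length=6))
```

    Only even window lengths can match under this definition, because the middle base of an odd-length window would have to be its own complement.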

  12. Current opportunities and challenges in microbial metagenome analysis--a bioinformatic perspective.

    PubMed

    Teeling, Hanno; Glöckner, Frank Oliver

    2012-11-01

    Metagenomics has become an indispensable tool for studying the diversity and metabolic potential of environmental microbes, whose bulk is as yet non-cultivable. Continual progress in next-generation sequencing allows for generating increasingly large metagenomes and studying multiple metagenomes over time or space. Recently, a new type of holistic ecosystem study has emerged that seeks to combine metagenomics with biodiversity, meta-expression and contextual data. Such 'ecosystems biology' approaches bear the potential to not only advance our understanding of environmental microbes to a new level but also impose challenges due to increasing data complexities, in particular with respect to bioinformatic post-processing. This mini review aims to address selected opportunities and challenges of modern metagenomics from a bioinformatics perspective and hopefully will serve as a useful resource for microbial ecologists and bioinformaticians alike.

  13. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promoter region. In this work we present the advantages and disadvantages of applying different bioinformatics programs to localize transcription factor binding sites in the upstream region of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promoter regions. The results have been compared and the possible function of chosen motifs has been described.

  14. Virus Pathogen Database and Analysis Resource (ViPR): A Comprehensive Bioinformatics Database and Analysis Resource for the Coronavirus Research Community

    PubMed Central

    Pickett, Brett E.; Greer, Douglas S.; Zhang, Yun; Stewart, Lucy; Zhou, Liwei; Sun, Guangyu; Gu, Zhiping; Kumar, Sanjeev; Zaremba, Sam; Larsen, Christopher N.; Jen, Wei; Klem, Edward B.; Scheuermann, Richard H.

    2012-01-01

    Several viruses within the Coronaviridae family have been categorized as either emerging or re-emerging human pathogens, with Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) being the most well known. The NIAID-sponsored Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org) supports bioinformatics workflows for a broad range of human virus pathogens and other related viruses, including the entire Coronaviridae family. ViPR provides access to sequence records, gene and protein annotations, immune epitopes, 3D structures, host factor data, and other data types through an intuitive web-based search interface. Records returned from these queries can then be subjected to web-based analyses including: multiple sequence alignment, phylogenetic inference, sequence variation determination, BLAST comparison, and metadata-driven comparative genomics statistical analysis. Additional tools exist to display multiple sequence alignments, view phylogenetic trees, visualize 3D protein structures, transfer existing reference genome annotations to new genomes, and store or share results from any search or analysis within personal private ‘Workbench’ spaces for future access. All of the data and integrated analysis and visualization tools in ViPR are made available without charge as a service to the Coronaviridae research community to facilitate the research and development of diagnostics, prophylactics, vaccines and therapeutics against these human pathogens. PMID:23202522

  15. Towards understanding the lifespan extension by reduced insulin signaling: bioinformatics analysis of DAF-16/FOXO direct targets in Caenorhabditis elegans

    PubMed Central

    Li, Yan-Hui; Zhang, Gai-Gai

    2016-01-01

    DAF-16, the C. elegans FOXO transcription factor, is an important determinant in aging and longevity. In this work, we manually curated FOXODB http://lyh.pkmu.cn/foxodb/, a database of FOXO direct targets. It now covers 208 genes. Bioinformatics analysis on 109 DAF-16 direct targets in C. elegans found interesting results. (i) DAF-16 and transcription factor PQM-1 co-regulate some targets. (ii) Seventeen targets directly regulate lifespan. (iii) Four targets are involved in lifespan extension induced by dietary restriction. And (iv) DAF-16 direct targets might play global roles in lifespan regulation. PMID:27027346

  16. Towards understanding the lifespan extension by reduced insulin signaling: bioinformatics analysis of DAF-16/FOXO direct targets in Caenorhabditis elegans.

    PubMed

    Li, Yan-Hui; Zhang, Gai-Gai

    2016-04-12

    DAF-16, the C. elegans FOXO transcription factor, is an important determinant in aging and longevity. In this work, we manually curated FOXODB http://lyh.pkmu.cn/foxodb/, a database of FOXO direct targets. It now covers 208 genes. Bioinformatics analysis on 109 DAF-16 direct targets in C. elegans found interesting results. (i) DAF-16 and transcription factor PQM-1 co-regulate some targets. (ii) Seventeen targets directly regulate lifespan. (iii) Four targets are involved in lifespan extension induced by dietary restriction. And (iv) DAF-16 direct targets might play global roles in lifespan regulation.

  17. Identification of FHL2-regulated genes in liver by microarray and bioinformatics analysis.

    PubMed

    Ng, Chor-Fung; Xu, Jia-Ying; Li, Man-Shan; Tsui, Stephen Kwok-Wing

    2014-04-01

    FHL2 is a LIM domain protein that is able to form various protein complexes and regulate gene transcription. Recent findings showed that FHL2 is a potential tumor suppressor gene that was down-regulated in hepatocellular carcinoma. In the present study, microarray profiling of gene expression was performed to identify the genes regulated by FHL2 in mouse livers. The differentially expressed genes were further analyzed by bioinformatics tools including DAVID, KEGG, and STRING. Our data illustrate that FHL2 affects genes involved in various functions including signal transduction, responses to external stimulus, cancer-related pathways, cardiovascular function and regulation of actin cytoskeleton. Moreover, a network of differentially expressed genes identified in this study and known FHL2-interacting proteins was constructed. Then, genes identified by bioinformatics tools and most functionally relevant to FHL2 were selected for further validation. Finally, the differential expression of Ar, Id3, Inhbe, Alas1, Bcl6, Pparδ, Angptl4, and Erbb4 was confirmed by quantitative real-time PCR. In summary, we have established a database of genes that are potentially regulated by FHL2 and these genes should be future targets for the elucidation of functional roles of FHL2.

  18. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    PubMed Central

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (<10%) of studies provided any reliability or validity evidence, and only one study out of the 95 sampled mentioned both validity and reliability. Our findings raise concerns about the quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE. PMID:24006400

  19. Dysregulation of TFDP1 and of the cell cycle pathway in high-grade glioblastoma multiforme: a bioinformatic analysis.

    PubMed

    Lu, X; Lv, X D; Ren, Y H; Yang, W D; Li, Z B; Zhang, L; Bai, X F

    2016-01-01

    Despite extensive research, the prognosis of high-grade glioblastoma multiforme (GBM) has improved only slightly because of the limited response to standard treatments. Recent advances (discoveries of molecular biomarkers) provide new opportunities for the treatment of GBM. The aim of the present study was to identify diagnostic biomarkers of high-grade GBM. First, we combined 3 microarray expression datasets to screen them for genes differentially expressed in patients with high-grade GBM relative to healthy subjects. Next, the target network was constructed via the empirical Bayesian coexpression approach, and centrality analysis and a molecular complex detection (MCODE) algorithm were performed to explore hub genes and functional modules. Finally, a validation test was conducted to verify the bioinformatic results. A total of 277 differentially expressed genes were identified according to the criteria P < 0.05 and |log2(fold change)| ≥ 1.5. These genes were most significantly enriched in the cell cycle pathway. Centrality analysis uncovered 9 hub genes; among them, TFDP1 showed the highest degree of connectivity (43) and is a known participant in the cell cycle pathway; this finding pointed to the important role of TFDP1 in the progression of high-grade GBM. Experimental validation mostly supported the bioinformatic results. According to our study results, the gene TFDP1 and the cell cycle pathway are strongly associated with high-grade GBM; this result may provide new insights into the pathogenesis of GBM. PMID:27323154
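    The selection criteria quoted in this abstract (P < 0.05 and |log2(fold change)| ≥ 1.5) and the notion of a hub gene as the node with the highest degree can be sketched as follows. The function names, gene labels and edges are hypothetical placeholders, not data from the study, and the sketch assumes the fold change is supplied as a positive ratio.

```python
import math


def is_differentially_expressed(p_value: float, fold_change: float,
                                p_cutoff: float = 0.05,
                                log2fc_cutoff: float = 1.5) -> bool:
    """Apply the two thresholds: P < p_cutoff and |log2(FC)| >= log2fc_cutoff.

    fold_change is assumed to be a positive ratio (case / control).
    """
    return p_value < p_cutoff and abs(math.log2(fold_change)) >= log2fc_cutoff


def node_degrees(edges: list[tuple[str, str]]) -> dict[str, int]:
    """Count how many edges touch each node in an undirected network."""
    degrees: dict[str, int] = {}
    for a, b in edges:
        degrees[a] = degrees.get(a, 0) + 1
        degrees[b] = degrees.get(b, 0) + 1
    return degrees


# Hypothetical coexpression edges between placeholder genes.
toy_edges = [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"), ("GENE_A", "GENE_D"),
             ("GENE_B", "GENE_C")]
degrees = node_degrees(toy_edges)
hub = max(degrees, key=degrees.get)
print(hub, degrees[hub])
```

    Degree is only one of several centrality measures; the abstract does not say which others were used, so none are shown here.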

  20. Bioinformatics meets clinical informatics.

    PubMed

    Smith, Jeremy; Protti, Denis

    2005-01-01

    The field of bioinformatics has exploded over the past decade. Hopes have run high for the impact on preventive, diagnostic, and therapeutic capabilities of genomics and proteomics. As time has progressed, so has our understanding of this field. Although the mapping of the human genome will certainly have an impact on health care, it is a complex web to unweave. Addressing simpler "Single Nucleotide Polymorphisms" (SNPs) is not new; however, the complexity and importance of polygenic disorders and the greater role of the far more complex field of proteomics have become clearer. Proteomics operates much closer to the actual cellular level of human structure, and proteins are very sensitive markers of health. Because the proteome, however, is so much more complex than the genome, and changes with time and environmental factors, mapping it and using the data in direct care delivery is even harder than for the genome. For these reasons of complexity, the expected utopia of a single gene chip or protein chip capable of analyzing an individual's genetic make-up and producing a cornucopia of useful diagnostic information appears still a distant hope. When, and if, this happens, perhaps a genetic profile of each individual will be stored with their medical record; however, in the meantime, this type of information is unlikely to prove highly useful on a broad scale. To address the more complex "polygenic" diseases and those related to protein variations, other tools will be developed in the shorter term. "Top-down" analysis of populations and diseases is likely to produce earlier wins in this area. Detailed computer-generated models will map a wide array of human and environmental factors that indicate the presence of a disease or the relative impact of a particular treatment. These models may point to an underlying genomic or proteomic cause, for which genomic or proteomic testing or therapies could then be applied for confirmation and/or treatment. These types of

  1. GProX, a user-friendly platform for bioinformatics analysis and visualization of quantitative proteomics data.

    PubMed

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-08-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiments. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data are available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

  2. Integration of bioinformatics to biodegradation

    PubMed Central

    2014-01-01

    Bioinformatics and biodegradation are two primary scientific fields in applied microbiology and biotechnology. The present review describes the development of various bioinformatics tools that may be applied in the field of biodegradation. Several databases, including the University of Minnesota Biocatalysis/Biodegradation database (UM-BBD), a database of biodegradative oxygenases (OxDBase), Biodegradation Network-Molecular Biology Database (Bionemo), MetaCyc, and BioCyc have been developed to enable access to information related to biochemistry and genetics of microbial degradation. In addition, several bioinformatics tools for predicting toxicity and biodegradation of chemicals have been developed. Furthermore, the whole genomes of several potential degrading bacteria have been sequenced and annotated using bioinformatics tools. PMID:24808763

  3. Bioinformatics Knowledge Map for Analysis of Beta-Catenin Function in Cancer

    PubMed Central

    Arighi, Cecilia N.; Wu, Cathy H.

    2015-01-01

    Given the wealth of bioinformatics resources and the growing complexity of biological information, it is valuable to integrate data from disparate sources to gain insight into the role of genes/proteins in health and disease. We have developed a bioinformatics framework that combines literature mining with information from biomedical ontologies and curated databases to create knowledge “maps” of genes/proteins of interest. We applied this approach to the study of beta-catenin, a cell adhesion molecule and transcriptional regulator implicated in cancer. The knowledge map includes post-translational modifications (PTMs), protein-protein interactions, disease-associated mutations, and transcription factors co-activated by beta-catenin and their targets and captures the major processes in which beta-catenin is known to participate. Using the map, we generated testable hypotheses about beta-catenin biology in normal and cancer cells. By focusing on proteins participating in multiple relation types, we identified proteins that may participate in feedback loops regulating beta-catenin transcriptional activity. By combining multiple network relations with PTM proteoform-specific functional information, we proposed a mechanism to explain the observation that the cyclin dependent kinase CDK5 positively regulates beta-catenin co-activator activity. Finally, by overlaying cancer-associated mutation data with sequence features, we observed mutation patterns in several beta-catenin PTM sites and PTM enzyme binding sites that varied by tissue type, suggesting multiple mechanisms by which beta-catenin mutations can contribute to cancer. The approach described, which captures rich information for molecular species from genes and proteins to PTM proteoforms, is extensible to other proteins and their involvement in disease. PMID:26509276

  4. Bioinformatics Knowledge Map for Analysis of Beta-Catenin Function in Cancer.

    PubMed

    Çelen, İrem; Ross, Karen E; Arighi, Cecilia N; Wu, Cathy H

    2015-01-01

    Given the wealth of bioinformatics resources and the growing complexity of biological information, it is valuable to integrate data from disparate sources to gain insight into the role of genes/proteins in health and disease. We have developed a bioinformatics framework that combines literature mining with information from biomedical ontologies and curated databases to create knowledge "maps" of genes/proteins of interest. We applied this approach to the study of beta-catenin, a cell adhesion molecule and transcriptional regulator implicated in cancer. The knowledge map includes post-translational modifications (PTMs), protein-protein interactions, disease-associated mutations, and transcription factors co-activated by beta-catenin and their targets and captures the major processes in which beta-catenin is known to participate. Using the map, we generated testable hypotheses about beta-catenin biology in normal and cancer cells. By focusing on proteins participating in multiple relation types, we identified proteins that may participate in feedback loops regulating beta-catenin transcriptional activity. By combining multiple network relations with PTM proteoform-specific functional information, we proposed a mechanism to explain the observation that the cyclin dependent kinase CDK5 positively regulates beta-catenin co-activator activity. Finally, by overlaying cancer-associated mutation data with sequence features, we observed mutation patterns in several beta-catenin PTM sites and PTM enzyme binding sites that varied by tissue type, suggesting multiple mechanisms by which beta-catenin mutations can contribute to cancer. The approach described, which captures rich information for molecular species from genes and proteins to PTM proteoforms, is extensible to other proteins and their involvement in disease. PMID:26509276

  5. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length; and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
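    The remark about the huge number of possible trees can be made concrete with a small sketch. It relies on the standard counting result for labeled, strictly bifurcating trees, (2n-3)!! rooted and (2n-5)!! unrooted topologies for n OTUs, which is a textbook formula and is not quoted from this abstract. The function names are hypothetical.

```python
def count_rooted_binary_trees(n_otus: int) -> int:
    """Number of rooted, labeled, bifurcating topologies: (2n - 3)!! for n >= 2."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 3.
    for factor in range(3, 2 * n_otus - 2, 2):
        count *= factor
    return count


def count_unrooted_binary_trees(n_otus: int) -> int:
    """Number of unrooted, labeled, bifurcating topologies: (2n - 5)!! for n >= 3."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs")
    # An unrooted tree on n OTUs corresponds to a rooted tree on n - 1 OTUs.
    return count_rooted_binary_trees(n_otus - 1)


for n in (4, 10, 20):
    print(n, count_rooted_binary_trees(n), count_unrooted_binary_trees(n))
```

    Even at 10 OTUs the rooted count already exceeds 34 million, which is why exhaustive search over topologies is impractical beyond very small samples.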

  6. Bioinformatics for personal genome interpretation.

    PubMed

    Capriotti, Emidio; Nehrt, Nathan L; Kann, Maricel G; Bromberg, Yana

    2012-07-01

    An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of this data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field--the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.

  7. Crowdsourcing for bioinformatics

    PubMed Central

    Good, Benjamin M.; Su, Andrew I.

    2013-01-01

    Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614

  8. Prenatal alcohol exposure alters gene expression in the rat brain: Experimental design and bioinformatic analysis of microarray data.

    PubMed

    Lussier, Alexandre A; Stepien, Katarzyna A; Weinberg, Joanne; Kobor, Michael S

    2015-09-01

    We previously identified gene expression changes in the prefrontal cortex and hippocampus of rats prenatally exposed to alcohol under both steady-state and challenge conditions (Lussier et al., 2015, Alcohol.: Clin. Exp. Res., 39, 251-261). In this study, adult female rats from three prenatal treatment groups (ad libitum-fed control, pair-fed, and ethanol-fed) were injected with physiological saline solution or complete Freund's adjuvant (CFA) to induce arthritis (adjuvant-induced arthritis, AA). The prefrontal cortex and hippocampus were collected 16 days (peak of arthritis) or 39 days (during recovery) following injection, and whole genome gene expression was assayed using Illumina's RatRef-12 expression microarray. Here, we provide additional metadata, detailed explanations of data pre-processing steps and quality control, as well as a basic framework for the bioinformatic analyses performed. The datasets from this study are publicly available on the GEO repository (accession number GSE63561). PMID:26217797

  9. Prenatal alcohol exposure alters gene expression in the rat brain: Experimental design and bioinformatic analysis of microarray data

    PubMed Central

    Lussier, Alexandre A.; Stepien, Katarzyna A.; Weinberg, Joanne; Kobor, Michael S.

    2015-01-01

    We previously identified gene expression changes in the prefrontal cortex and hippocampus of rats prenatally exposed to alcohol under both steady-state and challenge conditions (Lussier et al., 2015, Alcohol.: Clin. Exp. Res., 39, 251–261). In this study, adult female rats from three prenatal treatment groups (ad libitum-fed control, pair-fed, and ethanol-fed) were injected with physiological saline solution or complete Freund's adjuvant (CFA) to induce arthritis (adjuvant-induced arthritis, AA). The prefrontal cortex and hippocampus were collected 16 days (peak of arthritis) or 39 days (during recovery) following injection, and whole genome gene expression was assayed using Illumina's RatRef-12 expression microarray. Here, we provide additional metadata, detailed explanations of data pre-processing steps and quality control, as well as a basic framework for the bioinformatic analyses performed. The datasets from this study are publicly available on the GEO repository (accession number GSE63561). PMID:26217797

  10. Molecular characterization and bioinformatics analysis of Ncoa7B, a novel ovulation-associated and reproduction system-specific Ncoa7 isoform.

    PubMed

    Shkolnik, Ketty; Ben-Dor, Shifra; Galiani, Dalia; Hourvitz, Ariel; Dekel, Nava

    2008-03-01

    In the present work, we employed bioinformatics search tools to select ovulation-associated cDNA clones with a preference for those representing putative novel genes. Detailed characterization of one of these transcripts, 6C3, by real-time PCR and RACE analyses led to identification of a novel ovulation-associated gene, designated Ncoa7B. This gene was found to exhibit a significant homology to the Ncoa7 gene that encodes a conserved tissue-specific nuclear receptor coactivator. Unlike Ncoa7, Ncoa7B possesses a unique and highly conserved exon at the 5' end and encodes a protein with a unique N-terminal sequence. Extensive bioinformatics analysis has revealed that Ncoa7B has one identifiable domain, TLDc, which has recently been suggested to be involved in protection from oxidative DNA damage. An alignment of TLDc domain containing proteins was performed, and the closest relative identified was OXR1, which also has a corresponding, highly related short isoform, with just a TLDc domain. Moreover, Ncoa7B expression, as seen to date, seems to be restricted to mammals, while other TLDc family members have no such restriction. Multiple tissue analysis revealed that unlike Ncoa7, which was abundant in a variety of tissues with the highest expression in the brain, Ncoa7B mRNA expression is restricted to the reproductive system organs, particularly the uterus and the ovary. The ovarian expression of Ncoa7B was stimulated by human chorionic gonadotropin. Additionally, using real-time PCR, we demonstrated the involvement of multiple signaling pathways for Ncoa7B expression on preovulatory follicles. PMID:18299425

  11. Bioinformatics analysis of microRNA and putative target genes in bovine mammary tissue infected with Streptococcus uberis.

    PubMed

    Naeem, A; Zhong, K; Moisá, S J; Drackley, J K; Moyes, K M; Loor, J J

    2012-11-01

    MicroRNA (miRNA) are small single-stranded noncoding RNA with important roles in regulating innate immunity in nonruminants via transcriptional and posttranscriptional mechanisms. Mastitis causes significant losses in the dairy industry and a wealth of large-scale mRNA expression data from mammary tissue have provided fundamental insights into the tissue adaptations to pathogens. We studied the expression of 14 miRNA (miR-10a, -15b, -16a, -17, -21, -31, -145, -146a, -146b, -155, -181a, -205, -221, and -223) associated with regulation of innate immunity and mammary epithelial cell function in tissue challenged with Streptococcus uberis. Those data, along with microarray expression of 2,102 differentially expressed genes, were used for bioinformatics analysis to uncover putative target genes and the most affected biological pathways and functions. Three miRNA (181a, 16, and 31) were downregulated approximately 3- to 5-fold and miR-223 was upregulated approximately 2.5-fold in infected versus healthy tissue. Among differentially expressed genes due to infection, bioinformatics analysis revealed that the studied miRNA share in the regulation of a large number of metabolic (SCD, CD36, GPAM, and FASN), immune/oxidative stress (TNF, IL6, IL10, SOD2, LYZ, and TLR4), and cellular proliferation/differentiation (FOS and CASP4) target genes. This level of complex regulation was underscored by the coordinate effect revealed by bioinformatics on various cellular pathways within the Kyoto Encyclopedia of Genes and Genomes database. Most pathways associated with "cellular processes," "organismal systems," and "diseases" were activated by putative target genes of miR-31 and miR-16a, with an overlapping activation of "immune system" and "signal transduction." A pronounced effect and activation of miR-31 target genes was observed within "folding, sorting, and degradation," "cell growth and death," and "cell communication" pathways, whereas a marked inhibition of "lipid metabolism

  12. Emerging bioinformatics approaches for analysis of NGS-derived coding and non-coding RNAs in neurodegenerative diseases

    PubMed Central

    Guffanti, Alessandro; Simchovitz, Alon; Soreq, Hermona

    2014-01-01

    Neurodegenerative diseases in general and specifically late-onset Alzheimer’s disease (LOAD) involve a genetically complex and largely obscure ensemble of causative and risk factors accompanied by complex feedback responses. The advent of “high-throughput” transcriptome investigation technologies such as microarray and deep sequencing is increasingly being combined with sophisticated statistical and bioinformatics analysis methods complemented by knowledge-based approaches such as Bayesian Networks or network and graph analyses. Together, such “integrative” studies are beginning to identify co-regulated gene networks linked with biological pathways and potentially modulating disease predisposition, outcome, and progression. Specifically, bioinformatics analyses of integrated microarray and genotyping data in cases and controls reveal changes in gene expression of both protein-coding and small and long regulatory RNAs; highlight relevant quantitative transcriptional differences between LOAD and non-demented control brains and demonstrate reconfiguration of functionally meaningful molecular interaction structures in LOAD. These may be measured as changes in connectivity in “hub nodes” of relevant gene networks (Zhang et al., 2013). We illustrate here the open analytical questions in the transcriptome investigation of neurodegenerative disease studies, proposing “ad hoc” strategies for the evaluation of differential gene expression and hints for a simple analysis of the non-coding RNA (ncRNA) part of such datasets. We then survey the emerging role of long ncRNAs (lncRNAs) in the healthy and diseased brain transcriptome and describe the main current methods for computational modeling of gene networks. We propose accessible modular and pathway-oriented methods and guidelines for bioinformatics investigations of whole transcriptome next generation sequencing datasets. We finally present methods and databases for functional interpretations of lncRNAs and

  13. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    PubMed

    Kim, Jinkyu; Kim, Gunn; An, Sungbae; Kwon, Young-Kyun; Yoon, Sungroh

    2013-01-01

    The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.
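
    The abstract does not say how the transfer entropies were estimated. As a minimal sketch of the quantity itself, the Python code below computes a plug-in estimate of TE(source -> target) for discrete series with a history length of one step. The function name, the one-step history, and the synthetic binary series are assumptions made for illustration; the significance testing and weighted-sum aggregation described above are not reproduced here.

```python
import random
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) ),
    where y1 is the next target value and y0, x0 are the current values.
    """
    if len(source) != len(target):
        raise ValueError("series must have equal length")
    triples = Counter(
        (target[t + 1], target[t], source[t]) for t in range(len(target) - 1)
    )
    total = sum(triples.values())
    pair_now = Counter()     # counts of (y0, x0)
    pair_target = Counter()  # counts of (y1, y0)
    single = Counter()       # counts of y0
    for (y1, y0, x0), count in triples.items():
        pair_now[(y0, x0)] += count
        pair_target[(y1, y0)] += count
        single[y0] += count
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_given_both = count / pair_now[(y0, x0)]
        p_given_self = pair_target[(y1, y0)] / single[y0]
        te += (count / total) * log2(p_given_both / p_given_self)
    return te


# Synthetic example: y is x delayed by one step, so x drives y but not vice versa.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
print(f"TE(x->y) = {te_xy:.3f} bits, TE(y->x) = {te_yx:.3f} bits")
```

    In this toy case the estimate is close to one bit in the driving direction and close to zero in the reverse direction, which is the asymmetry that makes transfer entropy useful for directed networks.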

  14. Bioinformatics analysis and construction of phylogenetic tree of aquaporins from Echinococcus granulosus.

    PubMed

    Wang, Fen; Ye, Bin

    2016-09-01

    Cystic echinococcosis, caused by the metacestode larvae of Echinococcus granulosus (Eg), is a chronic, worldwide, and severe zoonotic parasitosis. Treatment remains difficult because surgery does not suit all patients, and drugs can lead to serious adverse events as well as resistance. Screening for target proteins that interact with new anti-hydatidosis drugs is urgently needed to meet these challenges. Here, we analyzed the sequences and structural properties of E. granulosus aquaporins (EgAQPs) and constructed a phylogenetic tree using bioinformatics methods. The MIP family signature and protein kinase C phosphorylation sites were predicted in all nine EgAQPs. α-helix and random coil were the main secondary structures of EgAQPs. The number of transmembrane regions ranged from three to six, which indicated that EgAQPs contained multiple hydrophobic regions. A neighbor-joining tree indicated that EgAQPs were divided into two branches: seven EgAQPs formed a clade with human AQP1, a "strict" aquaporin, and the other two EgAQPs formed a clade with human AQP9, an aquaglyceroporin. Unfortunately, homology modeling of EgAQPs was aborted. These results provide a foundation for understanding and further research on the biological function of E. granulosus. PMID:27164831

  15. [Cloning and bioinformatics analysis of SLA-DR genes in Hunan Shaziling pigs].

    PubMed

    Tang, Yi-Ya; Xing, Xiao-Wei; Xue, Li-Qun; Huang, Sheng-Qiang; Wang, Wei

    2007-12-01

    This study aimed to clone the class II DRA and DRB genes of swine leukocyte antigen (SLA) in Hunan Shaziling pigs, to analyze their characteristics and polymorphism, and to provide basic immunological parameters for xenotransplantation from pigs to humans. SLA-DRA and SLA-DRB genes in two Shaziling pigs lacking porcine endogenous retrovirus (PERV) env-c were amplified by RT-PCR, cloned into PUCm-T vectors, sequenced, and analyzed with BLAST at NCBI and related software in ExPASy. The obtained SLA-DRA and SLA-DRB genes of Shaziling pigs were 1,177 and 909 nucleotides in length, with GenBank accession numbers EF143987 and EF143988. Bioinformatics analyses showed that both contain an open reading frame (ORF), encoding 252 and 266 amino acids, respectively. Comparing the ORF and protein sequences of the Shaziling SLA-DRA and SLA-DRB genes with their human counterparts, the homologies of the nucleotide sequences were 83% and 83%, and the homologies of the amino acid sequences were 83% and 79%, respectively. Further comparison with SLA sequences published in GenBank indicated that the SLA-DRB gene found in Shaziling pigs is polymorphic, while the homology of the SLA-DRA gene is up to 100%.

  16. Bioinformatics analysis and expression of a novel protein ROP48 in Toxoplasma gondii.

    PubMed

    Zhou, Jian; Wang, Lin; Zhou, Aihua; Lu, Gang; Li, Qihang; Wang, Zhilin; Zhu, Meiyan; Zhou, Huaiyu; Cong, Hua; He, Shenyi

    2016-03-01

    Toxoplasma gondii is an obligate intracellular apicomplexan parasite that can infect warm-blooded animals and humans all over the world. In the past years, ROP family genes encoding particular proteins of T. gondii have made a great contribution to toxoplasmosis. In this study, we used multiple bioinformatics approaches to predict the physical and chemical characteristics, transmembrane domain, epitope, and topological structure of the rhoptry protein 48 (ROP48). The results indicated that ROP48 protein was mainly located in the membrane and had several positive linear B-cell epitopes and Th-cell epitopes, which suggested that ROP48 is a potential DNA vaccine candidate against toxoplasmosis. Then the PCR product amplified from the ROP48 cDNA was inserted into a pEASY-T1 vector to build a recombinant cloning plasmid. After sequencing, ROP48 was subcloned into a eukaryotic expression plasmid pEGFP-C1 to obtain pEGFP-C1-ROP48 (pROP48). After identification by PCR and restriction enzyme digestion, the recombinant plasmid pROP48 was transfected into HEK 293-T cells and identified by RT-PCR. The results showed that the eukaryotic expression plasmid pROP48 was successfully constructed and transfected into HEK 293-T cells. Western blotting showed that the expressed proteins can be recognized by anti-STAg mouse sera. PMID:27078655

  17. Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

    PubMed

    Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

    2014-09-10

    Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals. PMID:25014137

  18. Combined expressional analysis, bioinformatics and targeted proteomics identify new potential therapeutic targets in glioblastoma stem cells

    PubMed Central

    Stangeland, Biljana; Mughal, Awais A.; Grieg, Zanina; Sandberg, Cecilie Jonsgar; Joel, Mrinal; Nygård, Ståle; Meling, Torstein; Murrell, Wayne; Vik Mo, Einar O.; Langmoen, Iver A.

    2015-01-01

    Glioblastoma (GBM) is both the most common and the most lethal primary brain tumor. It is thought that GBM stem cells (GSCs) are critically important in resistance to therapy. Therefore, there is a strong rationale to target these cells in order to develop new molecular therapies. To identify molecular targets in GSCs, we compared gene expression in GSCs to that in neural stem cells (NSCs) from the adult human brain, using microarrays. Bioinformatic filtering identified 20 genes (PBK/TOPK, CENPA, KIF15, DEPDC1, CDC6, DLG7/DLGAP5/HURP, KIF18A, EZH2, HMMR/RHAMM/CD168, NOL4, MPP6, MDM1, RAPGEF4, RHBDD1, FNDC3B, FILIP1L, MCC, ATXN7L4/ATXN7L1, P2RY5/LPAR6 and FAM118A) that were consistently expressed in GSC cultures and consistently not expressed in NSC cultures. The expression of these genes was confirmed in clinical samples (TCGA and REMBRANDT). The first nine genes were highly co-expressed in all GBM subtypes and were part of the same protein-protein interaction network. Furthermore, their combined up-regulation correlated negatively with patient survival in the mesenchymal GBM subtype. Using targeted proteomics and the COGNOSCENTE database we linked these genes to GBM signalling pathways. Nine genes: PBK, CENPA, KIF15, DEPDC1, CDC6, DLG7, KIF18A, EZH2 and HMMR should be further explored as targets for treatment of GBM. PMID:26295306

  19. Bioinformatics of cardiovascular miRNA biology.

    PubMed

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'.

  20. Expression and Bioinformatics Analysis of Pectate Lyase Gene from Bacillus subtilis521

    NASA Astrophysics Data System (ADS)

    Xiao, Jing; Lu, Fu-Ping; Li, Yu; Li, Jin-Ting

    In order to exploit new genetic resources, the pectate lyase (PEL) gene was amplified by PCR using genomic DNA from an alkaline Bacillus subtilis521. The PCR product was inserted into the pET22b(+) vector. The recombinant plasmids were cloned in E. coli DH5α and then expressed in E. coli BL21. When cultured in the optimized medium, the positive clones E. coli BL21(pET22b(+)pel) showed intracellular pectate lyase activity of 90.0 U/mL, indicating that the correct PEL gene had been obtained. The pel gene has an open reading frame of 1263 nucleotides and codes for a product of 420 amino acids with a calculated molecular mass of 45.5 kD. Computer-assisted analysis revealed a signal peptide and two conserved domains. Sequence analysis showed that PEL shares 26-82% homology with those of other strains in GenBank. In addition, the higher-order structure of PEL was predicted and analysed. This study will aid the experimental design of PEL fermentation, production, purification, and enzyme evolution.
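
    The reported ORF length, protein length and molecular mass can be cross-checked with simple arithmetic. The sketch below assumes that the 1263-nucleotide ORF includes the stop codon and uses the common rule-of-thumb average of about 110 Da per amino acid residue. Both are assumptions made here for illustration, not the calculation used by the authors.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def orf_protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for an ORF of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


length = orf_protein_length(1263)
mass = approx_mass_kda(length)
print(length, "residues, approximately", round(mass, 1), "kDa")
```

    The residue count matches the 420 amino acids stated in the abstract, and the rough mass estimate is in the same range as the reported 45.5 kD; an exact value depends on the actual amino acid composition.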

  1. Proteomic and bioinformatic analysis of mammalian SWI/SNF complexes identifies extensive roles in human malignancy.

    PubMed

    Kadoch, Cigall; Hargreaves, Diana C; Hodges, Courtney; Elias, Laura; Ho, Lena; Ranish, Jeff; Crabtree, Gerald R

    2013-06-01

    Subunits of mammalian SWI/SNF (mSWI/SNF or BAF) complexes have recently been implicated as tumor suppressors in human malignancies. To understand the full extent of their involvement, we conducted a proteomic analysis of endogenous mSWI/SNF complexes, which identified several new dedicated, stable subunits not found in yeast SWI/SNF complexes, including BCL7A, BCL7B and BCL7C, BCL11A and BCL11B, BRD9 and SS18. Incorporating these new members, we determined mSWI/SNF subunit mutation frequency in exome and whole-genome sequencing studies of primary human tumors. Notably, mSWI/SNF subunits are mutated in 19.6% of all human tumors reported in 44 studies. Our analysis suggests that specific subunits protect against cancer in specific tissues. In addition, mutations affecting more than one subunit, defined here as compound heterozygosity, are prevalent in certain cancers. Our studies demonstrate that mSWI/SNF is the most frequently mutated chromatin-regulatory complex (CRC) in human cancer, exhibiting a broad mutation pattern, similar to that of TP53. Thus, proper functioning of polymorphic BAF complexes may constitute a major mechanism of tumor suppression.

  2. Rapid cloning and bioinformatic analysis of spinach Y chromosome-specific EST sequences.

    PubMed

    Deng, Chuan-Liang; Zhang, Wei-Li; Cao, Ying; Wang, Shao-Jing; Li, Shu-Fen; Gao, Wu-Jun; Lu, Long-Dou

    2015-12-01

    The genome of the spinach single chromosome complement is about 1000 Mbp, and spinach is a model material for studying the molecular mechanisms of plant sex differentiation. Cytological study showed that the largest spinach chromosome (chromosome 1) is the sex chromosome. It has three sex-related alleles: X, X(m) and Y. Many researchers have tried to clone the sex-determining genes and investigate the molecular mechanism of spinach sex differentiation. However, no successful cloning of these genes has been reported. A new technology combining chromosome microdissection with hybridization-specific amplification (HSA) was adopted. The spinach Y chromosome degenerate oligonucleotide primed-PCR (DOP-PCR) products were hybridized with cDNA of the male spinach flowers in florescence. The female spinach genome was used as a blocker, and a cDNA library specifically expressed in the Y chromosome was constructed. Moreover, expressed sequence tag (EST) sequences in the cDNA library were cloned, sequenced and analysed by bioinformatics. A total of 63 valid EST sequences were obtained in this study, with fragment sizes between 53 and 486 bp. BLASTn homologous alignment indicated that 12 EST sequences had homologous nucleic acid sequences; the rest were new sequences. BLASTx homologous alignment indicated that 16 EST sequences had homologous protein-encoding nucleic acid sequences. The spinach Y chromosome-specific EST sequences lay the foundation for cloning the functional genes specifically expressed in the spinach Y chromosome. Meanwhile, the technology system established in this research provides a reference for rapid cloning of sex chromosome-specific EST sequences in other organisms.

  3. Critical genes in head and neck squamous cell carcinoma revealed by bioinformatic analysis of gene expression data.

    PubMed

    Wang, B; Wang, T; Cao, X L; Li, Y

    2015-12-21

    In this study, bioinformatic analysis of gene expression data of head and neck squamous cell carcinoma (HNSCC) was performed to identify critical genes. Gene expression data of HNSCC were downloaded from the Cancer Genome Atlas (TCGA) and differentially expressed genes were determined through significance analysis of microarrays. Protein-protein interaction networks were constructed and used to identify hub genes. Functional enrichment analysis was performed with DAVID. Relevant microRNAs, transcription factors, and small molecule drugs were predicted by the Fisher exact test. Survival analysis was performed with the Kaplan-Meier plot from a package for survival analysis in R. In the five groups of HNSCC patients, a total of 5946 DEGs were identified in group 1, 4575 DEGs in group 2, 5580 DEGs in group 3, 8017 DEGs in group 4, and 5469 DEGs in group 5. DEGs in the cell cycle and immune response were significantly over-represented. Five PPI networks were constructed from which hub genes were acquired, such as minichromosome maintenance complex component 7 (MCM7), MCM2, decorin (DCN), retinoblastoma 1 (RB1), and tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein gamma (YWHAG). No significant difference in survival was observed among the 5 groups; however, a significant difference existed between two combined groups (groups 1, 3, and 5 vs groups 2 and 4). Our study revealed critical genes in HNSCC, which could supplement the knowledge about the pathogenesis of HNSCC and provide clues for future therapy development.
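
    The abstract does not state how hub genes were selected from the protein-protein interaction networks. One common and simple criterion is node degree, that is, the number of distinct interaction partners. The Python sketch below applies that criterion to a toy edge list; the node names G1 to G5, the edges and the function names are placeholders invented for illustration, not data or tools from the study.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    # Treat (a, b) and (b, a) as the same edge and ignore self-interactions.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique_edges:
        for node in pair:
            degree[node] += 1
    return degree


def top_hubs(edges, k):
    """Return the k highest-degree nodes; ties are broken alphabetically."""
    degree = node_degrees(edges)
    return sorted(degree, key=lambda node: (-degree[node], node))[:k]


# Toy network with placeholder node names (not real interaction data).
toy_edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G3", "G1"), ("G4", "G5"),
]
degrees = node_degrees(toy_edges)
hubs = top_hubs(toy_edges, 2)
print(dict(degrees), hubs)
```

    Other centrality measures, such as betweenness, can rank nodes differently, so degree is only one possible definition of a hub.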

  4. Critical genes in head and neck squamous cell carcinoma revealed by bioinformatic analysis of gene expression data.

    PubMed

    Wang, B; Wang, T; Cao, X L; Li, Y

    2015-01-01

    In this study, bioinformatic analysis of gene expression data of head and neck squamous cell carcinoma (HNSCC) was performed to identify critical genes. Gene expression data of HNSCC were downloaded from the Cancer Genome Atlas (TCGA) and differentially expressed genes were determined through significance analysis of microarrays. Protein-protein interaction networks were constructed and used to identify hub genes. Functional enrichment analysis was performed with DAVID. Relevant microRNAs, transcription factors, and small molecule drugs were predicted by the Fisher exact test. Survival analysis was performed with the Kaplan-Meier plot from a package for survival analysis in R. In the five groups of HNSCC patients, a total of 5946 DEGs were identified in group 1, 4575 DEGs in group 2, 5580 DEGs in group 3, 8017 DEGs in group 4, and 5469 DEGs in group 5. DEGs in the cell cycle and immune response were significantly over-represented. Five PPI networks were constructed from which hub genes were acquired, such as minichromosome maintenance complex component 7 (MCM7), MCM2, decorin (DCN), retinoblastoma 1 (RB1), and tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein gamma (YWHAG). No significant difference in survival was observed among the 5 groups; however, a significant difference existed between two combined groups (groups 1, 3, and 5 vs groups 2 and 4). Our study revealed critical genes in HNSCC, which could supplement the knowledge about the pathogenesis of HNSCC and provide clues for future therapy development. PMID:26782382

  5. Diagnosis of an imprinted-gene syndrome by a novel bioinformatics analysis of whole-genome sequences from a family trio.

    PubMed

    Bodian, Dale L; Solomon, Benjamin D; Khromykh, Alina; Thach, Dzung C; Iyer, Ramaswamy K; Link, Kathleen; Baker, Robin L; Baveja, Rajiv; Vockley, Joseph G; Niederhuber, John E

    2014-11-01

    Whole-genome sequencing and whole-exome sequencing are becoming more widely applied in clinical medicine to help diagnose rare genetic diseases. Identification of the underlying causative mutations by genome-wide sequencing is greatly facilitated by concurrent analysis of multiple family members, most often the mother-father-proband trio, using bioinformatics pipelines that filter genetic variants by mode of inheritance. However, current pipelines are limited to Mendelian inheritance patterns and do not specifically address disorders caused by mutations in imprinted genes, such as forms of Angelman syndrome and Beckwith-Wiedemann syndrome. Using publicly available tools, we implemented a genetic inheritance search mode to identify imprinted-gene mutations. Application of this search mode to whole-genome sequences from a family trio led to a diagnosis for a proband for whom extensive clinical testing and Mendelian inheritance-based sequence analysis were nondiagnostic. The condition in this patient, IMAGe syndrome, is likely caused by the heterozygous mutation c.832A>G (p.Lys278Glu) in the imprinted gene CDKN1C. The genotypes and disease status of six members of the family are consistent with maternal expression of the gene, and allele-biased expression was confirmed by RNA-Seq for the heterozygotes. This analysis demonstrates that an imprinted-gene search mode is a valuable addition to genome sequence analysis pipelines for identifying disease-causative variants. PMID:25614875
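
    The abstract does not spell out the rule behind the imprinted-gene search mode. The Python sketch below shows one plausible form such a filter could take: keep heterozygous proband variants in imprinted genes when the variant allele is carried by the parent whose copy is expressed and is absent from the other parent. The `TrioCall` record, the alt-allele-count genotype encoding and the `imprinted_candidates` function are hypothetical names introduced here, and the example calls are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrioCall:
    """One variant call in a trio; genotypes are alt-allele counts (0, 1 or 2)."""
    gene: str
    proband: int
    mother: int
    father: int


def imprinted_candidates(calls, expressed_parent):
    """Filter trio calls under an assumed imprinted-gene inheritance rule.

    expressed_parent maps gene -> "maternal" or "paternal", i.e. which
    parental copy of the gene is expressed.
    """
    hits = []
    for call in calls:
        origin = expressed_parent.get(call.gene)
        if origin is None or call.proband != 1:
            continue  # not a known imprinted gene, or proband not heterozygous
        if origin == "maternal":
            carrier, other = call.mother, call.father
        else:
            carrier, other = call.father, call.mother
        # Require an unambiguous parent of origin for the variant allele.
        if carrier >= 1 and other == 0:
            hits.append(call)
    return hits


# Invented example calls; CDKN1C is treated as maternally expressed, as in the abstract.
expressed = {"CDKN1C": "maternal"}
calls = [
    TrioCall("CDKN1C", proband=1, mother=1, father=0),  # inherited on expressed copy
    TrioCall("CDKN1C", proband=1, mother=0, father=1),  # inherited on silent copy
    TrioCall("OTHER1", proband=1, mother=1, father=0),  # gene not in the imprinted list
]
hits = imprinted_candidates(calls, expressed)
print(hits)
```

    Under this rule a heterozygous carrier parent can be unaffected, because her own variant copy may have come from the parent whose allele is silenced. That is the pattern a purely Mendelian dominant or recessive filter would miss.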

  6. Diagnosis of an imprinted-gene syndrome by a novel bioinformatics analysis of whole-genome sequences from a family trio

    PubMed Central

    Bodian, Dale L; Solomon, Benjamin D; Khromykh, Alina; Thach, Dzung C; Iyer, Ramaswamy K; Link, Kathleen; Baker, Robin L; Baveja, Rajiv; Vockley, Joseph G; Niederhuber, John E

    2014-01-01

    Whole-genome sequencing and whole-exome sequencing are becoming more widely applied in clinical medicine to help diagnose rare genetic diseases. Identification of the underlying causative mutations by genome-wide sequencing is greatly facilitated by concurrent analysis of multiple family members, most often the mother–father–proband trio, using bioinformatics pipelines that filter genetic variants by mode of inheritance. However, current pipelines are limited to Mendelian inheritance patterns and do not specifically address disorders caused by mutations in imprinted genes, such as forms of Angelman syndrome and Beckwith–Wiedemann syndrome. Using publicly available tools, we implemented a genetic inheritance search mode to identify imprinted-gene mutations. Application of this search mode to whole-genome sequences from a family trio led to a diagnosis for a proband for whom extensive clinical testing and Mendelian inheritance-based sequence analysis were nondiagnostic. The condition in this patient, IMAGe syndrome, is likely caused by the heterozygous mutation c.832A>G (p.Lys278Glu) in the imprinted gene CDKN1C. The genotypes and disease status of six members of the family are consistent with maternal expression of the gene, and allele-biased expression was confirmed by RNA-Seq for the heterozygotes. This analysis demonstrates that an imprinted-gene search mode is a valuable addition to genome sequence analysis pipelines for identifying disease-causative variants. PMID:25614875

  7. Taking Bioinformatics to Systems Medicine.

    PubMed

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  8. Identification of drug candidate for osteoporosis by computational bioinformatics analysis of gene expression profile

    PubMed Central

    2013-01-01

    Background Osteoporosis is a condition of bones that leads to an increased susceptibility to fracture and consequent painful morbidity. It has become a major issue of life quality worldwide. However, until now, the molecular mechanism of this disease is far from being clear. Methods In this study, we obtained the gene expression profile of osteoporosis and controls from Gene Expression Omnibus and identified differentially expressed genes (DEGs) using classical t-test method. Then, functional enrichment analyses were performed to identify the dysregulated Gene Ontology categories and dysfunctional pathways in osteoporosis patients compared to controls. Besides, the connectivity map was used to identify compounds that induced inverse gene changes to osteoporosis. Results A total of 5581 DEGs were identified. We found these DEGs were enriched in 9 pathways by pathway enrichment analysis, including focal adhesion and MAPK signaling pathway. Besides, sanguinarine was identified as a potential therapeutic drug candidate capable of targeting osteoporosis. Conclusion Although candidate agents identified by our approach may be premature for clinical trials, it is clearly a direction that warrants additional consideration. PMID:23448234
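
    As a rough illustration of the t-test step mentioned in the Methods, the sketch below computes a two-sample Student's t statistic per gene using only the Python standard library. The expression values and gene labels are hypothetical, and the equal-variance (pooled) form is an assumption; the abstract does not say which variant of the test was used. Turning the statistic into a p-value requires a t-distribution, which a statistics library would provide.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance
    (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a)
              + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


# Hypothetical expression values: gene -> (patient samples, control samples).
expression = {
    "gene_1": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "gene_2": ([2.0, 2.1, 1.9], [2.0, 1.9, 2.1]),
}

t_scores = {gene: student_t(case, control)
            for gene, (case, control) in expression.items()}

for gene, t in t_scores.items():
    print(gene, round(t, 3))
```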

  9. Bioinformatic analysis of miRNA expression patterns in TFF2 knock-out mice.

    PubMed

    Yin, Y; Shan, H Q; Huang, W; Wu, Y M; Lu, H; Jin, Y

    2014-10-20

    Trefoil factors, which bear a unique 3-loop trefoil domain, are a family of small secretory protease-resistant peptides (7-12 kDa) discovered in the 1980s. Trefoil factor 2 (TFF2) is a unique member of the trefoil factor family that plays important roles in gastrointestinal mucosal defense and repair. However, few studies have characterized the miRNA expression patterns in TFF2 knock-out mice. In this study, we investigated the regulatory role of miRNAs in TFF2 knock-out mice. Whole miRNome profiles for TFF2 knock-out mice and wild-type mice were downloaded from the Gene Expression Omnibus database. A total of 14 differentially expressed miRNAs were identified using the limma package. Target genes for 2 differentially expressed miRNAs were retrieved from 2 databases. After mapping these target genes into STRING, an interaction network was constructed. Gene Ontology analysis suggested that the differentially expressed miRNAs are involved in cyclic AMP metabolism and the growth process. Additionally, dysregulated miRNAs target the transforming growth factor-beta signaling and cytokine-cytokine receptor interaction pathways. Our results suggest that miRNAs may play important regulatory roles in processes involving TFF2, particularly in the regulation of signal transduction pathways. However, further validation of our results is needed.

  10. A Bioinformatic Strategy for the Detection, Classification and Analysis of Bacterial Autotransporters

    PubMed Central

    Celik, Nermin; Webb, Chaille T.; Leyton, Denisse L.; Holt, Kathryn E.; Heinz, Eva; Gorrell, Rebecca; Kwok, Terry; Naderer, Thomas; Strugnell, Richard A.; Speed, Terence P.; Teasdale, Rohan D.; Likić, Vladimir A.; Lithgow, Trevor

    2012-01-01

    Autotransporters are secreted proteins that are assembled into the outer membrane of bacterial cells. The passenger domains of autotransporters are crucial for bacterial pathogenesis, with some remaining attached to the bacterial surface while others are released by proteolysis. An enigma remains as to whether autotransporters should be considered a class of secretion system, or simply a class of substrate with peculiar requirements for their secretion. We sought to establish a sensitive search protocol that could identify and characterize diverse autotransporters from bacterial genome sequence data. The new sequence analysis pipeline identified more than 1500 autotransporter sequences from diverse bacteria, including numerous species of Chlamydiales and Fusobacteria as well as all classes of Proteobacteria. Interrogation of the proteins revealed that there are numerous classes of passenger domains beyond the known proteases, adhesins and esterases. In addition, the barrel-domain, a characteristic feature of autotransporters, was found to be composed of seven conserved sequence segments that can be arranged in multiple ways in the tertiary structure of the assembled autotransporter. One of these conserved motifs overlays the targeting information required for autotransporters to reach the outer membrane. Another conserved and diagnostic motif maps to the linker region between the passenger domain and barrel-domain, indicating it as an important feature in the assembly of autotransporters. PMID:22905239

  11. Construction of a public CHO cell line transcript database using versatile bioinformatics analysis pipelines.

    PubMed

    Rupp, Oliver; Becker, Jennifer; Brinkrolf, Karina; Timmermann, Christina; Borth, Nicole; Pühler, Alfred; Noll, Thomas; Goesmann, Alexander

    2014-01-01

    Chinese hamster ovary (CHO) cell lines represent the most commonly used mammalian expression system for the production of therapeutic proteins. In this context, detailed knowledge of the CHO cell transcriptome might help to improve biotechnological processes conducted by specific cell lines. Nevertheless, very few assembled cDNA sequences of CHO cells were publicly released until recently, which puts a severe limitation on biotechnological research. Two extended annotation systems and web-based tools, one for browsing eukaryotic genomes (GenDBE) and one for viewing eukaryotic transcriptomes (SAMS), were established as the first step towards a publicly usable CHO cell genome/transcriptome analysis platform. This is complemented by the development of a new strategy to assemble the ca. 100 million reads, sequenced from a broad range of diverse transcripts, to a high quality CHO cell transcript set. The cDNA libraries were constructed from different CHO cell lines grown under various culture conditions and sequenced using Roche/454 and Illumina sequencing technologies in addition to sequencing reads from a previous study. Two pipelines to extend and improve the CHO cell line transcripts were established. First, de novo assemblies were carried out with the Trinity and Oases assemblers, using varying k-mer sizes. The resulting contigs were screened for potential CDS using ESTScan. Redundant contigs were filtered out using cd-hit-est. The remaining CDS contigs were re-assembled with CAP3. Second, a reference-based assembly with the TopHat/Cufflinks pipeline was performed, using the recently published draft genome sequence of CHO-K1 as reference. Additionally, the de novo contigs were mapped to the reference genome using GMAP and merged with the Cufflinks assembly using the cuffmerge software. With this approach 28,874 transcripts located on 16,492 gene loci could be assembled. Combining the results of both approaches, 65,561 transcripts were identified for CHO cell lines

  12. Bioinformatics investigation of therapeutic mechanisms of Xuesaitong capsule treating ischemic cerebrovascular rat model with comparative transcriptome analysis

    PubMed Central

    Liao, Jiangquan; Wei, Benjun; Chen, Hengwen; Liu, Yongmei; Wang, Jie

    2016-01-01

    Background: Xuesaitong soft capsule (XST), which consists of Panax notoginseng saponin (PNS), has been used to treat ischemic cerebrovascular diseases in China. The therapeutic mechanism of XST has not yet been elucidated from the perspective of genomics and bioinformatics. Methods: A transcriptome analysis was performed to review series concerning the middle cerebral artery occlusion (MCAO) rat model and XST intervention after MCAO from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were compared between the blank group and model group, and between the model group and XST group. Functional enrichment and pathway analysis were performed. A protein-protein interaction (PPI) network was constructed. The overlapping genes from the two DEG sets were screened out and in-depth analysis was performed. Results: Two series including 22 samples were obtained. 870 DEGs were identified between the blank group and model group, and 1189 DEGs were identified between the model group and XST group. GO terms and KEGG pathways of MCAO and XST intervention were significantly enriched. PPI networks were constructed to demonstrate the gene-gene interactions. The overlapping genes from the two DEG sets were highlighted. ANTXR2, FHL3, PRCP, TYROBP, TAF9B, FGFR2, BCL11B, RB1CC1 and MBNL2 were the pivotal genes and possible action sites of XST therapeutic mechanisms. Conclusion: MCAO is a pathological process with multiple. PMID:27347353

  13. Screening of gene signatures for rheumatoid arthritis and osteoarthritis based on bioinformatics analysis

    PubMed Central

    He, Peiheng; Zhang, Ziji; Liao, Weiming; Xu, Dongliang; Fu, Ming; Kang, Yan

    2016-01-01

    The current study aimed to identify gene signatures during rheumatoid arthritis (RA) and osteoarthritis (OA), and used these to elucidate the underlying modular mechanisms. Using the Gene Expression Omnibus database, the present study obtained the GSE7669 mRNA expression microarray data from RA and OA synovial fibroblasts (n=6 each). The differentially expressed genes (DEGs) in RA synovial samples compared with OA samples were identified using the Linear Models for Microarray Analysis package. The Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were performed using the Database for Annotation Visualization and Integrated Discovery. A protein-protein interaction network was constructed and the modules were further analyzed using the Molecular Complex Detection plugin of Cytoscape. A total of 181 DEGs were identified by comparing RA and OA synovial samples (96 up- and 85 downregulated genes). The significant DEGs in module 1, including collagen, type I, α 1 (COL1A1), COL3A1, COL4A1 and COL11A1, were predominantly enriched in the extracellular matrix (ECM)-receptor interaction and focal adhesion pathways. Additionally, significant DEGs in module 2, including radical S-adenosyl methionine domain containing 2 (RSAD2), 2′-5′-oligoadenylate synthetase 2 (OAS2), myxovirus (influenza virus) resistance 1 (MX1) and ISG15 ubiquitin-like modifier (ISG15), were predominantly associated with immune function pathways. In conclusion, the present study indicated that RSAD2, OAS2, MX1 and ISG15 may be notable gene signatures in RA development via regulation of the immune response. COL3A1, COL4A1, COL1A1 and COL11A1 may be important gene signatures in OA development via involvement in the pathways of ECM-receptor interactions and focal adhesions. PMID:27356888

  14. Screening of gene signatures for rheumatoid arthritis and osteoarthritis based on bioinformatics analysis.

    PubMed

    He, Peiheng; Zhang, Ziji; Liao, Weiming; Xu, Dongliang; Fu, Ming; Kang, Yan

    2016-08-01

    The current study aimed to identify gene signatures during rheumatoid arthritis (RA) and osteoarthritis (OA), and used these to elucidate the underlying modular mechanisms. Using the Gene Expression Omnibus database, the present study obtained the GSE7669 mRNA expression microarray data from RA and OA synovial fibroblasts (n=6 each). The differentially expressed genes (DEGs) in RA synovial samples compared with OA samples were identified using the Linear Models for Microarray Analysis package. The Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were performed using the Database for Annotation Visualization and Integrated Discovery. A protein‑protein interaction network was constructed and the modules were further analyzed using the Molecular Complex Detection plugin of Cytoscape. A total of 181 DEGs were identified by comparing RA and OA synovial samples (96 up‑ and 85 downregulated genes). The significant DEGs in module 1, including collagen, type I, α 1 (COL1A1), COL3A1, COL4A1 and COL11A1, were predominantly enriched in the extracellular matrix (ECM)‑receptor interaction and focal adhesion pathways. Additionally, significant DEGs in module 2, including radical S‑adenosyl methionine domain containing 2 (RSAD2), 2'‑5'‑oligoadenylate synthetase 2 (OAS2), myxovirus (influenza virus) resistance 1 (MX1) and ISG15 ubiquitin‑like modifier (ISG15), were predominantly associated with immune function pathways. In conclusion, the present study indicated that RSAD2, OAS2, MX1 and ISG15 may be notable gene signatures in RA development via regulation of the immune response. COL3A1, COL4A1, COL1A1 and COL11A1 may be important gene signatures in OA development via involvement in the pathways of ECM-receptor interactions and focal adhesions. PMID:27356888

  15. Bioinformatics analysis of transcriptional regulation of circadian genes in rat liver

    PubMed Central

    2014-01-01

    Background The circadian clock is a critical regulator of biological functions controlling behavioral, physiological and biochemical processes. Because the liver is the primary regulator of metabolites within the mammalian body and the disruption of circadian rhythms in liver is associated with severe illness, circadian regulators would play a strong role in maintaining liver function. However, the regulatory structure that governs circadian dynamics within the liver at a transcriptional level remains unknown. To explore this aspect, we analyzed hepatic transcriptional dynamics in Sprague-Dawley rats over a period of 24 hours to assess the genome-wide responses. Results Using an unsupervised consensus clustering method, we identified four major gene expression clusters, corresponding to central carbon and nitrogen metabolism, membrane integrity, immune function, and DNA repair, all of which have dynamics which suggest regulation in a circadian manner. With the assumption that transcription factors (TFs) that are differentially expressed and contain CLOCK:BMAL1 binding sites on their proximal promoters are likely to be clock-controlled TFs, we were able to use promoter analysis to putatively identify additional clock-controlled TFs besides PARF and RORA families. These TFs are both functionally and temporally related to the clusters they regulate. Furthermore, we also identified significant sets of clock TFs that are potentially transcriptional regulators of gene clusters. Conclusions All together, we were able to propose a regulatory structure for circadian regulation which represents alternative paths for circadian control of different functions within the liver. Our prediction has been affirmed by functional and temporal analyses which are able to extend for similar studies. PMID:24666587

  16. Coupling in silico and in vitro analysis of peptide-MHC binding: a bioinformatic approach enabling prediction of superbinding peptides and anchorless epitopes.

    PubMed

    Doytchinova, Irini A; Walshe, Valerie A; Jones, Nicola A; Gloster, Simone E; Borrow, Persephone; Flower, Darren R

    2004-06-15

    The ability to define and manipulate the interaction of peptides with MHC molecules has immense immunological utility, with applications in epitope identification, vaccine design, and immunomodulation. However, the methods currently available for prediction of peptide-MHC binding are far from ideal. We recently described the application of a bioinformatic prediction method based on quantitative structure-affinity relationship methods to peptide-MHC binding. In this study we demonstrate the predictivity and utility of this approach. We determined the binding affinities of a set of 90 nonamer peptides for the MHC class I allele HLA-A*0201 using an in-house, FACS-based, MHC stabilization assay, and from these data we derived an additive quantitative structure-affinity relationship model for peptide interaction with the HLA-A*0201 molecule. Using this model we then designed a series of high affinity HLA-A2-binding peptides. Experimental analysis revealed that all these peptides showed high binding affinities to the HLA-A*0201 molecule, significantly higher than the highest previously recorded. In addition, by the use of systematic substitution at principal anchor positions 2 and 9, we showed that high binding peptides are tolerant to a wide range of nonpreferred amino acids. Our results support a model in which the affinity of peptide binding to MHC is determined by the interactions of amino acids at multiple positions with the MHC molecule and may be enhanced by enthalpic cooperativity between these component interactions.

  17. Coupling in silico and in vitro analysis of peptide-MHC binding: a bioinformatic approach enabling prediction of superbinding peptides and anchorless epitopes.

    PubMed

    Doytchinova, Irini A; Walshe, Valerie A; Jones, Nicola A; Gloster, Simone E; Borrow, Persephone; Flower, Darren R

    2004-06-15

    The ability to define and manipulate the interaction of peptides with MHC molecules has immense immunological utility, with applications in epitope identification, vaccine design, and immunomodulation. However, the methods currently available for prediction of peptide-MHC binding are far from ideal. We recently described the application of a bioinformatic prediction method based on quantitative structure-affinity relationship methods to peptide-MHC binding. In this study we demonstrate the predictivity and utility of this approach. We determined the binding affinities of a set of 90 nonamer peptides for the MHC class I allele HLA-A*0201 using an in-house, FACS-based, MHC stabilization assay, and from these data we derived an additive quantitative structure-affinity relationship model for peptide interaction with the HLA-A*0201 molecule. Using this model we then designed a series of high affinity HLA-A2-binding peptides. Experimental analysis revealed that all these peptides showed high binding affinities to the HLA-A*0201 molecule, significantly higher than the highest previously recorded. In addition, by the use of systematic substitution at principal anchor positions 2 and 9, we showed that high binding peptides are tolerant to a wide range of nonpreferred amino acids. Our results support a model in which the affinity of peptide binding to MHC is determined by the interactions of amino acids at multiple positions with the MHC molecule and may be enhanced by enthalpic cooperativity between these component interactions. PMID:15187128
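
    The additive structure-affinity model mentioned in this abstract can be sketched as a baseline plus one contribution per residue and position. The intercept, the contribution table and the example peptides below are hypothetical placeholders, not the published coefficients; residues missing from the table are treated as contributing nothing.

```python
# Hypothetical per-position contributions (arbitrary affinity units).
# Keys are (position, amino acid), with positions numbered 1 to 9.
CONTRIBUTIONS = {
    (2, "L"): 0.5,
    (2, "M"): 0.3,
    (9, "V"): 0.4,
    (9, "L"): 0.2,
}
INTERCEPT = 5.0  # hypothetical baseline affinity


def predict_affinity(peptide):
    """Additive model: baseline plus the sum of independent
    position-specific residue contributions for a nonamer."""
    if len(peptide) != 9:
        raise ValueError("the model is defined for nonamer peptides only")
    return INTERCEPT + sum(
        CONTRIBUTIONS.get((position, residue), 0.0)
        for position, residue in enumerate(peptide, start=1)
    )


# Hypothetical peptides: one with tabulated anchors at positions 2 and 9,
# one with residues absent from the table at both anchors.
print(predict_affinity("ALAAAAAAV"))
print(predict_affinity("AGAAAAAAG"))
```

    Because every position contributes independently, a model of this form cannot represent the cooperativity between positions that the abstract raises as a possible enhancement.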

  18. Identification of Differentially Expressed Genes in Kawasaki Disease Patients as Potential Biomarkers for IVIG Sensitivity by Bioinformatics Analysis.

    PubMed

    He, Lan; Sheng, Youyu; Huang, Chunyun; Huang, Guoying

    2016-08-01

    Kawasaki disease (KD) is a leading cause of acquired heart disease predominantly affecting infants and young children. Intravenous immunoglobulin (IVIG) is applied as the most favorable treatment against KD, but IVIG resistance still occurs. Although several clinical scoring systems have been developed to identify children at highest risk of IVIG resistance, there is a need to identify sufficiently sensitive biomarkers for IVIG treatment. Some differentially expressed genes (DEGs) could be promising potential biomarkers for IVIG-related sensitivity diagnosis. We employed a systematic and integrative bioinformatics framework to identify such genes. The performance of the candidate genes was evaluated by hierarchical clustering, ROC analysis and literature mining. By analyzing three datasets of KD patients, 34 DEGs of the three groups were found to be associated with IVIG-related sensitivity. A module of 12 genes could predict resistant group patients with high accuracy, and a module of ten genes could predict responsive group patients effectively with an accuracy of 96%. Three of them are most likely to serve as drug targets or diagnostic biomarkers in the future. Compared with unsupervised hierarchical clustering analysis, our modules could distinguish IVIG-resistant patients efficiently. Two groups of DEGs could predict IVIG-related sensitivity with high accuracy and are potential biomarkers for the clinical diagnosis and prediction of IVIG treatment response in KD patients, improving the prognosis of patients.

  19. Identification and Characterization of miRNAs in Chondrus crispus by High-Throughput Sequencing and Bioinformatics Analysis

    PubMed Central

    Gao, Fan; Nan, FangRu; Song, Wei; Feng, Jia; Lv, JunPing; Xie, ShuLian

    2016-01-01

    Chondrus crispus, an economically and medicinally important red alga, is a medicinally active substance and important for anti-tumor research. In this study, 117 C. crispus miRNAs (108 conserved and 9 novel) were identified from 2,416,181 small-RNA reads using high-throughput sequencing and bioinformatics methods. According to the BLAST search against the miRBase database, these miRNAs belonged to 110 miRNA families. Sequence alignment combined with homology searching revealed both the conservation and diversity of predicted potential miRNA families in different plant species. Four and 19 randomly selected miRNAs were validated by northern blotting and stem-loop quantitative real-time reverse transcription polymerase chain reaction detection, respectively. The validation rates (75% and 94.7%) demonstrated that most of the identified miRNAs could be credible. A total of 160 potential target genes were predicted and functionally annotated by Gene Ontology analysis and Kyoto Encyclopedia of Genes and Genomes analysis. We also analyzed the interrelationship of miRNAs, miRNA-target genes and target genes in C. crispus by constructing a Cytoscape network. The 117 miRNAs identified in our study should supply large quantities of information that will be important for red algae small RNA research. PMID:27193824
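
    The two validation rates quoted above can be checked with simple arithmetic. The validated counts used below (3 of 4 and 18 of 19) are inferred from the reported percentages and are not stated in the abstract.

```python
def validation_rate(validated, tested):
    """Percentage of tested miRNAs that were experimentally confirmed."""
    return 100 * validated / tested


# Counts inferred from the reported rates (an assumption, not source data).
northern_blot = validation_rate(3, 4)
stem_loop_qrt_pcr = validation_rate(18, 19)

print(round(northern_blot, 1))      # 75.0
print(round(stem_loop_qrt_pcr, 1))  # 94.7
```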

  20. Identification and Characterization of miRNAs in Chondrus crispus by High-Throughput Sequencing and Bioinformatics Analysis.

    PubMed

    Gao, Fan; Nan, FangRu; Song, Wei; Feng, Jia; Lv, JunPing; Xie, ShuLian

    2016-01-01

    Chondrus crispus, an economically and medicinally important red alga, is a medicinally active substance and important for anti-tumor research. In this study, 117 C. crispus miRNAs (108 conserved and 9 novel) were identified from 2,416,181 small-RNA reads using high-throughput sequencing and bioinformatics methods. According to the BLAST search against the miRBase database, these miRNAs belonged to 110 miRNA families. Sequence alignment combined with homology searching revealed both the conservation and diversity of predicted potential miRNA families in different plant species. Four and 19 randomly selected miRNAs were validated by northern blotting and stem-loop quantitative real-time reverse transcription polymerase chain reaction detection, respectively. The validation rates (75% and 94.7%) demonstrated that most of the identified miRNAs could be credible. A total of 160 potential target genes were predicted and functionally annotated by Gene Ontology analysis and Kyoto Encyclopedia of Genes and Genomes analysis. We also analyzed the interrelationship of miRNAs, miRNA-target genes and target genes in C. crispus by constructing a Cytoscape network. The 117 miRNAs identified in our study should supply large quantities of information that will be important for red algae small RNA research. PMID:27193824

  1. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

  2. Bringing Web 2.0 to bioinformatics.

    PubMed

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  3. Bioinformatics-Aided Venomics

    PubMed Central

    Kaas, Quentin; Craik, David J.

    2015-01-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future. PMID:26110505

  4. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  5. Bioinformatics and the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  6. Identification of potential therapeutic target genes and mechanisms in head and neck squamous cell carcinoma by bioinformatics analysis

    PubMed Central

    KUANG, JING; ZHAO, MEI; LI, HUILIAN; DANG, WEI; LI, WEI

    2016-01-01

    The present study aimed to identify the potential target genes and underlying molecular mechanisms involved in head and neck squamous cell carcinoma (HNSCC) by bioinformatics analysis. Microarray data of a Gene Expression Omnibus series GSE6631 was downloaded from the Gene Expression Omnibus database, which was generated from paired samples of HNSCC and normal tissue from 22 patients, and was used to identify differentially expressed genes (DEGs). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes enrichment analyses were performed to investigate the functions of the identified DEGs. Furthermore, the protein-protein interaction (PPI) network of these DEGs was constructed using Cytoscape software. Between HNSCC and normal samples there was a difference in 419 DEGs, including 196 upregulated and 223 downregulated genes. The upregulated DEGs were mainly enriched in GO terms of cell adhesion, extracellular matrix (ECM) organization and collagen metabolic process, while the downregulated DEGs were mainly associated with epidermis development and epidermal cell differentiation. The DEGs were enriched in pathways such as ECM-receptor interaction, focal adhesion and drug metabolism. Fibronectin 1 (FN1), epidermal growth factor receptor (EGFR), collagen type I alpha 1 (COL1A1) and matrix metallopeptidase-9 (MMP-9) were hub nodes in the PPI network. These results suggested that cell adhesion and drug metabolism may be associated with HNSCC development, and genes such as FN1, EGFR, COL4A1 and MMP-9 may be potential therapeutic target genes in HNSCC. PMID:27123054

  7. A bioinformatics analysis of Lamin-A regulatory network: a perspective on epigenetic involvement in Hutchinson-Gilford progeria syndrome.

    PubMed

    Arancio, Walter

    2012-04-01

    Hutchinson-Gilford progeria syndrome (HGPS) is a rare human genetic disease that leads to premature aging. HGPS is caused by mutation in the Lamin-A (LMNA) gene that leads, in affected young individuals, to the accumulation of the progerin protein, usually present only in aging differentiated cells. Bioinformatics analyses of the network of interactions of the LMNA gene and transcripts are presented. The LMNA gene network has been analyzed using the BioGRID database (http://thebiogrid.org/) and related analysis tools such as Osprey (http://biodata.mshri.on.ca/osprey/servlet/Index) and GeneMANIA (http://genemania.org/). The network of interaction of LMNA transcripts has been further analyzed following the competing endogenous RNA (ceRNA) hypothesis (RNA cross-talk via microRNAs [miRNAs]) and using the miRWalk database and tools (www.ma.uni-heidelberg.de/apps/zmf/mirwalk/). These analyses suggest particular relevance of epigenetic modifiers (via acetylase complexes and specifically HTATIP histone acetylase) and adenosine triphosphate (ATP)-dependent chromatin remodelers (via pBAF, BAF, and SWI/SNF complexes).

  8. Ossification of the posterior longitudinal ligament related genes identification using microarray gene expression profiling and bioinformatics analysis.

    PubMed

    He, Hailong; Mao, Lingzhou; Xu, Peng; Xi, Yanhai; Xu, Ning; Xue, Mingtao; Yu, Jiangming; Ye, Xiaojian

    2014-01-10

    Ossification of the posterior longitudinal ligament (OPLL) is a disease characterized by physical impairment and neurological disorders. The objective of this study was to explore the differentially expressed genes (DEGs) in OPLL patient ligament cells and to identify potential targets for the clinical prevention and treatment of OPLL. Gene expression data set GSE5464 was downloaded from Gene Expression Omnibus; DEGs were then screened with the limma package in R, and functions and pathways altered in OPLL cells compared with normal cells were identified with DAVID (The Database for Annotation, Visualization and Integrated Discovery); finally, an interaction network of DEGs was constructed using STRING. A total of 1536 DEGs were screened, with 31 down-regulated and 1505 up-regulated genes. The response to wounding function and the Toll-like receptor signaling pathway may be involved in the development of OPLL. Genes such as PDGFB and PRDX2 may be involved in OPLL through the response to wounding function. Genes enriched in the Toll-like receptor signaling pathway, such as TLR1, TLR5, and TLR7, may be involved in spinal cord injury in OPLL. PIK3R1 was the hub gene with the highest degree in the DEG network, and INSR was one of the genes most closely related to it. OPLL-related genes screened by microarray gene expression profiling and bioinformatics analysis may be helpful for elucidating the mechanism of OPLL.
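    The hub gene in the abstract above is defined by node degree, that is, the number of interaction partners a gene has in the network. A minimal sketch of that calculation follows. The edge list uses placeholder gene names and is invented for illustration; it is not the network from the study, and `node_degrees` is a hypothetical helper.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    # Treat (a, b) and (b, a) as the same edge and ignore self-loops.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for pair in unique_edges:
        for node in pair:
            degrees[node] += 1
    return degrees


# Invented toy network with placeholder names (not data from the study).
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_D", "GENE_E"),
    ("GENE_B", "GENE_A"),  # duplicate of the first edge, counted once
]

degrees = node_degrees(toy_edges)
hub_gene, hub_degree = degrees.most_common(1)[0]
print(hub_gene, hub_degree)
```

    Degree is only one of several centrality measures; the abstract names degree, so the sketch uses it.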

  9. A bioinformatics insight to rhizobial globins: gene identification and mapping, polypeptide sequence and phenetic analysis, and protein modeling.

    PubMed

    Gesto-Borroto, Reinier; Sánchez-Sánchez, Miriam; Arredondo-Peter, Raúl

    2015-01-01

    Globins (Glbs) are proteins widely distributed in organisms. Three evolutionary families have been identified in Glbs: the M, S and T Glb families. The M Glbs include flavohemoglobins (fHbs) and single-domain Glbs (SDgbs); the S Glbs include globin-coupled sensors (GCSs), protoglobins and sensor single domain globins, and the T Glbs include truncated Glbs (tHbs). Structurally, the M and S Glbs exhibit 3/3-folding whereas the T Glbs exhibit 2/2-folding. Glbs are widespread in bacteria, including several rhizobial genomes. However, only a few rhizobial Glbs have been characterized. Hence, we characterized Glbs from 62 rhizobial genomes using bioinformatics methods such as data mining in databases, sequence alignment, phenogram construction and protein modeling. Also, we analyzed soluble extracts from Bradyrhizobium japonicum USDA38 and USDA58 by (reduced + carbon monoxide (CO) minus reduced) differential spectroscopy. Database searching showed that only fhb, sdgb, gcs and thb genes exist in the rhizobia analyzed in this work. Promoter analysis revealed that apparently several rhizobial glb genes are not regulated by a -10 promoter but might be regulated by -35 and Fnr (fumarate-nitrate reduction regulator)-like promoters. Mapping analysis revealed that rhizobial fhbs and thbs are flanked by a variety of genes whereas several rhizobial sdgbs and gcss are flanked by genes coding for proteins involved in the metabolism of nitrates and nitrites and chemotaxis, respectively. Phenetic analysis showed that rhizobial Glbs segregate into the M, S and T Glb families, while structural analysis showed that the predicted rhizobial SDgbs, fHbs and GCS globin domains adopt the 3/3-folding whereas the tHbs adopt the 2/2-folding. Spectra from B. japonicum USDA38 and USDA58 soluble extracts exhibited peaks and troughs characteristic of bacterial and vertebrate Glbs, thus indicating that putative Glbs are synthesized in B. japonicum USDA38 and USDA58. PMID:26594329

  10. Identification and characterization of microRNAs in Eucheuma denticulatum by high-throughput sequencing and bioinformatics analysis.

    PubMed

    Gao, Fan; Nan, Fangru; Feng, Jia; Lv, Junping; Liu, Qi; Xie, Shulian

    2016-01-01

    Eucheuma denticulatum, an economically and industrially important red alga, is a valuable marine resource. Although microRNAs (miRNAs) play an essential role in gene post-transcriptional regulation, no research has been conducted to identify and characterize miRNAs in E. denticulatum. In this study, we identified 134 miRNAs (133 conserved miRNAs and one novel miRNA) from 2,997,135 small-RNA reads by high-throughput sequencing combined with bioinformatics analysis. BLAST searching against miRBase uncovered 126 potential miRNA families. A conservation and diversity analysis of predicted miRNA families in different plant species was performed by comparative alignment and homology searching. A total of 4 and 13 randomly selected miRNAs were respectively validated by northern blotting and stem-loop reverse transcription PCR, thereby demonstrating the reliability of the miRNA sequencing data. Altogether, 871 potential target genes were predicted using psRobot and TargetFinder. Target gene classification and enrichment were conducted based on Gene Ontology analysis. The functions of target gene products and associated metabolic pathways were predicted by Kyoto Encyclopedia of Genes and Genomes pathway analysis. A Cytoscape network was constructed to explore the interrelationships of miRNAs, miRNA-target genes and target genes. The large number of miRNAs with diverse target genes will be important for further understanding of some essential biological processes in E. denticulatum. The uncovered information can serve as an important reference for the protection and utilization of this unique red alga in the future. PMID:26717154

  11. Bioinformatics and Moonlighting Proteins.

    PubMed

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially given the large number of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains mainly detect the canonical function but usually fail to detect the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations - it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  12. Additional EIPC Study Analysis. Final Report

    SciTech Connect

    Hadley, Stanton W; Gotham, Douglas J.; Luciani, Ralph L.

    2014-12-01

    Between 2010 and 2012 the Eastern Interconnection Planning Collaborative (EIPC) conducted a major long-term resource and transmission study of the Eastern Interconnection (EI). With guidance from a Stakeholder Steering Committee (SSC) that included representatives from the Eastern Interconnection States Planning Council (EISPC) among others, the project was conducted in two phases. Phase 1 involved a long-term capacity expansion analysis that involved creation of eight major futures plus 72 sensitivities. Three scenarios were selected for more extensive transmission-focused evaluation in Phase 2. Five power flow analyses, nine production cost model runs (including six sensitivities), and three capital cost estimations were developed during this second phase. The results from Phase 1 and 2 provided a wealth of data that could be examined further to address energy-related questions. A list of 14 topics was developed for further analysis. This paper brings together the earlier interim reports of the first 13 topics plus one additional topic into a single final report.

  13. The 2011 Bioinformatics Links Directory update: more resources, tools and databases and features to empower the bioinformatics community.

    PubMed

    Brazas, Michelle D; Yim, David S; Yamada, Joseph T; Ouellette, B F Francis

    2011-07-01

    The Bioinformatics Links Directory continues its collaboration with Nucleic Acids Research to publish and compile a freely accessible, online collection of tools, databases and resource materials for bioinformatics and molecular biology research. The July 2011 Web Server issue of Nucleic Acids Research adds 78 web server tools and 14 updates to the directory at http://bioinformatics.ca/links_directory/.

  14. Bioinformatics analysis of differentially expressed pathways related to the metastatic characteristics of osteosarcoma

    PubMed Central

    Sun, Wei; Ma, Xiaojun; Shen, Jiakang; Yin, Fei; Wang, Chongren; Cai, Zhengdong

    2016-01-01

    In this study, gene expression data of osteosarcoma (OSA) were analyzed to identify metastasis-related biological pathways. Four gene expression data sets (GSE21257, GSE9508, GSE49003 and GSE66673) were downloaded from Gene Expression Omnibus (GEO). An analysis of differentially expressed genes (DEGs) was performed using the Significance Analysis of Microarray (SAM) method. Gene expression levels were converted into scores of pathways by the Functional Analysis of Individual Microarray Expression (FAIME) algorithm and the differentially expressed pathways (DEPs) were then disclosed by a t-test. The distinguishing and prediction ability of the DEPs for metastatic and non-metastatic OSA was further confirmed using the principal component analysis (PCA) method and 3 gene expression data sets (GSE9508, GSE49003 and GSE66673) based on the support vector machines (SVM) model. A total of 616 downregulated and 681 upregulated genes were identified in the data set, GSE21257. The DEGs could not be used to distinguish metastatic OSA from non-metastatic OSA, as shown by PCA. Thus, an analysis of DEPs was further performed, resulting in 14 DEPs, such as NRAS signaling, Toll-like receptor (TLR) signaling, matrix metalloproteinase (MMP) regulation of cytokines and tumor necrosis factor receptor-associated factor (TRAF)-mediated interferon regulatory factor 7 (IRF7) activation. Cluster analysis indicated that these pathways could be used to distinguish metastatic OSA from non-metastatic OSA. The prediction accuracy was 91, 66.7 and 87.5% for the data sets, GSE9508, GSE49003 and GSE66673, respectively. The results of PCA further validated that the DEPs could be used to distinguish metastatic OSA from non-metastatic OSA. On the whole, several DEPs were identified in metastatic OSA compared with non-metastatic OSA. Further studies on these pathways and relevant genes may help to enhance our understanding of the molecular mechanisms underlying metastasis and may thus aid in

  15. Antimicrobial Protein Candidates from the Thermophilic Geobacillus sp. Strain ZGt-1: Production, Proteomics, and Bioinformatics Analysis.

    PubMed

    Alkhalili, Rawana N; Bernfur, Katja; Dishisha, Tarek; Mamo, Gashaw; Schelin, Jenny; Canbäck, Björn; Emanuelsson, Cecilia; Hatti-Kaul, Rajni

    2016-01-01

    A thermophilic bacterial strain, Geobacillus sp. ZGt-1, isolated from Zara hot spring in Jordan, was capable of inhibiting the growth of the thermophilic G. stearothermophilus and the mesophilic Bacillus subtilis and Salmonella typhimurium on a solid cultivation medium. Antibacterial activity was not observed when ZGt-1 was cultivated in a liquid medium; however, immobilization of the cells in agar beads that were subjected to sequential batch cultivation in the liquid medium at 60 °C showed increasing antibacterial activity up to 14 cycles. The antibacterial activity was lost on protease treatment of the culture supernatant. Concentration of the protein fraction by ammonium sulphate precipitation followed by denaturing polyacrylamide gel electrophoresis separation and analysis of the gel for antibacterial activity against G. stearothermophilus showed a distinct inhibition zone in the 15-20 kDa range, suggesting that the active molecule(s) are resistant to denaturation by SDS. Mass spectrometric analysis of the protein bands around the active region resulted in identification of 22 proteins with molecular weight in the range of interest, three of which were new and are here proposed as potential antimicrobial protein candidates by in silico analysis of their amino acid sequences. Mass spectrometric analysis also indicated the presence of partial sequences of the antimicrobial enzymes amidase and DD-carboxypeptidase. PMID:27548162

  16. Antimicrobial Protein Candidates from the Thermophilic Geobacillus sp. Strain ZGt-1: Production, Proteomics, and Bioinformatics Analysis

    PubMed Central

    Alkhalili, Rawana N.; Bernfur, Katja; Dishisha, Tarek; Mamo, Gashaw; Schelin, Jenny; Canbäck, Björn; Emanuelsson, Cecilia; Hatti-Kaul, Rajni

    2016-01-01

    A thermophilic bacterial strain, Geobacillus sp. ZGt-1, isolated from Zara hot spring in Jordan, was capable of inhibiting the growth of the thermophilic G. stearothermophilus and the mesophilic Bacillus subtilis and Salmonella typhimurium on a solid cultivation medium. Antibacterial activity was not observed when ZGt-1 was cultivated in a liquid medium; however, immobilization of the cells in agar beads that were subjected to sequential batch cultivation in the liquid medium at 60 °C showed increasing antibacterial activity up to 14 cycles. The antibacterial activity was lost on protease treatment of the culture supernatant. Concentration of the protein fraction by ammonium sulphate precipitation followed by denaturing polyacrylamide gel electrophoresis separation and analysis of the gel for antibacterial activity against G. stearothermophilus showed a distinct inhibition zone in the 15–20 kDa range, suggesting that the active molecule(s) are resistant to denaturation by SDS. Mass spectrometric analysis of the protein bands around the active region resulted in identification of 22 proteins with molecular weight in the range of interest, three of which were new and are here proposed as potential antimicrobial protein candidates by in silico analysis of their amino acid sequences. Mass spectrometric analysis also indicated the presence of partial sequences of the antimicrobial enzymes amidase and DD-carboxypeptidase. PMID:27548162

  17. Bioinformatics analysis of differentially expressed pathways related to the metastatic characteristics of osteosarcoma.

    PubMed

    Sun, Wei; Ma, Xiaojun; Shen, Jiakang; Yin, Fei; Wang, Chongren; Cai, Zhengdong

    2016-08-01

    In this study, gene expression data of osteosarcoma (OSA) were analyzed to identify metastasis-related biological pathways. Four gene expression data sets (GSE21257, GSE9508, GSE49003 and GSE66673) were downloaded from Gene Expression Omnibus (GEO). An analysis of differentially expressed genes (DEGs) was performed using the Significance Analysis of Microarray (SAM) method. Gene expression levels were converted into scores of pathways by the Functional Analysis of Individual Microarray Expression (FAIME) algorithm and the differentially expressed pathways (DEPs) were then disclosed by a t-test. The distinguishing and prediction ability of the DEPs for metastatic and non-metastatic OSA was further confirmed using the principal component analysis (PCA) method and 3 gene expression data sets (GSE9508, GSE49003 and GSE66673) based on the support vector machines (SVM) model. A total of 616 downregulated and 681 upregulated genes were identified in the data set, GSE21257. The DEGs could not be used to distinguish metastatic OSA from non-metastatic OSA, as shown by PCA. Thus, an analysis of DEPs was further performed, resulting in 14 DEPs, such as NRAS signaling, Toll-like receptor (TLR) signaling, matrix metalloproteinase (MMP) regulation of cytokines and tumor necrosis factor receptor-associated factor (TRAF)-mediated interferon regulatory factor 7 (IRF7) activation. Cluster analysis indicated that these pathways could be used to distinguish metastatic OSA from non-metastatic OSA. The prediction accuracy was 91, 66.7 and 87.5% for the data sets, GSE9508, GSE49003 and GSE66673, respectively. The results of PCA further validated that the DEPs could be used to distinguish metastatic OSA from non-metastatic OSA. On the whole, several DEPs were identified in metastatic OSA compared with non-metastatic OSA. Further studies on these pathways and relevant genes may help to enhance our understanding of the molecular mechanisms underlying metastasis

  18. FASTAptamer: A Bioinformatic Toolkit for High-throughput Sequence Analysis of Combinatorial Selections

    PubMed Central

    Alam, Khalid K; Chang, Jonathan L; Burke, Donald H

    2015-01-01

    High-throughput sequence (HTS) analysis of combinatorial selection populations accelerates lead discovery and optimization and offers dynamic insight into selection processes. An underlying principle is that selection enriches high-fitness sequences as a fraction of the population, whereas low-fitness sequences are depleted. HTS analysis readily provides the requisite numerical information by tracking the evolutionary trajectory of individual sequences in response to selection pressures. Unlike genomic data, for which a number of software solutions exist, user-friendly tools are not readily available for the combinatorial selections field, leading many users to create custom software. FASTAptamer was designed to address the sequence-level analysis needs of the field. The open source FASTAptamer toolkit counts, normalizes and ranks read counts in a FASTQ file, compares populations for sequence distribution, generates clusters of sequence families, calculates fold-enrichment of sequences throughout the course of a selection and searches for degenerate sequence motifs. While originally designed for aptamer selections, FASTAptamer can be applied to any selection strategy that can utilize next-generation DNA sequencing, such as ribozyme or deoxyribozyme selections, in vivo mutagenesis and various surface display technologies (peptide, antibody fragment, mRNA, etc.). FASTAptamer software, sample data and a user's guide are available for download at http://burkelab.missouri.edu/fastaptamer.html. PMID:25734917
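    The counting, normalization, ranking and fold-enrichment steps described above can be illustrated with a short sketch. This is not FASTAptamer's own implementation: the function names are hypothetical, reads-per-million is assumed as the normalization, and the two toy populations are invented for illustration.

```python
from collections import Counter


def count_reads(fastq_lines):
    """Count identical sequences; in 4-line FASTQ records the sequence is line 2."""
    return Counter(
        line.strip() for i, line in enumerate(fastq_lines) if i % 4 == 1
    )


def reads_per_million(counts):
    """Normalize raw counts to reads per million (assumed normalization)."""
    total = sum(counts.values())
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


def fold_enrichment(rpm_before, rpm_after):
    """Ratio of normalized abundance after vs. before a selection round."""
    return {
        seq: rpm_after[seq] / rpm_before[seq]
        for seq in rpm_after
        if seq in rpm_before
    }


def toy_fastq(sequences):
    """Build minimal FASTQ lines for invented example reads."""
    lines = []
    for i, seq in enumerate(sequences):
        lines += [f"@read{i}", seq, "+", "I" * len(seq)]
    return lines


# Two invented populations: an early and a later selection round.
early = count_reads(toy_fastq(["ACGT", "TTTT", "TTTT", "TTTT"]))
late = count_reads(toy_fastq(["ACGT", "ACGT", "ACGT", "TTTT"]))

ranked_late = late.most_common()  # sequences ranked by read count
enrichment = fold_enrichment(reads_per_million(early), reads_per_million(late))
print(ranked_late)
print(enrichment)
```

    Sequences absent from the earlier round are skipped here to avoid division by zero; a real tool has to decide how to report them.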

  19. Toward the Replacement of Animal Experiments through the Bioinformatics-driven Analysis of 'Omics' Data from Human Cell Cultures.

    PubMed

    Grafström, Roland C; Nymark, Penny; Hongisto, Vesa; Spjuth, Ola; Ceder, Rebecca; Willighagen, Egon; Hardy, Barry; Kaski, Samuel; Kohonen, Pekka

    2015-11-01

    This paper outlines the work for which Roland Grafström and Pekka Kohonen were awarded the 2014 Lush Science Prize. The research activities of the Grafström laboratory have, for many years, covered cancer biology studies, as well as the development and application of toxicity-predictive in vitro models to determine chemical safety. Through the integration of in silico analyses of diverse types of genomics data (transcriptomic and proteomic), their efforts have proved to fit well into the recently-developed Adverse Outcome Pathway paradigm. Genomics analysis within state-of-the-art cancer biology research and Toxicology in the 21st Century concepts share many technological tools. A key category within the Three Rs paradigm is the Replacement of animals in toxicity testing with alternative methods, such as bioinformatics-driven analyses of data obtained from human cell cultures exposed to diverse toxicants. This work was recently expanded within the pan-European SEURAT-1 project (Safety Evaluation Ultimately Replacing Animal Testing), to replace repeat-dose toxicity testing with data-rich analyses of sophisticated cell culture models. The aims and objectives of the SEURAT project have been to guide the application, analysis, interpretation and storage of 'omics' technology-derived data within the service-oriented sub-project, ToxBank. Particularly addressing the Lush Science Prize focus on the relevance of toxicity pathways, a 'data warehouse' that is under continuous expansion, coupled with the development of novel data storage and management methods for toxicology, serve to address data integration across multiple 'omics' technologies. The prize winners' guiding principles and concepts for modern knowledge management of toxicological data are summarised. The translation of basic discovery results ranged from chemical-testing and material-testing data, to information relevant to human health and environmental safety. PMID:26551289

  1. Characterization of microRNAs in Taenia saginata of zoonotic significance by Solexa deep sequencing and bioinformatics analysis.

    PubMed

    Ai, L; Xu, M J; Chen, M X; Zhang, Y N; Chen, S H; Guo, J; Cai, Y C; Zhou, X N; Zhu, X Q; Chen, J X

    2012-06-01

    The beef tapeworm Taenia saginata infects human beings with symptoms ranging from nausea and abdominal discomfort to digestive disturbances and intestinal blockage. In the present study, the microRNA (miRNA) expression profile in adult T. saginata was analyzed using Solexa deep sequencing and bioinformatics analysis. A total of 15.8 million reads were obtained by Solexa sequencing, and 13.3 million clean reads (1.73 million unique sequences) were obtained after removing reads smaller than 18 nt. Ten conserved miRNAs corresponding to 607,382 reads were found when matching the reads against known miRNAs of Schistosoma japonicum in the miRBase database. miR-71 had the most abundant expression in T. saginata, followed by miR-219-5p, but some other common miRNAs such as let-7, miR-40, and miR-103 were not identified in T. saginata. Nucleotide bias analysis found that the known miRNAs showed high bias and that uracil was the dominant nucleotide, particularly at the first and 11th positions, which were almost at the beginning and middle of conserved miRNAs. One novel miRNA (Tsa-miR-001) corresponding to ten precursors was identified and confirmed by stem-loop RT-PCR. To our knowledge, this is the first report of miRNA profiles in T. saginata, which will contribute to better understanding of the complex biology of this zoonotic cestode. The reported data of T. saginata miRNAs should provide valuable references for miRNA studies of closely related zoonotic Taenia cestodes such as Taenia solium and Taenia asiatica.
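    Two steps in the abstract above lend themselves to a small sketch: discarding reads shorter than 18 nt and tabulating which nucleotide dominates at a given position. The code below is an illustration under stated assumptions, not the pipeline used in the study; the function names are hypothetical and the example sequences are invented.

```python
from collections import Counter

MIN_LENGTH = 18  # reads smaller than 18 nt are removed, as in the abstract


def clean_reads(reads, min_length=MIN_LENGTH):
    """Keep only reads that are at least min_length nucleotides long."""
    return [read for read in reads if len(read) >= min_length]


def position_frequencies(sequences, position):
    """Return nucleotide frequencies at a 1-based position.

    Sequences shorter than the requested position are skipped.
    """
    counts = Counter(
        seq[position - 1] for seq in sequences if len(seq) >= position
    )
    total = sum(counts.values())
    return {nt: n / total for nt, n in counts.items()} if total else {}


# Invented reads for illustration only.
toy_reads = [
    "UGAAAGACGAUGGUAGUGAGA",  # 21 nt, kept
    "UCCCUGAGACCCUAACUUGUGA",  # 22 nt, kept
    "GACUAGCUAGCUAGCUAGC",  # 19 nt, kept
    "UGAGGUAGUAGG",  # 12 nt, removed
]

kept = clean_reads(toy_reads)
first_position = position_frequencies(kept, 1)
print(len(kept), first_position)
```

    Calling `position_frequencies(kept, 11)` in the same way gives the distribution at the 11th position, the other site the abstract highlights.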

  2. Potential hippocampal genes and pathways involved in Alzheimer's disease: a bioinformatic analysis.

    PubMed

    Zhang, L; Guo, X Q; Chu, J F; Zhang, X; Yan, Z R; Li, Y Z

    2015-06-29

    Alzheimer's disease (AD) is a neurodegenerative disorder and the most common cause of dementia in elderly people. Numerous studies have focused on the dysregulated genes in AD, but the pathogenesis is still unknown. In this study, we explored critical hippocampal genes and pathways that might potentially be involved in the pathogenesis of AD. Four transcriptome datasets for the hippocampus of patients with AD were downloaded from ArrayExpress, and the gene signature was identified by integrated analysis of multiple transcriptomes using novel genome-wide relative significance and genome-wide global significance models. A protein-protein interaction network was constructed, and five clusters were selected. The biological functions and pathways were identified by Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. A total of 6994 genes were screened, and the top 300 genes were subjected to further analysis. Four significant KEGG pathways were identified, including oxidative phosphorylation and Parkinson's disease, Huntington's disease, and Alzheimer's disease pathways. The hub network of cluster 1 with the highest average rank value was defined. The genes (NDUFB3, NDUFA9, NDUFV1, NDUFV2, NDUFS3, NDUFA10, COX7B, and UQCR1) were considered critical with high degree in cluster 1 as well as being shared by the four significant pathways. The oxidative phosphorylation process was also involved in the other three pathways and is considered to be relevant to energy-related AD pathology in the hippocampus. This research provides a perspective from which to explore critical genes and pathways for potential AD therapies.

  3. MITOMASTER: a bioinformatics tool for the analysis of mitochondrial DNA sequences.

    PubMed

    Brandon, Marty C; Ruiz-Pesini, Eduardo; Mishmar, Dan; Procaccio, Vincent; Lott, Marie T; Nguyen, Kevin Cuong; Spolim, Syawal; Patil, Upen; Baldi, Pierre; Wallace, Douglas C

    2009-01-01

    We have developed a computer system, MITOMASTER, to make analysis of human mitochondrial DNA (mtDNA) sequences efficient, accurate, and easily available. From imported sequences, the system identifies nucleotide variants, determines the haplogroup, rules out possible pseudogene contamination, identifies novel DNA sequence variants, and evaluates the potential biological significance of each variant. This system should be beneficial for mtDNA analyses by biomedical physicians and investigators, population biologists and forensic scientists. MITOMASTER can be accessed at http://mammag.web.uci.edu/twiki/bin/view/Mitomaster.

  4. [Genome-wide identification and bioinformatic analysis of PPR gene family in tomato].

    PubMed

    Ding, Anming; Li, Ling; Qu, Xu; Sun, Tingting; Chen, Yaqiong; Zong, Peng; Li, Zunqiang; Gong, Daping; Sun, Yuhe

    2014-01-01

    Pentatricopeptide repeat (PPR) genes constitute one of the largest gene families in plants and play a broad and essential role in plant growth and development. In this study, the protein sequences annotated by the tomato (S. lycopersicum L.) genome project were screened with the Pfam PPR sequences. A total of 471 putative PPR-encoding genes were identified. Based on the motifs defined in A. thaliana L., protein structure and conserved sequences for each tomato motif were analyzed. We also analyzed the phylogenetic relationships, subcellular localization, expression and GO annotation of the identified gene sequences. Our results demonstrate that the tomato PPR gene family contains two subfamilies, P and PLS, each accounting for half of the family. The PLS subfamily can be divided into four subclasses, i.e., PLS, E, E+ and DYW. Each subclass of sequences forms a clade in the phylogenetic tree. The PPR motifs were found to be highly conserved among plants. The tomato PPR genes were distributed over 12 chromosomes and most of them lack introns. The majority of PPR proteins harbor mitochondrial or chloroplast localization sequences, whereas GO analysis showed that most PPR proteins participate in RNA-related biological processes.

  5. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15

    PubMed Central

    Wang, Jinlan; Chang, Fen

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 complete TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1 with the broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintain the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein–protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helixes. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function but require further experimental validation. PMID:27257554

  6. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15.

    PubMed

    Wang, Jinlan; Zhang, Zheng; Chang, Fen; Yin, Deling

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 completed TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1 with the broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintain the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein-protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function but require further experimental validation. PMID:27257554

  8. A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1

    PubMed Central

    Reisman, Steven; Hatzopoulos, Thomas; Läufer, Konstantin; Thiruvathukal, George K.; Putonti, Catherine

    2016-01-01

    As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences revealing spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated even despite increased globalization. The approach developed here can easily be customized for any species of interest. PMID:26819543

  9. A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1.

    PubMed

    Reisman, Steven; Hatzopoulos, Thomas; Läufer, Konstantin; Thiruvathukal, George K; Putonti, Catherine

    2016-01-01

    As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences revealing spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated even despite increased globalization. The approach developed here can easily be customized for any species of interest. PMID:26819543

  10. [Gene cloning and bioinformatics analysis of new gene for chlorogenic acid biosynthesis of Lonicera hypoglauca].

    PubMed

    Yu, Shu-lin; Huang, Lu-qi; Yuan, Yuan; Qi, Lin-jie; Liu, Da-hui

    2015-03-01

    To obtain the key genes for chlorogenic acid biosynthesis of Lonicera hypoglauca, four new genes were obtained from our dataset of L. hypoglauca. We also predicted the structure and function of the LHPAL4, LHHCT1, LHHCT2 and LHHCT3 proteins. The phylogenetic tree showed that LHPAL4 was closely related to LHPAL1 and LHHCT1 was closely related to LHHCT3, while LHHCT2 clustered into a single group. Using real-time PCR to measure gene expression levels in different organs of L. hypoglauca, we found that the transcript levels of LHPAL4, LHHCT1 and LHHCT3 were highest in defeat flowers, and the transcript level of LHHCT2 was highest in leaves. These results provide a basis for further analysis of the mechanism of active ingredients in different organs, as well as elements for in vitro biosynthesis of active ingredients. PMID:26087546

  12. Identification of key pathways and genes in colorectal cancer using bioinformatics analysis.

    PubMed

    Liang, Bin; Li, Chunning; Zhao, Jianying

    2016-10-01

    Colorectal cancer (CRC) is the most common malignant tumor of the digestive system. The aim of this study was to identify gene signatures during CRC and uncover their potential mechanisms. The gene expression profiles of GSE21815 were downloaded from the GEO database. The GSE21815 dataset contained 141 samples, including 132 CRC and 9 normal colon epitheliums. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed, and a protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed with Cytoscape software. In total, 3500 DEGs were identified in CRC, including 1370 up-regulated genes and 2130 down-regulated genes. GO analysis results showed that up-regulated DEGs were significantly enriched in biological processes (BP), including cell cycle, cell division, and cell proliferation; the down-regulated DEGs were significantly enriched in biological processes, including immune response, intracellular signaling cascade and defense response. KEGG pathway analysis showed the up-regulated DEGs were enriched in cell cycle and DNA replication, while the down-regulated DEGs were enriched in drug metabolism, metabolism of xenobiotics by cytochrome P450, and retinol metabolism pathways. The top 10 hub genes, GNG2, AGT, SAA1, ADCY5, LPAR1, NMU, IL8, CXCL12, GNAI1, and CCR2, were identified from the PPI network, and sub-networks revealed these genes were involved in significant pathways, including G protein-coupled receptors signaling pathway, gastrin-CREB signaling pathway via PKC and MAPK, and extracellular matrix organization. In conclusion, the present study indicated that the identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of CRC, and might be used as molecular targets and diagnostic biomarkers for the treatment of CRC. PMID:27581154

  13. Biggest challenges in bioinformatics

    PubMed Central

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics’ in a ‘World Café’ style event. PMID:23492829

  14. Computational Systems Bioinformatics and Bioimaging for Pathway Analysis and Drug Screening

    PubMed Central

    Zhou, Xiaobo; Wong, Stephen T. C.

    2009-01-01

    The premise of today’s drug development is that the mechanism of a disease is highly dependent upon underlying signaling and cellular pathways. Such pathways are often composed of complexes of physically interacting genes, proteins, or biochemical activities coordinated by metabolic intermediates, ions, and other small solutes and are investigated with molecular biology approaches in genomics, proteomics, and metabonomics. Nevertheless, the recent declines in the pharmaceutical industry’s revenues indicate such approaches alone may not be adequate in creating successful new drugs. Our observation is that combining methods of genomics, proteomics, and metabonomics with techniques of bioimaging will systematically provide powerful means to decode or better understand molecular interactions and pathways that lead to disease and potentially generate new insights and indications for drug targets. The former methods provide the profiles of genes, proteins, and metabolites, whereas the latter techniques generate objective, quantitative phenotypes correlating to the molecular profiles and interactions. In this paper, we describe pathway reconstruction and target validation based on the proposed systems biologic approach and show selected application examples for pathway analysis and drug screening. PMID:20011613

  15. A Comparative Structural Bioinformatics Analysis of the Insulin Receptor Family Ectodomain Based on Phylogenetic Information

    PubMed Central

    Rentería, Miguel E.; Gandhi, Neha S.; Vinuesa, Pablo; Helmerhorst, Erik; Mancera, Ricardo L.

    2008-01-01

    The insulin receptor (IR), the insulin-like growth factor 1 receptor (IGF1R) and the insulin receptor-related receptor (IRR) are covalently-linked homodimers made up of several structural domains. The molecular mechanism of ligand binding to the ectodomain of these receptors and the resulting activation of their tyrosine kinase domain is still not well understood. We have carried out an amino acid residue conservation analysis in order to reconstruct the phylogeny of the IR Family. We have confirmed the location of ligand binding site 1 of the IGF1R and IR. Importantly, we have also predicted the likely location of the insulin binding site 2 on the surface of the fibronectin type III domains of the IR. An evolutionarily conserved surface on the second leucine-rich domain that may interact with the ligand could not be detected. We suggest a possible mechanical trigger of the activation of the IR that involves a slight ‘twist’ rotation of the last two fibronectin type III domains in order to face the likely location of insulin. Finally, a strong selective pressure was found amongst the IRR orthologous sequences, suggesting that this orphan receptor has a yet unknown physiological role which may be conserved from amphibians to mammals. PMID:18989367

  16. Bioinformatics analysis of biomarkers and transcriptional factor motifs in Down syndrome.

    PubMed

    Kong, X D; Liu, N; Xu, X J

    2014-10-01

    In this study, biomarkers and transcriptional factor motifs were identified in order to investigate the etiology and phenotypic severity of Down syndrome. GSE 1281, GSE 1611, and GSE 5390 were downloaded from the Gene Expression Omnibus (GEO). A robust multiarray analysis (RMA) algorithm was applied to detect differentially expressed genes (DEGs). In order to screen for biological pathways and to interrogate the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, the Database for Annotation, Visualization, and Integrated Discovery (DAVID) was used to carry out a gene ontology (GO) function enrichment for DEGs. Finally, a transcriptional regulatory network was constructed, and a hypergeometric distribution test was applied to select for significantly enriched transcriptional factor motifs. CBR1, DYRK1A, HMGN1, ITSN1, RCAN1, SON, TMEM50B, and TTC3 were each up-regulated two-fold in Down syndrome samples compared to normal samples; of these, SON and TTC3 were newly reported. CBR1, DYRK1A, HMGN1, ITSN1, RCAN1, SON, TMEM50B, and TTC3 were located on human chromosome 21 (mouse chromosome 16). The DEGs were significantly enriched in macromolecular complex subunit organization and focal adhesion pathways. Eleven significantly enriched transcription factor motifs (PAX5, EGR1, XBP1, SREBP1, OLF1, MZF1, NFY, NFKAPPAB, MYCMAX, NFE2, and RP58) were identified. The DEGs and transcription factor motifs identified in our study provide biomarkers for the understanding of Down syndrome pathogenesis and progression. PMID:25118625

  17. Genomic and bioinformatic analysis of NADPH-cytochrome P450 reductase in Anopheles stephensi (Diptera: Culicidae).

    PubMed

    Suwanchaichinda, C; Brattsten, L B

    2014-01-01

    The cytochrome P450 monooxygenase (P450) enzyme system is a major mechanism of xenobiotic biotransformation. The nicotinamide adenine dinucleotide phosphate (NADPH)-cytochrome P450 reductase (CPR) is required for transfer of electrons from NADPH to P450. One CPR gene was identified in the genome of the malaria-transmitting mosquito Anopheles stephensi Liston (Diptera: Culicidae). The gene encodes a polypeptide containing highly conserved flavin mononucleotide-, flavin adenine dinucleotide-, and NADPH-binding domains, a unique characteristic of the reductase. Phylogenetic analysis revealed that the A. stephensi and other known mosquito CPRs belong to a monophyletic group distinctly separated from other insects in the same order, Diptera. Amino acid residues of CPRs involved in binding of P450 and cytochrome c are conserved between A. stephensi and the Norway rat Rattus norvegicus Berkenhout (Rodentia: Muridae). However, gene structure particularly within the coding region is evidently different between the two organisms. Such difference might arise during the evolution process as also seen in the difference of P450 families and isoforms found in these organisms. CPR in the mosquito A. stephensi is expected to be active and serve as an essential component of the P450 system.

  18. Identification of Immunoreactive Leishmania infantum Protein Antigens to Asymptomatic Dog Sera through Combined Immunoproteomics and Bioinformatics Analysis.

    PubMed

    Agallou, Maria; Athanasiou, Evita; Samiotaki, Martina; Panayotou, George; Karagouni, Evdokia

    2016-01-01

    Leishmania infantum is the etiologic agent of zoonotic visceral leishmaniasis (VL) in countries in the Mediterranean basin, where dogs are the domestic reservoirs and represent important elements in the transmission of the disease. Since the major focal areas of human VL exhibit a high prevalence of seropositive dogs, the control of canine VL could reduce the infection rate in humans. Efforts toward this have focused on the improvement of diagnostic tools, as well as on vaccine development. The identification of parasite antigens including suitable major histocompatibility complex (MHC) class I- and/or II-restricted epitopes is very important since disease protection is characterized by strong and long-lasting CD8+ T and CD4+ Th1 cell-dominated immunity. In the present study, total protein extract from late-log phase L. infantum promastigotes was analyzed by two-dimensional western blots and probed with sera from asymptomatic and symptomatic dogs. A total of 42 protein spots were found to differentially react with IgG from asymptomatic dogs, while 17 of these identified by Coomassie stain were extracted and analyzed. Of these, 21 proteins were identified by mass spectrometry; they were mainly involved in metabolism and stress responses. An in silico analysis predicted that the chaperonin HSP60, dihydrolipoamide dehydrogenase, enolase, cyclophilin 2, cyclophilin 40, and one hypothetical protein contain promiscuous MHCI and/or MHCII epitopes. Our results suggest that the combination of immunoproteomics and bioinformatics analyses is a promising method for the identification of novel candidate antigens for vaccine development or with potential use in the development of sensitive diagnostic tests. PMID:26906226

  20. Identification of Immunoreactive Leishmania infantum Protein Antigens to Asymptomatic Dog Sera through Combined Immunoproteomics and Bioinformatics Analysis

    PubMed Central

    Samiotaki, Martina; Panayotou, George; Karagouni, Evdokia

    2016-01-01

    Leishmania infantum is the etiologic agent of zoonotic visceral leishmaniasis (VL) in countries in the Mediterranean basin, where dogs are the domestic reservoirs and represent important elements in the transmission of the disease. Since the major focal areas of human VL exhibit a high prevalence of seropositive dogs, the control of canine VL could reduce the infection rate in humans. Efforts toward this have focused on the improvement of diagnostic tools, as well as on vaccine development. The identification of parasite antigens including suitable major histocompatibility complex (MHC) class I- and/or II-restricted epitopes is very important since disease protection is characterized by strong and long-lasting CD8+ T and CD4+ Th1 cell-dominated immunity. In the present study, total protein extract from late-log phase L. infantum promastigotes was analyzed by two-dimensional western blots and probed with sera from asymptomatic and symptomatic dogs. A total of 42 protein spots were found to differentially react with IgG from asymptomatic dogs, while 17 of these identified by Coomassie stain were extracted and analyzed. Of these, 21 proteins were identified by mass spectrometry; they were mainly involved in metabolism and stress responses. An in silico analysis predicted that the chaperonin HSP60, dihydrolipoamide dehydrogenase, enolase, cyclophilin 2, cyclophilin 40, and one hypothetical protein contain promiscuous MHCI and/or MHCII epitopes. Our results suggest that the combination of immunoproteomics and bioinformatics analyses is a promising method for the identification of novel candidate antigens for vaccine development or with potential use in the development of sensitive diagnostic tests. PMID:26906226

  1. Bioinformatic analysis of microRNA networks following the activation of the constitutive androstane receptor (CAR) in mouse liver.

    PubMed

    Hao, Ruixin; Su, Shengzhong; Wan, Yinan; Shen, Frank; Niu, Ben; Coslo, Denise M; Albert, Istvan; Han, Xing; Omiecinski, Curtis J

    2016-09-01

    The constitutive androstane receptor (CAR; NR1I3) is a member of the nuclear receptor superfamily that functions as a xenosensor, serving to regulate xenobiotic detoxification, lipid homeostasis and energy metabolism. CAR activation is also a key contributor to the development of chemical hepatocarcinogenesis in mice. The underlying pathways affected by CAR in these processes are complex and not fully elucidated. MicroRNAs (miRNAs) have emerged as critical modulators of gene expression and appear to impact many cellular pathways, including those involved in chemical detoxification and liver tumor development. In this study, we used deep sequencing approaches with an Illumina HiSeq platform to differentially profile microRNA expression patterns in livers from wild type C57BL/6J mice following CAR activation with the mouse CAR-specific ligand activator, 1,4-bis-[2-(3,5,-dichloropyridyloxy)] benzene (TCPOBOP). Bioinformatic analyses and pathway evaluations were performed leading to the identification of 51 miRNAs whose expression levels were significantly altered by TCPOBOP treatment, including mmu-miR-802-5p and miR-485-3p. Ingenuity Pathway Analysis of the differentially expressed microRNAs revealed altered effector pathways, including those involved in liver cell growth and proliferation. A functional network among CAR targeted genes and the affected microRNAs was constructed to illustrate how CAR modulation of microRNA expression may potentially mediate its biological role in mouse hepatocyte proliferation. This article is part of a Special Issue entitled: Xenobiotic nuclear receptors: New Tricks for An Old Dog, edited by Dr. Wen Xie. PMID:27080131

  2. The Alcohol Dehydrogenase Gene Family in Melon (Cucumis melo L.): Bioinformatic Analysis and Expression Patterns

    PubMed Central

    Jin, Yazhong; Zhang, Chong; Liu, Wei; Tang, Yufan; Qi, Hongyan; Chen, Hao; Cao, Songxiao

    2016-01-01

    Alcohol dehydrogenases (ADH), encoded by a multigene family in plants, play a critical role in plant growth, development, adaptation, fruit ripening and aroma production. Thirteen ADH genes were identified in the melon genome, including 12 ADHs and one formaldehyde dehydrogenase (FDH), designated CmADH1-12 and CmFDH1, in which CmADH1 and CmADH2 have been isolated in Cantaloupe. ADH genes shared a lower identity with each other at the protein level and had different intron-exon structure at nucleotide level. No typical signal peptides were found in all CmADHs, and CmADH proteins might locate in the cytoplasm. The phylogenetic tree revealed that 13 ADH genes were divided into three groups respectively, namely long-, medium-, and short-chain ADH subfamily, and CmADH1,3-11, which belongs to the medium-chain ADH subfamily, fell into six medium-chain ADH subgroups. CmADH12 may belong to the long-chain ADH subfamily, while CmFDH1 may be a Class III ADH and serve as an ancestral ADH in melon. Expression profiling revealed that CmADH1, CmADH2, CmADH10 and CmFDH1 were moderately or strongly expressed in different vegetative tissues and fruit at medium and late developmental stages, while CmADH8 and CmADH12 were highly expressed in fruit after 20 days. CmADH3 showed preferential expression in young tissues. CmADH4 only had slight expression in root. Promoter analysis revealed several motifs of CmADH genes involved in the gene expression modulated by various hormones, and the response pattern of CmADH genes to ABA, IAA and ethylene were different. These CmADHs were divided into ethylene-sensitive and -insensitive groups, and the functions of CmADHs were discussed. PMID:27242871

  3. Transcriptome Bioinformatical Analysis of Vertebrate Stages of Schistosoma japonicum Reveals Alternative Splicing Events

    PubMed Central

    Wang, Xinye; Xu, Xindong; Lu, Xingyu; Zhang, Yuanbin; Pan, Weiqing

    2015-01-01

    Alternative splicing is a molecular process that contributes greatly to the diversification of the proteome and to gene functions. Understanding the mechanisms of stage-specific alternative splicing can provide a better understanding of the development of eukaryotes and the functions of different genes. Schistosoma japonicum is an infectious blood-dwelling trematode with a complex lifecycle that causes the tropical disease schistosomiasis. In this study, we analyzed the transcriptome of Schistosoma japonicum to discover alternative splicing events in this parasite, by applying RNA-seq to cDNA libraries of adults and schistosomula. Results were validated by RT-PCR and sequencing. We found 11,623 alternative splicing events among 7,099 protein encoding genes and average proportion of alternative splicing events per gene was 42.14%. We showed that exon skipping is the most common type of alternative splicing event, as found in higher eukaryotes, whereas intron retention is the least common alternative splicing type. According to intron boundary analysis, the parasite possesses the same intron boundaries as other organisms, namely the classic “GT-AG” rule. In alternatively spliced introns or exons, this rule is less strict. We have also attempted to detect alternative splicing events in genes encoding proteins with signal peptides and transmembrane helices, suggesting that alternative splicing could change subcellular locations of specific gene products. Our results indicate that alternative splicing is prevalent in this parasitic worm, and that the worm is close to its hosts. The revealed secretome involved in alternative splicing implies a new perspective for understanding the interaction between the parasite and its host. PMID:26407301

  4. The Alcohol Dehydrogenase Gene Family in Melon (Cucumis melo L.): Bioinformatic Analysis and Expression Patterns.

    PubMed

    Jin, Yazhong; Zhang, Chong; Liu, Wei; Tang, Yufan; Qi, Hongyan; Chen, Hao; Cao, Songxiao

    2016-01-01

    Alcohol dehydrogenases (ADH), encoded by a multigene family in plants, play a critical role in plant growth, development, adaptation, fruit ripening and aroma production. Thirteen ADH genes were identified in the melon genome, including 12 ADHs and one formaldehyde dehydrogenase (FDH), designated CmADH1-12 and CmFDH1, in which CmADH1 and CmADH2 have been isolated in Cantaloupe. ADH genes shared a lower identity with each other at the protein level and had different intron-exon structure at nucleotide level. No typical signal peptides were found in all CmADHs, and CmADH proteins might locate in the cytoplasm. The phylogenetic tree revealed that 13 ADH genes were divided into three groups respectively, namely long-, medium-, and short-chain ADH subfamily, and CmADH1,3-11, which belongs to the medium-chain ADH subfamily, fell into six medium-chain ADH subgroups. CmADH12 may belong to the long-chain ADH subfamily, while CmFDH1 may be a Class III ADH and serve as an ancestral ADH in melon. Expression profiling revealed that CmADH1, CmADH2, CmADH10 and CmFDH1 were moderately or strongly expressed in different vegetative tissues and fruit at medium and late developmental stages, while CmADH8 and CmADH12 were highly expressed in fruit after 20 days. CmADH3 showed preferential expression in young tissues. CmADH4 only had slight expression in root. Promoter analysis revealed several motifs of CmADH genes involved in the gene expression modulated by various hormones, and the response pattern of CmADH genes to ABA, IAA and ethylene were different. These CmADHs were divided into ethylene-sensitive and -insensitive groups, and the functions of CmADHs were discussed. PMID:27242871

  5. Bioinformatic analysis of beta carbonic anhydrase sequences from protozoans and metazoans

    PubMed Central

    2014-01-01

    Background: Despite the high prevalence of parasitic infections, and their impact on global health and economy, the number of drugs available to treat them is extremely limited. As a result, the potential consequences of large-scale resistance to any existing drugs are a major concern. A number of recent investigations have focused on the effects of potential chemical inhibitors on bacterial and fungal carbonic anhydrases. Among the five classes of carbonic anhydrases (alpha, beta, gamma, delta and zeta), beta carbonic anhydrases have been reported in most species of bacteria, yeasts, algae, plants, and particular invertebrates (nematodes and insects). To date, there has been a lack of knowledge on the expression and molecular structure of beta carbonic anhydrases in metazoan (nematodes and arthropods) and protozoan species. Methods: Here, the identification of novel beta carbonic anhydrases was based on the presence of the highly-conserved amino acid sequence patterns of the active site. A phylogenetic tree was constructed based on codon-aligned DNA sequences. Subcellular localization prediction for each identified invertebrate beta carbonic anhydrase was performed using the TargetP webserver. Results: We verified a total of 75 beta carbonic anhydrase sequences in metazoan and protozoan species by proteome-wide searches and multiple sequence alignment. Of these, 52 were novel, and contained highly conserved amino acid residues, which are inferred to form the active site in beta carbonic anhydrases. Mitochondrial targeting peptide analysis revealed that 31 enzymes are predicted with mitochondrial localization; one was predicted to be a secretory enzyme, and the other 43 were predicted to have other undefined cellular localizations. Conclusions: These investigations identified 75 beta carbonic anhydrases in metazoan and protozoan species, and among them there were 52 novel sequences that were not previously annotated as beta carbonic anhydrases. Our results will not

  6. Bioinformatic evaluation of L-arginine catabolic pathways in 24 cyanobacteria and transcriptional analysis of genes encoding enzymes of L-arginine catabolism in the cyanobacterium Synechocystis sp. PCC 6803

    PubMed Central

    Schriek, Sarah; Rückert, Christian; Staiger, Dorothee; Pistorius, Elfriede K; Michel, Klaus-Peter

    2007-01-01

    Background So far very limited knowledge exists on L-arginine catabolism in cyanobacteria, although six major L-arginine-degrading pathways have been described for prokaryotes. Thus, we have performed a bioinformatic analysis of possible L-arginine-degrading pathways in cyanobacteria. Further, we chose Synechocystis sp. PCC 6803 for a more detailed bioinformatic analysis and for validation of the bioinformatic predictions on L-arginine catabolism with a transcript analysis. Results We have evaluated 24 cyanobacterial genomes of freshwater or marine strains for the presence of putative L-arginine-degrading enzymes. We identified an L-arginine decarboxylase pathway in all 24 strains. In addition, cyanobacteria have one or two further pathways representing either an arginase pathway, an L-arginine deiminase pathway, or an L-arginine oxidase/dehydrogenase pathway. An L-arginine amidinotransferase pathway as a major L-arginine-degrading pathway is not likely but cannot be entirely excluded. A rather unusual finding was that the cyanobacterial L-arginine deiminases are substantially larger than the enzymes in non-photosynthetic bacteria and that they are membrane-bound. A more detailed bioinformatic analysis of Synechocystis sp. PCC 6803 revealed that three different L-arginine-degrading pathways may in principle be functional in this cyanobacterium. These are (i) an L-arginine decarboxylase pathway, (ii) an L-arginine deiminase pathway, and (iii) an L-arginine oxidase/dehydrogenase pathway. A transcript analysis of cells grown either with nitrate or L-arginine as sole N-source and with an illumination of 50 μmol photons m⁻² s⁻¹ showed that the transcripts for the first enzyme(s) of all three pathways were present, but that the transcript levels for the L-arginine deiminase and the L-arginine oxidase/dehydrogenase were substantially higher than those of the three isoenzymes of L-arginine decarboxylase. Conclusion The evaluation of 24 cyanobacterial genomes revealed that

  7. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying biological sequences (DNA, RNA, and proteins) at the level of linear structure. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or more sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced in neither the data mining nor the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word “data-mining” is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis has introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
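
The two tasks named in this abstract, detecting repeated segments within one sequence and spotting common segments between sequences, can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and not taken from the chapter; the approach shown (a k-mer position table and a dynamic-programming longest common substring) is a simple stand-in, whereas practical tools typically rely on index structures such as suffix trees or suffix arrays.

```python
from collections import defaultdict


def repeated_kmers(seq: str, k: int) -> dict[str, list[int]]:
    """Return every length-k substring occurring more than once, with its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    # Keep only k-mers seen at two or more positions (intra-molecular repeats).
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def longest_common_substring(a: str, b: str) -> str:
    """Longest segment shared by a and b, via dynamic programming in O(len(a) * len(b)) time."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # prev[j]: length of common suffix of a[:i-1] and b[:j]
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(repeated_kmers("ACGTACGT", 4))
    print(longest_common_substring("GATTACA", "TTACGG"))
```

The quadratic dynamic program is adequate for short sequences; for genome-scale inputs an indexed approach would be the more realistic choice.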

  9. Bioinformatics Visualisation Tools: An Unbalanced Picture.

    PubMed

    Broască, Laura; Ancuşa, Versavia; Ciocârlie, Horia

    2016-01-01

    Visualization tools represent a key element in triggering human creativity while being supported with the analysis power of the machine. This paper analyzes free network visualization tools for bioinformatics, frames them in domain specific requirements and compares them. PMID:27577488

  11. Nano-LC-ESI MS/MS analysis of proteins in dried sea dragon Solenognathus hardwickii and bioinformatic analysis of its protein expression profiling.

    PubMed

    Zhang, Dong-Mei; Feng, Li-Xing; Li, Lu; Liu, Miao; Jiang, Bao-Hong; Yang, Min; Li, Guo-Qiang; Wu, Wan-Ying; Guo, De-An; Liu, Xuan

    2016-09-01

    The sea dragon Solenognathus hardwickii has long been used in traditional Chinese medicine for the treatment of various conditions, such as male impotence. To gain comprehensive insight into the protein components of the sea dragon, shotgun proteomic analysis of its protein expression profile was conducted in the present study. Proteins were extracted from dried sea dragon using a trichloroacetic acid/acetone precipitation method and then separated by SDS-PAGE. The protein bands were cut from the gel and digested with trypsin to generate a peptide mixture. The peptide fragments were then analyzed using nano liquid chromatography tandem mass spectrometry (nano-LC-ESI MS/MS). A total of 810 proteins and 1,577 peptides were identified in the dried sea dragon. The identified proteins exhibited molecular weight values ranging from 1,900 to 3,516,900 Da and pI values from 3.8 to 12.18. Bioinformatic analysis was conducted using the DAVID Bioinformatics Resources 6.7 Gene Ontology (GO) analysis tool to explore possible functions of the identified proteins. Ascribed functions of the proteins mainly included intracellular non-membrane-bound organelle, non-membrane-bounded organelle, cytoskeleton, structural molecule activity, and calcium ion binding, among others. Furthermore, possible signal networks of the identified proteins were predicted using the STRING (Search Tool for the Retrieval of Interacting Genes) database. Ribosomal protein synthesis was found to play an important role in the signal network. The results of this study, to the best of our knowledge, are the first to provide a reference proteome profile for the sea dragon, and should aid in the understanding of the expression and functions of the identified proteins. PMID:27667517

  12. Comprehensive human virus screening using high-throughput sequencing with a user-friendly representation of bioinformatics analysis: a pilot study.

    PubMed

    Petty, Tom J; Cordey, Samuel; Padioleau, Ismael; Docquier, Mylène; Turin, Lara; Preynat-Seauve, Olivier; Zdobnov, Evgeny M; Kaiser, Laurent

    2014-09-01

    High-throughput sequencing (HTS) provides the means to analyze clinical specimens in unprecedented molecular detail. While this technology has been successfully applied to virus discovery and other related areas of research, HTS methodology has yet to be exploited for use in a clinical setting for routine diagnostics. Here, a bioinformatics pipeline (ezVIR) was designed to process HTS data from any of the standard platforms and to evaluate the entire spectrum of known human viruses at once, providing results that are easy to interpret and customizable. The pipeline works by identifying the most likely viruses present in the specimen given the sequencing data. Additionally, ezVIR can generate optional reports for strain typing, can create genome coverage histograms, and can perform cross-contamination analysis for specimens prepared in series. In this pilot study, the pipeline was challenged using HTS data from 20 clinical specimens representative of those most often collected and analyzed in daily practice. The specimens (5 cerebrospinal fluid, 7 bronchoalveolar lavage fluid, 5 plasma, 2 serum, and 1 nasopharyngeal aspirate) were originally found to be positive for a diverse range of DNA or RNA viruses by routine molecular diagnostics. The ezVIR pipeline correctly identified 14 of 14 specimens containing viruses with genomes of <40,000 bp, and 4 of 6 specimens positive for large-genome viruses. Although further validation is needed to evaluate sensitivity and to define detection cutoffs, results obtained in this pilot study indicate that the overall detection success rate, coupled with the ease of interpreting the analysis reports, makes it worth considering using HTS for clinical diagnostics.

  13. Bioinformatics for Genome Analysis

    SciTech Connect

    Gary J. Olsen

    2005-06-30

    Nesbo, Boucher and Doolittle (2001) used phylogenetic trees of four taxa to assess whether euryarchaeal genes share a common history. They have suggested that of the 521 genes examined, each of the three possible tree topologies relating the four taxa was supported an essentially equal number of times. They suggest that this might be the result of numerous horizontal gene transfer events, essentially randomizing the relationships between gene histories (as inferred in the 521 gene trees) and organismal relationships (which would be a single underlying tree). Motivated by the fact that the order in which sequences are added to a multiple sequence alignment influences the alignment, and ultimately the inferred tree, they were interested in the extent to which the variations among inferred trees might be due to variations in the alignment order. This bears directly on their efforts to evaluate and improve upon methods of multiple sequence alignment. They set out to analyze the influence of alignment order on the tree inferred for 43 genes shared among these same 4 taxa. Because alignments produced by CLUSTALW are directed by a rooted guide tree (the dendrogram), there are 15 possible alignment orders of 4 taxa. For each gene they tested all 15 alignment orders and, as a 16th option, allowed CLUSTALW to generate its own guide tree. With all 15 possible rooted guide trees supplied, they expected that at least one of them should be as good as CLUSTAL's own guide tree, but most of the time they differed (sometimes being better than CLUSTAL's default tree and sometimes being worse). The difference seems to be that the user-supplied tree is not given meaningful branch lengths, which affect the assumed probability of amino acid changes. They examined the practicality of modifying CLUSTALW to improve its treatment of user-supplied guide trees. This work became increasingly bogged down in finding and repairing minor bugs in the CLUSTALW code. This effort was put on hold as we feel that our other proposed approaches will ultimately be better.
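
The figure of 15 possible rooted guide trees for 4 taxa follows from the standard count of rooted binary topologies on n labelled leaves, (2n - 3)!!, which gives 5 x 3 x 1 = 15 for n = 4. The Python sketch below checks this by explicit enumeration. The taxon labels A to D and the function names are placeholders, and the code is only an illustration of the counting argument, not the procedure used in the report.

```python
def insertions(tree, taxon):
    """Yield every rooted binary tree obtained by attaching `taxon` to one branch of `tree`."""
    yield (tree, taxon)  # attach on the branch above this (sub)tree
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insertions(left, taxon):
            yield (new_left, right)
        for new_right in insertions(right, taxon):
            yield (left, new_right)


def rooted_trees(taxa):
    """Enumerate all rooted binary topologies for a list of taxon labels."""
    if len(taxa) == 1:
        return [taxa[0]]
    trees = [(taxa[0], taxa[1])]
    for taxon in taxa[2:]:
        # Each existing tree with m leaves has 2m - 1 attachment points.
        trees = [new for tree in trees for new in insertions(tree, taxon)]
    return trees


def rooted_tree_count(n):
    """Closed form (2n - 3)!! for the number of rooted binary trees on n >= 2 labelled taxa."""
    count = 1
    for odd in range(3, 2 * n - 2, 2):
        count *= odd
    return count


if __name__ == "__main__":
    trees = rooted_trees(["A", "B", "C", "D"])  # placeholder labels
    print(len(trees), rooted_tree_count(4))
    for tree in trees:
        print(tree)
```

Each nested tuple printed by the script corresponds to one guide-tree topology, that is, one order in which the four sequences would be joined during progressive alignment.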

  14. Bioinformatics education in India.

    PubMed

    Kulkarni-Kale, Urmila; Sawant, Sangeeta; Chavan, Vishwas

    2010-11-01

    An account of bioinformatics education in India is presented along with future prospects. The establishment of the BTIS network by the Department of Biotechnology (DBT), Government of India, in the 1980s was a systematic effort to develop bioinformatics infrastructure in India and to provide services to the scientific community. Advances in the field of bioinformatics underscored the need for well-trained professionals with skills in information technology and biotechnology. As a result, programmes for capacity building in terms of human resource development were initiated. Educational programmes gradually evolved from the organisation of short-term workshops to the institution of formal diploma/degree programmes. A case study of the Master's degree course offered at the Bioinformatics Centre, University of Pune is discussed. Currently, many universities and institutes offer bioinformatics courses at different levels, with variations in course content and degree of detail. The BioInformatics National Certification (BINC) examination, initiated in 2005 by DBT, provides a common yardstick to assess the knowledge and skill sets of students passing out of various institutions. The potential for broadening the scope of bioinformatics to transform it into a data-intensive discovery discipline is discussed. This necessitates amendments to the existing curricula to accommodate upcoming developments.

  15. Bioinformatics for cancer immunology and immunotherapy.

    PubMed

    Charoentong, Pornpimol; Angelova, Mihaela; Efremova, Mirjana; Gallasch, Ralf; Hackl, Hubert; Galon, Jerome; Trajanoski, Zlatko

    2012-11-01

    Recent mechanistic insights obtained from preclinical studies and the approval of the first immunotherapies have motivated an increasing number of academic investigators and pharmaceutical/biotech companies to further elucidate the role of immunity in tumor pathogenesis and to reconsider the role of immunotherapy. Additionally, technological advances (e.g., next-generation sequencing) are providing unprecedented opportunities to draw a comprehensive picture of the tumor genomics landscape and ultimately enable individualized treatment. However, the increasing complexity of the generated data and the plethora of bioinformatics methods and tools pose considerable challenges to both tumor immunologists and clinical oncologists. In this review, we describe current concepts and future challenges for the management and analysis of data for cancer immunology and immunotherapy. We first highlight publicly available databases with a specific focus on cancer immunology, including databases for somatic mutations and epitope databases. We then give an overview of the bioinformatics methods for the analysis of next-generation sequencing data (whole-genome and exome sequencing), epitope prediction tools, and methods for integrative data analysis and network modeling. Mathematical models are powerful tools that can predict and explain important patterns in the genetic and clinical progression of cancer. Therefore, a survey of mathematical models for tumor evolution and tumor-immune cell interaction is included. Finally, we discuss future challenges for individualized immunotherapy and suggest how combined computational/experimental approaches can lead to new insights into the molecular mechanisms of cancer, improve diagnosis and prognosis of the disease, and pinpoint novel therapeutic targets.

  16. Quantitative proteomics and bioinformatic analysis provide new insight into the dynamic response of porcine intestine to Salmonella Typhimurium

    PubMed Central

    Collado-Romero, Melania; Aguilar, Carmen; Arce, Cristina; Lucena, Concepción; Codrea, Marius C.; Morera, Luis; Bendixen, Emoke; Moreno, Ángela; Garrido, Juan J.

    2015-01-01

    The enteropathogen Salmonella Typhimurium (S. Typhimurium) is the non-typhoidal serotype most commonly isolated from pigs worldwide. Currently, one of the main sources of human infection is the consumption of pork meat. Therefore, prevention and control of salmonellosis in pigs is crucial for minimizing risks to public health. The aim of the present study was to use isobaric tags for relative and absolute quantification (iTRAQ) to explore differences in the response to Salmonella in two segments of the porcine gut (ileum and colon) along a time course of 1, 2, and 6 days post infection (dpi) with S. Typhimurium. A total of 298 proteins were identified in the infected ileum samples, of which 112 displayed significant expression differences due to Salmonella infection. In colon, 184 proteins were detected in the infected samples, of which 46 were differentially expressed with respect to the controls. The highest number of changes in protein expression was quantified in ileum at 2 dpi. Further biological interpretation of the proteomics data using bioinformatics tools demonstrated that the expression changes in colon were found in proteins involved in cell death and survival, tissue morphology or molecular transport at the early stages, and tissue regeneration at 6 dpi. In ileum, however, changes in protein expression were mainly related to immunological and infectious diseases, inflammatory response or connective tissue disorders at 1 and 2 dpi. iTRAQ has proved to be a robust proteomic approach, allowing us to identify ileum as the earliest focus of the response to S. Typhimurium in the porcine gut. In addition, new functions involved in the response to bacteria, such as eIF2 signaling, free radical scavengers or antimicrobial peptide (AMP) expression, have been identified. Finally, the impairment of the enterohepatic circulation of bile acids and lipid metabolism by means of the downregulation of FABP6 protein and the FXR/RXR and LXR/RXR signaling pathways in ileum has been

  17. Acid Rain Analysis by Standard Addition Titration.

    ERIC Educational Resources Information Center

    Ophardt, Charles E.

    1985-01-01

    The standard addition titration is a precise and rapid method for the determination of the acidity in rain or snow samples. The method requires use of a standard buret, a pH meter, and Gran's plot to determine the equivalence point. Experimental procedures used and typical results obtained are presented. (JN)
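
The abstract does not give the experimental procedure, so the following Python sketch only illustrates the general idea of a Gran plot for a strong acid titrated with a strong base: before the equivalence point, the Gran function F = (V0 + V) * 10^(-pH) decreases linearly with titrant volume V, and its x-intercept estimates the equivalence volume. All numbers (sample volume, concentrations, titrant additions) are synthetic assumptions generated from an ideal model, not data from the article, and activity effects are ignored.

```python
import math


def gran_equivalence_volume(v0_ml, titrant_ml, ph):
    """Estimate the equivalence volume as the x-intercept of a least-squares
    line through the Gran function F = (V0 + V) * 10**(-pH)."""
    f = [(v0_ml + v) * 10 ** (-p) for v, p in zip(titrant_ml, ph)]
    n = len(f)
    mean_v = sum(titrant_ml) / n
    mean_f = sum(f) / n
    sxx = sum((v - mean_v) ** 2 for v in titrant_ml)
    sxy = sum((v - mean_v) * (fi - mean_f) for v, fi in zip(titrant_ml, f))
    slope = sxy / sxx
    intercept = mean_f - slope * mean_v
    return -intercept / slope


def synthetic_ph(v0_ml, acid_molar, base_molar, titrant_ml):
    """Ideal pH values before equivalence for a strong acid plus strong base
    (synthetic data for illustration; assumes activity equals concentration)."""
    return [
        -math.log10((acid_molar * v0_ml - base_molar * v) / (v0_ml + v))
        for v in titrant_ml
    ]


if __name__ == "__main__":
    # Assumed values: 50 mL sample, 1e-4 M strong acid, 1e-3 M base titrant.
    v0, c_acid, c_base = 50.0, 1e-4, 1e-3
    volumes = [0.0, 1.0, 2.0, 3.0, 4.0]
    ph_values = synthetic_ph(v0, c_acid, c_base, volumes)
    v_eq = gran_equivalence_volume(v0, volumes, ph_values)
    # Acidity recovered from the equivalence volume: C_acid = C_base * V_eq / V0.
    print(v_eq, c_base * v_eq / v0)
```

With real measurements the points would scatter around the line, and only readings taken well before the equivalence point would normally be included in the fit.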

  18. Electrophoretic analysis of Allium alien addition lines.

    PubMed

    Peffley, E B; Corgan, J N; Horak, K E; Tanksley, S D

    1985-12-01

    Meiotic pairing in an interspecific triploid of Allium cepa and A. fistulosum, 'Delta Giant', exhibits preferential pairing between the two A. cepa genomes, leaving the A. fistulosum genome as univalents. Multivalent pairing involving A. fistulosum chromosomes occurs at a low level, allowing for recombination between the genomes. Ten trisomies were recovered from the backcross of 'Delta Giant' x A. cepa cv., 'Temprana', representing a minimum of four of the eight possible alien addition lines. The alien addition lines possessed different A. fistulosum enzyme markers. Those markers, Adh-1, Idh-1 and Pgm-1 reside on different A. fistulosum chromosomes, whereas Pgi-1 and Idh-1 may be linked. Diploid, trisomic and hyperploid progeny were recovered that exhibited putative pink root resistance. The use of interspecific plants as a means to introgress A. fistulosum genes into A. cepa appears to be successful at both the trisomic and the diploid levels. If introgression can be accomplished using an interspecific triploid such as 'Delta Giant' to generate fertile alien addition lines and subsequent fertile diploids, or if introgression can be accomplished directly at the diploid level, this will have accomplished gene flow that has not been possible at the interspecific diploid level.

  19. CDH1/E-cadherin and solid tumors. An updated gene-disease association analysis using bioinformatics tools.

    PubMed

    Abascal, María Florencia; Besso, María José; Rosso, Marina; Mencucci, María Victoria; Aparicio, Evangelina; Szapiro, Gala; Furlong, Laura Inés; Vazquez-Levin, Mónica Hebe

    2016-02-01

    Cancer is a group of diseases that causes millions of deaths worldwide. Among cancers, Solid Tumors (ST) stand out due to their high incidence and mortality rates. Disruption of cell-cell adhesion is highly relevant during tumor progression. Epithelial-cadherin (protein: E-cadherin, gene: CDH1) is a key molecule in cell-cell adhesion; its abnormal expression and/or function contributes to tumor progression and is altered in ST. A systematic study was carried out to gather and summarize current knowledge on CDH1/E-cadherin and ST using bioinformatics resources. The DisGeNET database was used to survey CDH1-associated diseases. Reported mutations in specific ST were obtained by interrogating the COSMIC and IntOGen tools. CDH1 Single Nucleotide Polymorphisms (SNP) were retrieved from the dbSNP database. DisGeNET analysis identified 609 genes annotated to ST, among which CDH1 was listed. Using CDH1 as the query term, 26 disease concepts were found, 21 of which were neoplasm-related terms. Using DisGeNET ALL Databases, 172 disease concepts were identified. Of those, 80 ST disease-related terms were subjected to manual curation and 75/80 (93.75%) associations were validated. For selected ST, 489 CDH1 somatic mutations were listed in the COSMIC and IntOGen databases. Breast neoplasms had the highest CDH1 mutation rate. CDH1 was positioned among the 20 genes with the highest mutation frequency and was confirmed as a driver gene in breast cancer. Over 14,000 SNP for CDH1 were found in the dbSNP database. This report used DisGeNET to gather and compile current knowledge on gene-disease association for CDH1/E-cadherin and ST; data curation expanded the number of terms that relate them. An updated list of CDH1 somatic mutations was obtained from the COSMIC and IntOGen databases, and of SNP from dbSNP. This information can be used to further understand the role of CDH1/E-cadherin in health and disease. PMID:26674224

  20. Mapping the Transcriptome-Wide Landscape of RBP Binding Sites Using gPAR-CLIP-seq: Bioinformatic Analysis.

    PubMed

    Freeberg, Mallory A; Kim, John K

    2016-01-01

    Protein-RNA interactions are integral components of posttranscriptional gene regulatory processes including mRNA processing and assembly of cellular architectures. Dysregulation of RNA-binding protein (RBP) expression or disruptions in RBP-RNA interactions underlie a variety of human pathologies and genetic diseases including cancer and neurodegenerative diseases (reviewed in (Cooper et al., Cell 136(4):777-793, 2009; Darnell, Cancer Res Treat 42(3):125-129, 2010; Lukong et al., Trends Genet 24 (8):416-425, 2008)). Recent studies have uncovered only a small proportion of the extensive RBP-RNA interactome in any organism (Baltz et al., Mol Cell 46(5):674-690, 2012; Castello et al., Cell 149(6):1393-1406, 2012; Freeberg et al., Genome Biol 14(2):R13, 2013; Hogan et al., PLoS Biol 6(10):e255, 2008; Mitchell et al., Nat Struct Mol Biol 20(1):127-133, 2013; Tsvetanova et al. PLoS One 5(9): pii: e12671, 2010; Schueler et al., Genome Biol 15(1):R15, 2014; Silverman et al., Genome Biol 15(1):R3, 2014). To expand our understanding of how RBP-RNA interactions govern RNA-related processes, we developed gPAR-CLIP-seq (global photoactivatable-ribonucleoside-enhanced cross-linking and precipitation followed by deep sequencing) for capturing and sequencing all regions of the Saccharomyces cerevisiae transcriptome bound by RBPs (Freeberg et al., Genome Biol 14(2):R13, 2013). This chapter describes a pipeline for bioinformatic analysis of gPAR-CLIP-seq data. The first half of this pipeline can be implemented by running locally installed programs or by running the programs using the Galaxy platform (Blankenberg et al., Curr Protoc Mol Biol. Chapter 19:Unit 19 10 11-21, 2010; Giardine et al., Genome Res 15 (10):1451-1455, 2005; Goecks et al., Genome Biol 11(8):R86, 2010). The second half of this pipeline can be implemented by user-generated code in any language using the pseudocode provided as a template. PMID:26483018

  2. Additives

    NASA Technical Reports Server (NTRS)

    Smalheer, C. V.

    1973-01-01

    The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.

  3. Providing web servers and training in Bioinformatics: 2010 update on the Bioinformatics Links Directory.

    PubMed

    Brazas, Michelle D; Yamada, Joseph T; Ouellette, B F Francis

    2010-07-01

    The Links Directory at Bioinformatics.ca continues its collaboration with Nucleic Acids Research to jointly publish and compile a freely accessible, online collection of tools, databases and resource materials for bioinformatics and molecular biology research. The July 2010 Web Server issue of Nucleic Acids Research adds an additional 115 web server tools and 7 updates to the directory at http://bioinformatics.ca/links_directory/, bringing the total number of servers listed close to an impressive 1500 links. The Bioinformatics Links Directory represents an excellent community resource for locating bioinformatic tools and databases to aid one's research, and in this context bioinformatic education needs and initiatives are discussed. A complete list of all links featured in this Nucleic Acids Research 2010 Web Server issue can be accessed online at http://bioinformatics.ca/links_directory/narweb2010/. The 2010 update of the Bioinformatics Links Directory, which includes the Web Server list and summaries, is also available online at the Nucleic Acids Research website, http://nar.oxfordjournals.org/.

  4. No-boundary thinking in bioinformatics research

    PubMed Central

    2013-01-01

    Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT). PMID:24192339

  5. A Bioinformatics Facility for NASA

    NASA Technical Reports Server (NTRS)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  6. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    PubMed Central

    Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  7. A survey of scholarly literature describing the field of bioinformatics education and bioinformatics educational research.

    PubMed

    Magana, Alejandra J; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students' attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  8. Microbial bioinformatics 2020.

    PubMed

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  9. Bioinformatics and its applications in plant biology.

    PubMed

    Rhee, Seung Yon; Dickerson, Julie; Xu, Dong

    2006-01-01

    Bioinformatics plays an essential role in today's plant science. As the amount of data grows exponentially, there is a parallel growth in the demand for tools and methods in data management, visualization, integration, analysis, modeling, and prediction. At the same time, many researchers in biology are unfamiliar with available bioinformatics methods, tools, and databases, which could lead to missed opportunities or misinterpretation of the information. In this review, we describe some of the key concepts, methods, software packages, and databases used in bioinformatics, with an emphasis on those relevant to plant science. We also cover some fundamental issues related to biological sequence analyses, transcriptome analyses, computational proteomics, computational metabolomics, bio-ontologies, and biological databases. Finally, we explore a few emerging research topics in bioinformatics.

  10. A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins

    PubMed Central

    Milano, Teresa

    2016-01-01

    The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH) architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average) that is homologous to fold type-I pyridoxal 5′-phosphate (PLP) dependent enzymes like aspartate aminotransferase (AAT). These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs). Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups. PMID:27446613

  11. A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins.

    PubMed

    Milano, Teresa; Angelaccio, Sebastiana; Tramonti, Angela; Di Salvo, Martino Luigi; Contestabile, Roberto; Pascarella, Stefano

    2016-01-01

    The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH) architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average) that is homologous to fold type-I pyridoxal 5'-phosphate (PLP) dependent enzymes like aspartate aminotransferase (AAT). These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs). Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups. PMID:27446613

  12. Bioinformatic and phylogenetic analysis of the CLAVATA3/EMBRYO-SURROUNDING REGION (CLE) and the CLE-LIKE signal peptide genes in the Pinophyta

    PubMed Central

    2014-01-01

    Background There is a rapidly growing awareness that plant peptide signalling molecules are numerous and varied and they are known to play fundamental roles in angiosperm plant growth and development. Two closely related peptide signalling molecule families are the CLAVATA3-EMBRYO-SURROUNDING REGION (CLE) and CLE-LIKE (CLEL) genes, which encode precursors of secreted peptide ligands that have roles in meristem maintenance and root gravitropism. Progress in peptide signalling molecule research in gymnosperms has lagged behind that of angiosperms. We therefore sought to identify CLE and CLEL genes in gymnosperms and conduct a comparative analysis of these gene families with angiosperms. Results We undertook a meta-analysis of the GenBank/EMBL/DDBJ gymnosperm EST database and the Picea abies and P. glauca genomes and identified 93 putative CLE genes and 11 CLEL genes among eight Pinophyta species, in the genera Cryptomeria, Pinus and Picea. The predicted conifer CLE and CLEL protein sequences had close phylogenetic relationships with their homologues in Arabidopsis. Notably, perfect conservation of the active CLE dodecapeptide in presumed orthologues of the Arabidopsis CLE41/44-TRACHEARY ELEMENT DIFFERENTIATION (TDIF) protein, an inhibitor of tracheary element (xylem) differentiation, was seen in all eight conifer species. We cloned the Pinus radiata CLE41/44-TDIF orthologues. These genes were preferentially expressed in phloem in planta as expected, but unexpectedly, also in differentiating tracheary element (TE) cultures. Surprisingly, transcript abundances of these TE differentiation-inhibitors sharply increased during early TE differentiation, suggesting that some cells differentiate into phloem cells in addition to TEs in these cultures. Applied CLE13 and CLE41/44 peptides inhibited root elongation in Pinus radiata seedlings. We show evidence that two CLEL genes are alternatively spliced via 3′-terminal acceptor exons encoding separate CLEL peptides

  13. BioWarehouse: a bioinformatics database warehouse toolkit

    PubMed Central

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

    2006-01-01

    Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. 
    Conclusion BioWarehouse embodies significant progress on the database integration problem for
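
    The kind of cross-database question described in this abstract (what fraction of enzyme activities has no sequence) can be sketched as a single SQL query once the sources share one schema. The sketch below is only an illustration under stated assumptions: the table names `enzyme_activity` and `protein_sequence`, their columns, and the toy rows are hypothetical and are not the BioWarehouse schema or its data, and SQLite stands in for the MySQL and Oracle managers named in the abstract.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (accession TEXT PRIMARY KEY,
                                       ec_number TEXT);
        -- Toy rows with placeholder identifiers, not real EC numbers.
        INSERT INTO enzyme_activity VALUES ('EC-A'), ('EC-B'), ('EC-C'), ('EC-D');
        INSERT INTO protein_sequence VALUES
            ('SEQ-1', 'EC-A'), ('SEQ-2', 'EC-A'),
            ('SEQ-3', 'EC-B'), ('SEQ-4', 'EC-C');
        """
    )
    return conn


def fraction_without_sequence(conn):
    """Return the fraction of enzyme activities with no linked sequence."""
    total, orphans = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN s.ec_number IS NULL THEN 1 ELSE 0 END)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    # With an empty enzyme table, COUNT is 0 and SUM is NULL.
    return orphans / total if total else 0.0


if __name__ == "__main__":
    print(fraction_without_sequence(build_toy_warehouse()))
```

    The LEFT JOIN keeps every enzyme activity and leaves a NULL where no sequence matches, so one pass over the joined rows yields both the total and the count of activities lacking a sequence.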

  15. SeqBuster, a bioinformatic tool for the processing and analysis of small RNAs datasets, reveals ubiquitous miRNA modifications in human embryonic cells.

    PubMed

    Pantano, Lorena; Estivill, Xavier; Martí, Eulàlia

    2010-03-01

    High-throughput sequencing technologies enable direct approaches to catalog and analyze snapshots of the total small RNA content of living cells. Characterization of high-throughput sequencing data requires bioinformatic tools offering a wide perspective of the small RNA transcriptome. Here we present SeqBuster, a highly versatile and reliable web-based toolkit to process and analyze large-scale small RNA datasets. The high flexibility of this tool is illustrated by the multiple choices offered in the pre-analysis for mapping purposes and in the different analysis modules for data manipulation. To overcome the storage capacity limitations of the web-based tool, SeqBuster offers a stand-alone version that permits the annotation against any custom database. SeqBuster integrates multiple analyses modules in a unique platform and constitutes the first bioinformatic tool offering a deep characterization of miRNA variants (isomiRs). The application of SeqBuster to small-RNA datasets of human embryonic stem cells revealed that most miRNAs present different types of isomiRs, some of them being associated to stem cell differentiation. The exhaustive description of the isomiRs provided by SeqBuster could help to identify miRNA-variants that are relevant in physiological and pathological processes. SeqBuster is available at http://estivill_lab.crg.es/seqbuster. PMID:20008100

  16. An Online Bioinformatics Curriculum

    PubMed Central

    Searls, David B.

    2012-01-01

    Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field. PMID:23028269

  17. Bioinformatics software resources.

    PubMed

    Gilbert, Don

    2004-09-01

    This review looks at internet archives, repositories and lists for obtaining popular and useful biology and bioinformatics software. Resources include collections of free software, services for the collaborative development of new programs, software news media and catalogues of links to bioinformatics software and web tools. Problems with such resources arise from needs for continued curator effort to collect and update these, combined with less than optimal community support, funding and collaboration. Despite some problems, the available software repositories provide needed public access to many tools that are a foundation for analyses in bioscience research efforts.

  18. Altered hippocampal microRNA expression profiles in neonatal rats caused by sevoflurane anesthesia: MicroRNA profiling and bioinformatics target analysis

    PubMed Central

    Ye, Jishi; Zhang, Zongze; Wang, Yanlin; Chen, Chang; Xu, Xing; Yu, Hui; Peng, Mian

    2016-01-01

    Although accumulating evidence has suggested that microRNAs (miRNAs) have a serious impact on cognitive function and are associated with the etiology of several neuropsychiatric disorders, their expression in sevoflurane-induced neurotoxicity in the developing brain has not been characterized. In the present study, the miRNA expression pattern in neonatal hippocampus samples (24 h after sevoflurane exposure) was investigated, and 9 miRNAs associated with brain development and cognition were selected for bioinformatic analysis. A previous microfluidic chip assay had detected 29 upregulated and 24 downregulated miRNAs in the neonatal rat hippocampus, of which 7 selected deregulated miRNAs were identified by the quantitative polymerase chain reaction. A total of 85 targets of selected deregulated miRNAs were analyzed using bioinformatics and the main enriched metabolic pathways, mitogen-activated protein kinase and Wnt pathways may have been involved in molecular mechanisms with regard to neuronal cell body, dendrite and synapse. The observations of the present study provided a novel understanding regarding the regulatory mechanism of miRNAs underlying sevoflurane-induced neurotoxicity, therefore benefitting the improvement of the prevention and treatment strategies of volatile anesthetic-related neurotoxicity. PMID:27588052

  19. Bioinformatics and School Biology

    ERIC Educational Resources Information Center

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  20. Bioinformatics tools and database resources for systems genetics analysis in mice--a short review and an evaluation of future needs.

    PubMed

    Durrant, Caroline; Swertz, Morris A; Alberts, Rudi; Arends, Danny; Möller, Steffen; Mott, Richard; Prins, Pjotr; van der Velde, K Joeri; Jansen, Ritsert C; Schughart, Klaus

    2012-03-01

    During a meeting of the SYSGENET working group 'Bioinformatics', currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future compatibility of software and exchange of already developed software modules, a strong recommendation was made by the group to integrate HAPPY and R/qtl analysis toolboxes, GeneNetwork and XGAP database platforms, and TIQS and xQTL processing platforms. R should be used as the principal computer language for QTL data analysis in all platforms and a 'cloud' should be used for software dissemination to the community. Furthermore, the working group recommended that all data models and software source code should be made visible in public repositories to allow a coordinated effort on the use of common data structures and file formats.

  1. Highlighting computations in bioscience and bioinformatics: review of the Symposium of Computations in Bioinformatics and Bioscience (SCBB07)

    PubMed Central

    Lu, Guoqing; Ni, Jun

    2008-01-01

    The Second Symposium on Computations in Bioinformatics and Bioscience (SCBB07) was held in Iowa City, Iowa, USA, on August 13–15, 2007. This annual event attracted dozens of bioinformatics professionals and students, who are interested in solving emerging computational problems in bioscience, from China, Japan, Taiwan and the United States. The Scientific Committee of the symposium selected 18 peer-reviewed papers for publication in this supplemental issue of BMC Bioinformatics. These papers cover a broad spectrum of topics in computational biology and bioinformatics, including DNA, protein and genome sequence analysis, gene expression and microarray analysis, computational proteomics and protein structure classification, systems biology and machine learning. PMID:18541044

  2. Regulatory bioinformatics for food and drug safety.

    PubMed

    Healy, Marion J; Tong, Weida; Ostroff, Stephen; Eichler, Hans-Georg; Patak, Alex; Neuspiel, Margaret; Deluyker, Hubert; Slikker, William

    2016-10-01

    "Regulatory Bioinformatics" strives to develop and implement a standardized and transparent bioinformatic framework to support the implementation of existing and emerging technologies in regulatory decision-making. It has great potential to improve public health through the development and use of clinically important medical products and tools to manage the safety of the food supply. However, the application of regulatory bioinformatics also poses new challenges and requires new knowledge and skill sets. In the latest Global Coalition on Regulatory Science Research (GCRSR) governed conference, Global Summit on Regulatory Science (GSRS2015), regulatory bioinformatics principles were presented with respect to global trends, initiatives and case studies. The discussion revealed that datasets, analytical tools, skills and expertise are rapidly developing, in many cases via large international collaborative consortia. It also revealed that significant research is still required to realize the potential applications of regulatory bioinformatics. While there is significant excitement in the possibilities offered by precision medicine to enhance treatments of serious and/or complex diseases, there is a clear need for further development of mechanisms to securely store, curate and share data, integrate databases, and standardized quality control and data analysis procedures. A greater understanding of the biological significance of the data is also required to fully exploit vast datasets that are becoming available. The application of bioinformatics in the microbiological risk analysis paradigm is delivering clear benefits both for the investigation of food borne pathogens and for decision making on clinically important treatments. It is recognized that regulatory bioinformatics will have many beneficial applications by ensuring high quality data, validated tools and standardized processes, which will help inform the regulatory science community of the requirements

  4. Towards a career in bioinformatics

    PubMed Central

    2009-01-01

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, dating from 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the area of life sciences, systems biology and clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in undergraduate bioinformatics curriculum provides information on how to impart targeted education, to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508

  5. Additive interaction in survival analysis: use of the additive hazards model.

    PubMed

    Rod, Naja Hulvej; Lange, Theis; Andersen, Ingelise; Marott, Jacob Louis; Diderichsen, Finn

    2012-09-01

    It is a widely held belief in public health and clinical decision-making that interventions or preventive strategies should be aimed at patients or population subgroups where most cases could potentially be prevented. To identify such subgroups, deviation from additivity of absolute effects is the relevant measure of interest. Multiplicative survival models, such as the Cox proportional hazards model, are often used to estimate the association between exposure and risk of disease in prospective studies. In Cox models, deviations from additivity have usually been assessed by surrogate measures of additive interaction derived from multiplicative models-an approach that is both counter-intuitive and sometimes invalid. This paper presents a straightforward and intuitive way of assessing deviation from additivity of effects in survival analysis by use of the additive hazards model. The model directly estimates the absolute size of the deviation from additivity and provides confidence intervals. In addition, the model can accommodate both continuous and categorical exposures and models both exposures and potential confounders on the same underlying scale. To illustrate the approach, we present an empirical example of interaction between education and smoking on risk of lung cancer. We argue that deviations from additivity of effects are important for public health interventions and clinical decision-making, and such estimations should be encouraged in prospective studies on health. A detailed implementation guide of the additive hazards model is provided in the appendix.
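
    The quantity discussed in this abstract, the deviation from additivity of absolute effects, can be illustrated with simple arithmetic on rates. A generic additive hazards model with a product term can be written as h(t | x1, x2) = h0(t) + b1*x1 + b2*x2 + b3*x1*x2, where b3 is the deviation from additivity. The sketch below is an assumption-laden illustration only: it uses crude rates in four exposure groups instead of fitting the model, the function names are hypothetical, and the counts are invented and are not the results of the paper.

```python
def rate_per_1000(events, person_years):
    """Crude incidence rate per 1000 person-years."""
    return events * 1000 / person_years


def additive_interaction(r00, r10, r01, r11):
    """Deviation from additivity of absolute effects.

    r00: rate with neither exposure, r10 / r01: rate with one exposure,
    r11: rate with both. A value of 0 means the joint effect equals the
    sum of the two single effects.
    """
    joint_effect = r11 - r00
    sum_of_single_effects = (r10 - r00) + (r01 - r00)
    return joint_effect - sum_of_single_effects


if __name__ == "__main__":
    # Invented counts for two binary exposures (illustration only).
    r00 = rate_per_1000(10, 10_000)
    r10 = rate_per_1000(20, 10_000)
    r01 = rate_per_1000(30, 10_000)
    r11 = rate_per_1000(60, 10_000)
    print(additive_interaction(r00, r10, r01, r11))
```

    A positive result indicates that more cases occur in the doubly exposed group than the two single effects would predict, which is the situation the abstract describes as relevant for targeting interventions.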

  6. Feature selection in bioinformatics

    NASA Astrophysics Data System (ADS)

    Wang, Lipo

    2012-06-01

    In bioinformatics, there are often a large number of input features. For example, there are millions of single nucleotide polymorphisms (SNPs) that are genetic variations which determine the difference between any two unrelated individuals. In microarrays, thousands of genes can be profiled in each test. It is important to find out which input features (e.g., SNPs or genes) are useful in classification of a certain group of people or diagnosis of a given disease. In this paper, we investigate some powerful feature selection techniques and apply them to problems in bioinformatics. We are able to identify a very small number of input features sufficient for tasks at hand and we demonstrate this with some real-world data.
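
    The abstract does not name the feature selection techniques it investigates, so the following is only a generic sketch of one common family, filter-based ranking, and not the author's method. It scores each feature by the gap between the two class means divided by the summed class standard deviations (a signal-to-noise style score) and keeps the top k. The function names and the toy data are hypothetical.

```python
from statistics import mean, stdev


def score_feature(values, labels):
    """Signal-to-noise style score for one feature and binary labels (0/1).

    Each class needs at least two samples, otherwise stdev raises an error.
    """
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    gap = abs(mean(class0) - mean(class1))
    spread = stdev(class0) + stdev(class1)
    if spread == 0:
        # Constant within both classes: perfect separation or no signal.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def select_top_k(rows, labels, k):
    """Return the indices of the k highest-scoring features."""
    n_features = len(rows[0])
    scored = [
        (score_feature([row[j] for row in rows], labels), j)
        for j in range(n_features)
    ]
    # Highest score first; ties broken by lower feature index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [j for _, j in scored[:k]]


if __name__ == "__main__":
    # Toy data: 4 samples x 3 features; only feature 1 separates the classes.
    rows = [[1, 0.0, 5], [2, 0.1, 4], [1, 10.0, 5], [2, 10.1, 4]]
    labels = [0, 0, 1, 1]
    print(select_top_k(rows, labels, 1))
```

    Filter methods like this score each feature independently of any classifier, which keeps them cheap when there are thousands of genes or millions of SNPs, at the cost of ignoring interactions between features.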

  7. Forensic DNA and bioinformatics.

    PubMed

    Bianchi, Lucia; Liò, Pietro

    2007-03-01

    The field of forensic science is increasingly based on biomolecular data and many European countries are establishing forensic databases to store DNA profiles of crime scenes of known offenders and apply DNA testing. The field is boosted by statistical and technological advances such as DNA microarray sequencing, TFT biosensors, machine learning algorithms, in particular Bayesian networks, which provide an effective way of evidence organization and inference. The aim of this article is to discuss the state-of-the-art potential of bioinformatics in forensic DNA science. We also discuss how bioinformatics will address issues related to privacy rights such as those raised from large scale integration of crime, public health and population genetic susceptibility-to-diseases databases.

  8. Bioinformatic and functional analysis of RNA secondary structure elements among different genera of human and animal caliciviruses

    PubMed Central

    Simmonds, Peter; Karakasiliotis, Ioannis; Bailey, Dalan; Chaudhry, Yasmin; Evans, David J.; Goodfellow, Ian G.

    2008-01-01

    The mechanism and role of RNA structure elements in the replication and translation of Caliciviridae remains poorly understood. Several algorithmically independent methods were used to predict secondary structures within the Norovirus, Sapovirus, Vesivirus and Lagovirus genera. All showed profound suppression of synonymous site variability (SSSV) at genomic 5′ ends and the start of the sub-genomic (sg) transcript, consistent with evolutionary constraints from underlying RNA structure. A newly developed thermodynamic scanning method predicted RNA folding mapping precisely to regions of SSSV and at the genomic 3′ end. These regions contained several evolutionarily conserved RNA secondary structures, of variable size and positions. However, all caliciviruses contained 3′ terminal hairpins, and stem–loops in the anti-genomic strand invariably six bases upstream of the sg transcript, indicating putative roles as sg promoters. Using the murine norovirus (MNV) reverse-genetics system, disruption of 5′ end stem–loops produced ∼15- to 20-fold infectivity reductions, while disruption of the RNA structure in the sg promoter region and at the 3′ end entirely destroyed replication ability. Restoration of infectivity by repair mutations in the sg promoter region confirmed a functional role for the RNA secondary structure, not the sequence. This study provides comprehensive bioinformatic resources for future functional studies of MNV and other caliciviruses. PMID:18319285

  9. Bioinformatic analysis for allergenicity assessment of Bacillus thuringiensis Cry proteins expressed in insect-resistant food crops.

    PubMed

    Randhawa, Gurinder Jit; Singh, Monika; Grover, Monendra

    2011-02-01

    The novel proteins introduced into the genetically modified (GM) crops need to be evaluated for the potential allergenicity before their introduction into the food chain to address the safety concerns of consumers. At present, there is no single definitive test that can be relied upon to predict allergic response in humans to a new protein; hence a composite approach to allergic response prediction is described in this study. The present study reports on the evaluation of the Cry proteins, encoded by cry1Ac, cry1Ab, cry2Ab, cry1Ca, cry1Fa/cry1Ca hybrid, being expressed in Bt food crops that are under field trials in India, for potential allergenic cross-reactivity using bioinformatics search tools. The sequence identity of amino acids was analyzed using FASTA3 of AllergenOnline version 10.0 and BLASTX of NCBI Entrez to identify any potential sequence matches to allergen proteins. As a step further in the detection of allergens, an independent database of domains in the allergens available in the AllergenOnline database was also developed. The results indicated no significant alignment and similarity of Cry proteins at domain level with any of the known allergens revealing that there is no potential risk of allergenic cross-reactivity.

  10. The Cytotoxicity Mechanism of 6-Shogaol-Treated HeLa Human Cervical Cancer Cells Revealed by Label-Free Shotgun Proteomics and Bioinformatics Analysis

    PubMed Central

    Liu, Qun; Peng, Yong-Bo; Qi, Lian-Wen; Cheng, Xiao-Lan; Xu, Xiao-Jun; Liu, Le-Le; Liu, E-Hu; Li, Ping

    2012-01-01

    Cervical cancer is one of the most common cancers among women in the world. 6-Shogaol is a natural compound isolated from the rhizome of ginger (Zingiber officinale). In this paper, we demonstrated that 6-shogaol induced apoptosis and G2/M phase arrest in human cervical cancer HeLa cells. Endoplasmic reticulum stress and mitochondrial pathway were involved in 6-shogaol-mediated apoptosis. Proteomic analysis based on label-free strategy by liquid chromatography chip quadrupole time-of-flight mass spectrometry was subsequently proposed to identify, in a non-target-biased manner, the molecular changes in cellular proteins in response to 6-shogaol treatment. A total of 287 proteins were differentially expressed in response to 24 h treatment with 15 μM 6-shogaol in HeLa cells. Significantly changed proteins were subjected to functional pathway analysis by multiple analyzing software. Ingenuity pathway analysis (IPA) suggested that 14-3-3 signaling is a predominant canonical pathway involved in networks which may be significantly associated with the process of apoptosis and G2/M cell cycle arrest induced by 6-shogaol. In conclusion, this work developed an unbiased protein analysis strategy by shotgun proteomics and bioinformatics analysis. Data observed provide a comprehensive analysis of the 6-shogaol-treated HeLa cell proteome and reveal protein alterations that are associated with its anticancer mechanism. PMID:23243437

  11. Bioinformatic analysis of the distribution of inorganic carbon transporters and prospective targets for bioengineering to increase Ci uptake by cyanobacteria.

    PubMed

    Gaudana, Sandeep B; Zarzycki, Jan; Moparthi, Vamsi K; Kerfeld, Cheryl A

    2015-10-01

    Cyanobacteria have evolved a carbon-concentrating mechanism (CCM) which has enabled them to inhabit diverse environments encompassing a range of inorganic carbon (Ci: HCO3- and CO2) concentrations. Several uptake systems facilitate inorganic carbon accumulation in the cell, which can in turn be fixed by ribulose 1,5-bisphosphate carboxylase/oxygenase. Here we survey the distribution of genes encoding known Ci uptake systems in cyanobacterial genomes and, using a pfam- and gene context-based approach, identify in the marine (alpha) cyanobacteria a heretofore unrecognized number of putative counterparts to the well-known Ci transporters of beta cyanobacteria. In addition, our analysis shows that there is a huge repertoire of transport systems in cyanobacteria of unknown function, many with homology to characterized Ci transporters. These can be viewed as prospective targets for conversion into ancillary Ci transporters through bioengineering. Increasing intracellular Ci concentration coupled with efforts to increase carbon fixation will be beneficial for the downstream conversion of fixed carbon into value-added products including biofuels. In addition to CCM transporter homologs, we also survey the occurrence of rhodopsin homologs in cyanobacteria, including bacteriorhodopsin, a class of retinal-binding, light-activated proton pumps. Because they are light driven and because of the apparent ease of altering their ion selectivity, we use this as an example of re-purposing an endogenous transporter for the augmentation of Ci uptake by cyanobacteria and potentially chloroplasts.

  12. E2F, HSF2, and miR-26 in thyroid carcinoma: bioinformatic analysis of RNA-sequencing data.

    PubMed

    Lu, J C; Zhang, Y P

    2016-01-01

    In this study, we examined the molecular mechanism of thyroid carcinoma (THCA) using bioinformatics. RNA-sequencing data of THCA (N = 498) and normal thyroid tissue (N = 59) were downloaded from The Cancer Genome Atlas. Next, gene expression levels were calculated using the TCC package and differentially expressed genes (DEGs) were identified using the edgeR package. A co-expression network was constructed using the EBcoexpress package and visualized by Cytoscape, and functional and pathway enrichment of DEGs in the co-expression network was analyzed with DAVID and KOBAS 2.0. Moreover, modules in the co-expression network were identified and annotated using MCODE and BiNGO plugins. Small-molecule drugs were analyzed using the cMAP database, and miRNAs and transcription factors regulating DEGs were identified by WebGestalt. A total of 254 up-regulated and 59 down-regulated DEGs were identified between THCA samples and controls. DEGs enriched in biological process terms were related to cell adhesion, death, and growth and negatively correlated with various small-molecule drugs. The co-expression network of the DEGs consisted of hub genes (ITGA3, TIMP1, KRT19, and SERPINA1) and one module (JUN, FOSB, and EGR1). Furthermore, 5 miRNAs and 5 transcription factors were identified, including E2F, HSF2, and miR-26. miR-26 may participate in THCA by targeting CITED1 and PLA2R1; E2F may participate in THCA by regulating ITGA3, TIMP1, KRT19, EGR1, and JUN; HSF2 may be involved in THCA development by regulating SERPINA1 and FOSB; and small-molecule drugs may have anti-THCA effects. Our results provide novel directions for mechanistic studies and drug design of THCA. PMID:26985959

  13. How might ZNF804A variants influence risk for schizophrenia and bipolar disorder? A literature review, synthesis, and bioinformatic analysis.

    PubMed

    Hess, Jonathan L; Glatt, Stephen J

    2014-01-01

    The gene that encodes zinc finger protein 804A (ZNF804A) became a candidate risk gene for schizophrenia (SZ) after surpassing genome-wide significance thresholds in replicated genome-wide association scans and meta-analyses. Much remains unknown about this reported gene expression regulator; however, preliminary work has yielded insights into functional and biological effects of ZNF804A by targeting its regulatory activities in vitro and by characterizing allele-specific interactions with its risk-conferring single nucleotide polymorphisms (SNPs). There is now strong epidemiologic evidence for a role of ZNF804A polymorphisms in both SZ and bipolar disorder (BD); however, functional links between implicated variants and susceptible biological states have not been solidified. Here we briefly review the genetic evidence implicating ZNF804A polymorphisms as genetic risk factors for both SZ and BD, and discuss the potential functional consequences of these variants on the regulation of ZNF804A and its downstream targets. Empirical work and predictive bioinformatic analyses of the alternate alleles of the two most strongly implicated ZNF804A polymorphisms suggest they might alter the affinity of the gene sequence for DNA- and/or RNA-binding proteins, which might in turn alter expression levels of the gene or particular ZNF804A isoforms. Future work should focus on clarifying the critical periods and cofactors regulating these genetic influences on ZNF804A expression, as well as the downstream biological consequences of an imbalance in the expression of ZNF804A and its various mRNA isoforms.

  14. LXtoo: an integrated live Linux distribution for the bioinformatics community

    PubMed Central

    2012-01-01

    Background: Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings: Unlike most of the existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions: LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356

  15. Bioinformatics and cancer: an essential alliance.

    PubMed

    Dopazo, Joaquín

    2006-06-01

    Modern research in cancer has been revolutionized by the introduction of new high-throughput methodologies such as DNA microarrays. Keeping pace with these technologies, bioinformatics offers new solutions for data analysis and, more importantly, makes it possible to formulate a new class of hypotheses inspired by systems biology and oriented more toward blocks of functionally related genes. Although software implementations for these new methodologies are recent, some options are already available. Bioinformatic solutions for other high-throughput techniques, such as array-CGH or large-scale genotyping, are also reviewed.

  17. Genomics Politics through Space and Time: The Case of Bioinformatics in Brazil.

    PubMed

    Bicudo, Edison

    2016-01-01

    The emergence of scientific disciplines, as well as the policies aimed to steer them, have geographical implications. This becomes visible in areas such as genomics and related fields. In this paper, the relation between scientific evolution, political decisions and geographical configuration is studied. The recent formation of bioinformatics in Brazil is focused on. The study involves an analysis of data collected on the website of CNPq, a funding agency attached to the Ministry of Science and Technology. Furthermore, I conducted fieldwork in four cities, interviewing 15 bioinformaticians. In the history of Brazilian bioinformatics, three periods can be identified. In the first period (1900-1996), bioinformatics was actually absent, but biology research groups were formed which would subsequently explore bioinformatics. The second period (1997-2006) was marked by the emergence of the discipline and geographical concentration of major research groups in the southern part of Brazil. A third period can be pointed to (2007-2014), in which political choices have turned geographical diffusion and institutional equality into a national target. As a consequence of the recent shifts, genomics and bioinformatics researchers have been involved in a debate, some defending the existence of few specialized research and sequencing platforms, whereas others welcoming the constitution of a scientific scenario based on decentralized platforms. I defend an intermediate solution, whereby some places would be selected to be genomics hubs. This would fit the regional diversity of this vast country, in addition to tackling the scientific weaknesses of the northern area. PMID:26890397

  18. Bioinformatics of prokaryotic RNAs.

    PubMed

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly high levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art in RNA bioinformatics, focusing on applications to prokaryotes.

  19. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly high levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art in RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  1. Phage-display library biopanning and bioinformatic analysis yielded a high-affinity peptide to inflamed vascular endothelium both in vitro and in vivo.

    PubMed

    Yang, Min; Liu, Chenwu; Niu, Maochang; Hu, Yonghe; Guo, Mingyang; Zhang, Jun; Luo, Yong; Yuan, Weili; Yang, Mei; Yun, Mingdong; Guo, Linling; Yan, Jiao; Liu, Defang; Liu, Jinghua; Jiang, Yong

    2014-01-28

    Vascular inflammation is considered the primary pathological condition occurring in many chronic diseases. To detect the inflamed endothelium via imaging analysis or guide the drug to target lesions is therefore important for early diagnosis and treatment of vascular inflammatory diseases. In this study, we obtained a novel peptide NTTTH through high-throughput biopanning and bioinformatic analysis. In vitro studies indicated that NTTTH homologs could especially target inflamed vascular endothelial cells, as imaging quantitative analysis indicated that the mean of integrated optical density (MIOD) and mean of stained area (MSA) were significantly higher versus control (P<0.05). In vivo studies showed that, after intravenous injection of enhanced green fluorescent protein (EGFP)-labeled NTTTH homologs into the lipopolysaccharide (LPS)-inflamed mice for 30 min, NTTTH homologs were distributed in highly vascularized and inflamed organs like liver and kidney. As a control, little fluorescence could be detected in mice injected with EGFP alone. Cryosection showed that NTTTH homologs especially targeted inflamed vasculatures but not normal ones. We did not detect fluorescence signal in either normal or inflamed mice which were injected with EGFP alone. The results suggested the role of NTTTH homologs in guiding the targeted binding of EGFP to inflamed vasculature and the potential usage for imaging detection and drug delivery.

  2. Company strategies for using bioinformatics.

    PubMed

    Bains, W

    1996-08-01

    Bioinformatics enables biotechnology companies to access and analyse their growing databases of experimental results, and to exploit public data from genome programmes and other sources. Traditionally occupying the domain of a 'guru' supplying answers to infrequent research questions, corporate bioinformatics is breaking down under the flood of data. New, more robust, professional and expandable systems will give scientists effective access to new tools. This review outlines how companies have evolved beyond the 'guru', and have organized their bioinformatics by acquiring or developing bioinformatics resources. It also describes why the biologist must be central to this process, and why this is a problem for computer professionals to solve, not for 'gurus'.

  3. An Integrated Bioinformatics Analysis Reveals Divergent Evolutionary Pattern of Oil Biosynthesis in High- and Low-Oil Plants.

    PubMed

    Zhang, Li; Wang, Shi-Bo; Li, Qi-Gang; Song, Jian; Hao, Yu-Qi; Zhou, Ling; Zheng, Huan-Quan; Dunwell, Jim M; Zhang, Yuan-Ming

    2016-01-01

    Seed oils provide a renewable source of food, biofuel and industrial raw materials that is important for humans. Although many genes and pathways for acyl-lipid metabolism have been identified, little is known about whether there is a specific mechanism for high-oil content in high-oil plants. Based on the distinct differences in seed oil content between four high-oil dicots (20~50%) and three low-oil grasses (<3%), comparative genome, transcriptome and differential expression analyses were used to investigate this mechanism. Among 4,051 dicot-specific soybean genes identified from 252,443 genes in the seven species, 54 genes were shown to directly participate in acyl-lipid metabolism, and 93 genes were found to be associated with acyl-lipid metabolism. Among the 93 dicot-specific genes, 42 and 27 genes, including CBM20-like SBDs and GPT2, participate in carbohydrate degradation and transport, respectively. 40 genes highly up-regulated during seed oil rapid accumulation period are mainly involved in initial fatty acid synthesis, triacylglyceride assembly and oil-body formation, for example, ACCase, PP, DGAT1, PDAT1, OLEs and STEROs, which were also found to be differentially expressed between high- and low-oil soybean accessions. Phylogenetic analysis revealed distinct differences of oleosin in patterns of gene duplication and loss between high-oil dicots and low-oil grasses. In addition, seed-specific GmGRF5, ABI5 and GmTZF4 were predicted to be candidate regulators in seed oil accumulation. This study facilitates future research on lipid biosynthesis and potential genetic improvement of seed oil content. PMID:27159078

  4. An Integrated Bioinformatics Analysis Reveals Divergent Evolutionary Pattern of Oil Biosynthesis in High- and Low-Oil Plants

    PubMed Central

    Zhang, Li; Wang, Shi-Bo; Li, Qi-Gang; Song, Jian; Hao, Yu-Qi; Zhou, Ling; Zheng, Huan-Quan; Dunwell, Jim M.; Zhang, Yuan-Ming

    2016-01-01

    Seed oils provide a renewable source of food, biofuel and industrial raw materials that is important for humans. Although many genes and pathways for acyl-lipid metabolism have been identified, little is known about whether there is a specific mechanism for high-oil content in high-oil plants. Based on the distinct differences in seed oil content between four high-oil dicots (20~50%) and three low-oil grasses (<3%), comparative genome, transcriptome and differential expression analyses were used to investigate this mechanism. Among 4,051 dicot-specific soybean genes identified from 252,443 genes in the seven species, 54 genes were shown to directly participate in acyl-lipid metabolism, and 93 genes were found to be associated with acyl-lipid metabolism. Among the 93 dicot-specific genes, 42 and 27 genes, including CBM20-like SBDs and GPT2, participate in carbohydrate degradation and transport, respectively. 40 genes highly up-regulated during seed oil rapid accumulation period are mainly involved in initial fatty acid synthesis, triacylglyceride assembly and oil-body formation, for example, ACCase, PP, DGAT1, PDAT1, OLEs and STEROs, which were also found to be differentially expressed between high- and low-oil soybean accessions. Phylogenetic analysis revealed distinct differences of oleosin in patterns of gene duplication and loss between high-oil dicots and low-oil grasses. In addition, seed-specific GmGRF5, ABI5 and GmTZF4 were predicted to be candidate regulators in seed oil accumulation. This study facilitates future research on lipid biosynthesis and potential genetic improvement of seed oil content. PMID:27159078

  5. Pattern recognition in bioinformatics.

    PubMed

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained.
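
    To make the classification task described above concrete, here is a minimal sketch of a nearest-centroid classifier, one of the simplest statistical pattern recognition methods. It is not taken from the course material; the function names and the toy two-feature samples are assumptions for illustration.

```python
def fit_centroids(samples, labels):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign x the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

# Toy example instances (hypothetical): two features per sample, two classes.
samples = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
labels = ["low", "low", "high", "high"]

centroids = fit_centroids(samples, labels)
print(predict(centroids, [1.1, 0.9]))
print(predict(centroids, [2.9, 3.3]))
```

    In practice the predictions would be evaluated on instances held out from training, since scoring a classifier on its own training set is one of the common pitfalls in applications of this kind.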

  7. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    PubMed

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  8. Computed Tomography Inspection and Analysis for Additive Manufacturing Components

    NASA Technical Reports Server (NTRS)

    Beshears, Ronald D.

    2016-01-01

    Computed tomography (CT) inspection was performed on test articles additively manufactured from metallic materials. Metallic AM and machined wrought alloy test articles with programmed flaws were inspected using a 2 MeV linear-accelerator-based CT system. Performance of CT inspection on identically configured wrought and AM components and programmed flaws was assessed using standard image analysis techniques to determine the impact of additive manufacturing on inspectability of objects with complex geometries.

  9. Additivity in the Analysis and Design of HIV Protease Inhibitors

    PubMed Central

    Jorissen, Robert N.; Kiran Kumar Reddy, G. S.; Ali, Akbar; Altman, Michael D.; Chellappan, Sripriya; Anjum, Saima G.; Tidor, Bruce; Schiffer, Celia A.; Rana, Tariq M.; Gilson, Michael K.

    2009-01-01

    We explore the applicability of an additive treatment of substituent effects to the analysis and design of HIV protease inhibitors. Affinity data for a set of inhibitors with a common chemical framework were analyzed to provide estimates of the free energy contribution of each chemical substituent. These estimates were then used to design new inhibitors, whose high affinities were confirmed by synthesis and experimental testing. Derivations of additive models by least-squares and ridge-regression methods were found to yield statistically similar results. The additivity approach was also compared with standard molecular descriptor-based QSAR; the latter was not found to provide superior predictions. Crystallographic studies of HIV protease-inhibitor complexes help explain the perhaps surprisingly high degree of substituent additivity in this system, and allow some of the additivity coefficients to be rationalized on a structural basis. PMID:19193159
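
    The additive treatment described above can be sketched as a linear model with one coefficient per substituent, fitted by ridge regression. With a very small penalty the fit approaches the least-squares solution. The substituent labels, free energy values, and function names below are hypothetical and chosen only to illustrate the idea; they are not data from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_additive(compounds, affinities, groups, lam=1e-6):
    """Ridge fit of one coefficient per substituent: (X^T X + lam I) w = X^T y."""
    # Indicator matrix: 1.0 if the compound carries the substituent, else 0.0
    X = [[1.0 if g in cpd else 0.0 for g in groups] for cpd in compounds]
    p = len(groups)
    XtX = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, affinities)) for i in range(p)]
    return dict(zip(groups, solve(XtX, Xty)))

# Hypothetical series: position 1 carries a, b or c; position 2 carries x or y.
groups = ["a", "b", "c", "x", "y"]
train = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "x")]
dG = [-4.0, -2.5, -5.0, -3.5, -3.5]  # made-up binding free energies (kcal/mol)

w = fit_additive(train, dG, groups)
pred = w["c"] + w["y"]  # predicted affinity of the unsynthesized (c, y) compound
print(round(pred, 3))
```

    Because the indicator columns for the two positions are linearly dependent, the individual coefficients are not unique without the penalty, but the predicted affinity of a new combination of already observed substituents is.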

  10. Virtual Bioinformatics Distance Learning Suite

    ERIC Educational Resources Information Center

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  11. Bioinformatics meets parasitology.

    PubMed

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention.

  12. Channelrhodopsins: a bioinformatics perspective.

    PubMed

    Del Val, Coral; Royuela-Flor, José; Milenkovic, Stefan; Bondar, Ana-Nicoleta

    2014-05-01

    Channelrhodopsins are microbial-type rhodopsins that function as light-gated cation channels. Understanding how the detailed architecture of the protein governs its dynamics and specificity for ions is important, because it has the potential to assist in designing site-directed channelrhodopsin mutants for specific neurobiology applications. Here we use bioinformatics methods to derive accurate alignments of channelrhodopsin sequences, assess the sequence conservation patterns and find conserved motifs in channelrhodopsins, and use homology modeling to construct three-dimensional structural models of channelrhodopsins. The analyses reveal that helices C and D of channelrhodopsins contain Cys, Ser, and Thr groups that can engage in both intra- and inter-helical hydrogen bonds. We propose that these polar groups participate in inter-helical hydrogen-bonding clusters important for the protein conformational dynamics and for the local water interactions. This article is part of a Special Issue entitled: Retinal Proteins - You can teach an old dog new tricks. PMID:24252597

  13. Virtual bioinformatics distance learning suite*.

    PubMed

    Tolvanen, Martti; Vihinen, Mauno

    2004-05-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material over the Internet. Currently, we provide two fully computer-based courses, "Introduction to Bioinformatics" and "Bioinformatics in Functional Genomics." Here we will discuss the application of distance learning in bioinformatics training and our experiences gained during the 3 years that we have run the courses, with about 400 students from a number of universities. The courses are available at bioinf.uta.fi.

  14. Optimal Multicomponent Analysis Using the Generalized Standard Addition Method.

    ERIC Educational Resources Information Center

    Raymond, Margaret; And Others

    1983-01-01

    Describes an experiment on the simultaneous determination of chromium and magnesium by spectrophotometry modified to include the Generalized Standard Addition Method computer program, a multivariate calibration method that provides optimal multicomponent analysis in the presence of interference and matrix effects. Provides instructions for…

  15. Suppression subtractive hybridization (SSH) combined with bioinformatics method: an integrated functional annotation approach for analysis of differentially expressed immune-genes in insects.

    PubMed

    Badapanda, Chandan

    2013-01-01

    The suppression subtractive hybridization (SSH) approach, a PCR-based approach that amplifies differentially expressed cDNAs (complementary DNAs) while simultaneously suppressing amplification of common cDNAs, was employed to identify immune-inducible genes in insects. This technique has been used as a suitable tool for experimental identification of novel genes in eukaryotes as well as prokaryotes, both in species whose genomes have been sequenced and in species whose genomes have yet to be sequenced. In this article, I propose a method for in silico functional characterization of immune-inducible genes from insects. Apart from immune-inducible genes from insects, this method can be applied to the analysis of genes from other species, from bacteria to plants and animals. This article provides background on the SSH-based method, taking specific examples from innate immune-inducible genes in insects, and subsequently proposes a bioinformatics pipeline for functional characterization of newly sequenced genes. The proposed workflow can also be applied to any newly sequenced species generated from Next Generation Sequencing (NGS) platforms.

  17. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5.

    PubMed

    Mefford, Megan E; Kunstman, Kevin; Wolinsky, Steven M; Gabuzda, Dana

    2015-07-01

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120-CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. PMID:25797607

  18. PhyloToAST: Bioinformatics tools for species-level analysis and visualization of complex microbial datasets.

    PubMed

    Dabdoub, Shareef M; Fellows, Megan L; Paropkari, Akshay D; Mason, Matthew R; Huja, Sarandeep S; Tsigarida, Alexandra A; Kumar, Purnima S

    2016-01-01

    The 16S rRNA gene is widely used for taxonomic profiling of microbial ecosystems; and recent advances in sequencing chemistry have allowed extremely large numbers of sequences to be generated from minimal amounts of biological samples. Analysis speed and resolution of data to species-level taxa are two important factors in large-scale explorations of complex microbiomes using 16S sequencing. We present here new software, Phylogenetic Tools for Analysis of Species-level Taxa (PhyloToAST), that completely integrates with the QIIME pipeline to improve analysis speed, reduce primer bias (requiring two sequencing primers), enhance species-level analysis, and add new visualization tools. The code is free and open source, and can be accessed at http://phylotoast.org. PMID:27357721

  19. PhyloToAST: Bioinformatics tools for species-level analysis and visualization of complex microbial datasets

    PubMed Central

    Dabdoub, Shareef M.; Fellows, Megan L.; Paropkari, Akshay D.; Mason, Matthew R.; Huja, Sarandeep S.; Tsigarida, Alexandra A.; Kumar, Purnima S.

    2016-01-01

    The 16S rRNA gene is widely used for taxonomic profiling of microbial ecosystems; and recent advances in sequencing chemistry have allowed extremely large numbers of sequences to be generated from minimal amounts of biological samples. Analysis speed and resolution of data to species-level taxa are two important factors in large-scale explorations of complex microbiomes using 16S sequencing. We present here new software, Phylogenetic Tools for Analysis of Species-level Taxa (PhyloToAST), that completely integrates with the QIIME pipeline to improve analysis speed, reduce primer bias (requiring two sequencing primers), enhance species-level analysis, and add new visualization tools. The code is free and open source, and can be accessed at http://phylotoast.org. PMID:27357721

  20. A global analysis of soil acidification caused by nitrogen addition

    NASA Astrophysics Data System (ADS)

    Tian, Dashuan; Niu, Shuli

    2015-02-01

    Nitrogen (N) deposition-induced soil acidification has become a global problem. However, the response patterns of soil acidification to N addition and the underlying mechanisms remain far from clear. Here, we conducted a meta-analysis of 106 studies to reveal global patterns of soil acidification in response to N addition. We found that N addition significantly reduced soil pH by 0.26 on average globally. However, the responses of soil pH varied with ecosystem type, N addition rate, N fertilization form, and experimental duration. Soil pH decreased most in grassland, whereas no decrease in soil pH in response to N addition was observed in boreal forest. Soil pH decreased linearly with N addition rates. Addition of urea and NH4NO3 contributed more to soil acidification than NH4-form fertilizer. When experimental duration was longer than 20 years, N addition effects on soil acidification diminished. Environmental factors such as initial soil pH, soil carbon and nitrogen content, precipitation, and temperature all influenced the responses of soil pH. Base cations (Ca2+, Mg2+ and K+) were critically important in buffering against N-induced soil acidification at the early stage. However, N addition has shifted global soils into the Al3+ buffering phase. Overall, this study indicates that acidification in global soils is very sensitive to N deposition, which is greatly modified by biotic and abiotic factors. Global soils are now at a buffering transition from base cations (Ca2+, Mg2+ and K+) to non-base cations (Mn2+ and Al3+). This calls attention to the limitation of base cations and the toxic impact of non-base cations for terrestrial ecosystems under N deposition.

  1. Systematic analysis of the association between gut flora and obesity through high-throughput sequencing and bioinformatics approaches.

    PubMed

    Chiu, Chih-Min; Huang, Wei-Chih; Weng, Shun-Long; Tseng, Han-Chi; Liang, Chao; Wang, Wei-Chi; Yang, Ting; Yang, Tzu-Ling; Weng, Chen-Tsung; Chang, Tzu-Hao; Huang, Hsien-Da

    2014-01-01

    Eighty-one stool samples from Taiwanese were collected for analysis of the association between the gut flora and obesity. The supervised analysis showed that the most abundant genera of bacteria in normal samples (from people with a body mass index (BMI) ≤ 24) were Bacteroides (27.7%), Prevotella (19.4%), Escherichia (12%), Phascolarctobacterium (3.9%), and Eubacterium (3.5%). The most abundant genera of bacteria in case samples (with a BMI ≥ 27) were Bacteroides (29%), Prevotella (21%), Escherichia (7.4%), Megamonas (5.1%), and Phascolarctobacterium (3.8%). A principal coordinate analysis (PCoA) demonstrated that normal samples were clustered more compactly than case samples. An unsupervised analysis demonstrated that bacterial communities in the gut were clustered into two main groups: N-like and OB-like groups. Remarkably, most normal samples (78%) were clustered in the N-like group, and most case samples (81%) were clustered in the OB-like group (Fisher's P value = 1.61E-07). The results showed that bacterial communities in the gut were highly associated with obesity. This is the first study in Taiwan to investigate the association between human gut flora and obesity, and the results provide new insights into the correlation of bacteria with the rising trend in obesity.

  2. Agile parallel bioinformatics workflow management using Pwrake

    PubMed Central

    2011-01-01

    Background: In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings: We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions: Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability…

  3. [Kinetic analysis of additive effect on desulfurization activity].

    PubMed

    Han, Kui-hua; Zhao, Jian-li; Lu, Chun-mei; Wang, Yong-zheng; Zhao, Gai-ju; Cheng, Shi-qing

    2006-02-01

    The additive effects of Al2O3, Fe2O3 and MnCO3 on CaO sulfation kinetics were investigated by thermogravimetric analysis and a modified grain model. The activation energy (Ea) and pre-exponential factor (k0) of the surface reaction, and the activation energy (Ep) and pre-exponential factor (D0) of product layer diffusion, were calculated according to the model. Addition of MnCO3 can enhance the initial reaction rate, product layer diffusion and the final CaO conversion of sorbents, by a mechanism similar to that of Fe2O3. The method based on isokinetic temperature Ts and activation energy cannot estimate the contribution of an additive to sulfation reactivity; the rate constant of the surface reaction (k) and the effective diffusivity of reactant in the product layer (Ds) under given experimental conditions can reflect the effect of additives on the activation. Non-stoichiometric metal oxides may catalyze the surface reaction and promote the diffusivity of reactant in the product layer through crystal defects and the distinct diffusion of cations and anions. According to the mechanism and effect of additives on sulfation, the effective temperature and the stoichiometric relation of the reaction, it is possible to improve sorbent utilization by compounding more additives into the calcium-based sorbent.
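
    The abstract does not state the functional form linking these parameters. Assuming the conventional Arrhenius dependence on temperature T (with R the gas constant), the surface rate constant k and the product layer diffusivity Ds would relate to the reported quantities as:

```latex
k = k_0 \exp\!\left(-\frac{E_a}{R\,T}\right),
\qquad
D_s = D_0 \exp\!\left(-\frac{E_p}{R\,T}\right)
```

    Under that assumption, an additive can change k or Ds at a given temperature through either the pre-exponential factor or the activation energy, which is consistent with the abstract's point that activation energy alone does not capture an additive's contribution.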

  4. Evaluation of Simultaneous Nutrient and COD Removal with Polyhydroxybutyrate (PHB) Accumulation Using Mixed Microbial Consortia under Anoxic Condition and Their Bioinformatics Analysis

    PubMed Central

    Jena, Jyotsnarani; Kumar, Ravindra; Dixit, Anshuman; Pandey, Sony; Das, Trupti

    2015-01-01

    Simultaneous nitrate-N, phosphate and COD removal from synthetic wastewater was evaluated using mixed microbial consortia in an anoxic environment under various initial carbon loads (ICL) in a batch-scale reactor system. Within 6 hours of incubation, enriched DNPAOs (Denitrifying Polyphosphate Accumulating Microorganisms) were able to remove maximum COD (87%) at 2 g/L of ICL, whereas maximum nitrate-N (97%) and phosphate (87%) removal along with PHB accumulation (49 mg/L) was achieved at 8 g/L of ICL. Exhaustion of nitrate-N beyond 6 hours of incubation had a detrimental effect on the COD and phosphate removal rates. A fresh supply of nitrate-N to the reaction medium beyond 6 hours helped revive the removal rates of both COD and phosphate. Therefore, it was apparent that, in spite of a high carbon load, maximum COD and nutrient removal can be maintained with adequate nitrate-N availability. Denitrifying conditions in the medium were evident from an increasing pH trend. PHB accumulation by the mixed culture was directly proportional to ICL; however, accumulation took longer at higher ICL. Unlike conventional EBPR, PHB depletion did not support phosphate accumulation in this case. The unique aspect of all the batch studies was that PHB accumulation was observed along with phosphate uptake and nitrate reduction under anoxic conditions. Bioinformatics analysis followed by pyrosequencing of the mixed culture DNA from the seed sludge revealed the dominance of denitrifying populations such as Corynebacterium, Rhodocyclus and Paracoccus (Alphaproteobacteria and Betaproteobacteria). The rarefaction curve indicated complete bacterial population and corresponding number of OTUs through sequence analysis. The Chao1 and Shannon (H’) indices were used to study the diversity of sampling. “UCI95” and “LCI95” indicated the upper and lower 95% confidence limits of Chao1 for each distance. Values of the Chao1 index supported the results of the rarefaction curve. PMID:25689047
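
    The two diversity measures named above can be sketched briefly. The following assumes the classic Chao1 richness estimator (with the bias-corrected form when no doubletons are present) and the Shannon index computed with the natural logarithm; the abstract does not say which variants or software the authors used, and the OTU counts below are hypothetical.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(otu_counts)
    return -sum((n / total) * math.log(n / total) for n in otu_counts if n > 0)


def chao1(otu_counts):
    """Chao1 richness: observed OTUs plus a correction from rare OTUs.

    F1 = number of singletons (OTUs seen once),
    F2 = number of doubletons (OTUs seen twice).
    """
    observed = sum(1 for n in otu_counts if n > 0)
    f1 = sum(1 for n in otu_counts if n == 1)
    f2 = sum(1 for n in otu_counts if n == 2)
    if f2 > 0:
        return observed + f1 ** 2 / (2 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return observed + f1 * (f1 - 1) / 2


# Hypothetical read counts per OTU for one sample.
counts = [10, 5, 2, 2, 1, 1, 1, 1]
richness = chao1(counts)        # 8 observed + 4**2 / (2 * 2) = 12.0
diversity = shannon_index(counts)
```

    A Chao1 value close to the observed OTU count suggests that few rare taxa remain unsampled, which is the sense in which it can corroborate a plateauing rarefaction curve.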

  5. The haloarchaeal MCM proteins: bioinformatic analysis and targeted mutagenesis of the β7-β8 and β9-β10 hairpin loops and conserved zinc binding domain cysteines

    PubMed Central

    Kristensen, Tatjana P.; Maria Cherian, Reeja; Gray, Fiona C.; MacNeill, Stuart A.

    2014-01-01

    The hexameric MCM complex is the catalytic core of the replicative helicase in eukaryotic and archaeal cells. Here we describe the first in vivo analysis of archaeal MCM protein structure and function relationships using the genetically tractable haloarchaeon Haloferax volcanii as a model system. Hfx. volcanii encodes a single MCM protein that is part of the previously identified core group of haloarchaeal MCM proteins. Three structural features of the N-terminal domain of the Hfx. volcanii MCM protein were targeted for mutagenesis: the β7-β8 and β9-β10 β-hairpin loops and putative zinc binding domain. Five strains carrying single point mutations in the β7-β8 β-hairpin loop were constructed, none of which displayed impaired cell growth under normal conditions or when treated with the DNA damaging agent mitomycin C. However, short sequence deletions within the β7-β8 β-hairpin were not tolerated and neither was replacement of the highly conserved residue glutamate 187 with alanine. Six strains carrying paired alanine substitutions within the β9-β10 β-hairpin loop were constructed, leading to the conclusion that no individual amino acid within that hairpin loop is absolutely required for MCM function, although one of the mutant strains displays greatly enhanced sensitivity to mitomycin C. Deletions of two or four amino acids from the β9-β10 β-hairpin were tolerated but mutants carrying larger deletions were inviable. Similarly, it was not possible to construct mutants in which any of the conserved zinc binding cysteines was replaced with alanine, underlining the likely importance of zinc binding for MCM function. The results of these studies demonstrate the feasibility of using Hfx. volcanii as a model system for reverse genetic analysis of archaeal MCM protein function and provide important confirmation of the in vivo importance of conserved structural features identified by previous bioinformatic, biochemical and structural studies. PMID:24723920

  6. Bioinformatics Analysis Reveals Abundant Short Alpha-Helices as a Common Structural Feature of Oomycete RxLR Effector Proteins

    PubMed Central

    Ye, Wenwu; Wang, Yang; Wang, Yuanchao

    2015-01-01

    RxLR effectors represent one of the largest and most diverse effector families in oomycete plant pathogens. These effectors have attracted enormous attention since they can be delivered inside the plant cell and manipulate host immunity. With the exceptions of a signal peptide and the following RxLR-dEER and C-terminal W/Y/L motifs identified from the sequences themselves, nearly no functional domains have been found. Recently, protein structures of several RxLRs were revealed to comprise alpha-helical bundle repeats. However, approximately half of all RxLRs lack obvious W/Y/L motifs, which are associated with helical structures. In this study, secondary structure prediction of the putative RxLR proteins was performed. We found that the C-terminus of the majority of these RxLR proteins, irrespective of the presence of W/Y/L motifs, contains abundant short alpha-helices. Since large-scale experimental determination of protein structures has been difficult to date, the results of the current study extend our understanding of the secondary structures of oomycete RxLR effectors from individual members to the entire family. Moreover, we identified fewer alpha-helix-rich proteins in the secretomes of several oomycete and fungal organisms in which RxLRs have not been identified, providing additional evidence that these organisms are unlikely to harbor RxLR-like proteins. Therefore, these results provide additional information that will aid further studies on the evolution and functional mechanisms of RxLR effectors. PMID:26252511

  7. The Bioinformatics Analysis of miRNAs Signatures Differentially Expressed in HER2(+) Versus HER2(−) Breast Cancers

    PubMed Central

    Nie, Weiwei; Jin, Lei; Wang, Yanru; Wang, Zexing

    2013-01-01

    Objective: To identify the signatures of miRNAs differentially expressed in HER2(+) versus HER2(−) breast cancers that accurately predict the HER2 status of breast cancer, and to provide further insight into breast cancer therapy. Methods: Aberrantly expressed miRNAs were collected by literature search. Using the target prediction algorithms TargetScan and PicTar and data enrichment analysis, target gene sets of miRNAs differentially expressed in HER2(+) versus HER2(−) breast cancers were built. Then, using the Database for Annotation, Visualization, and Integrated Discovery (DAVID), the function modules of Gene Ontology categories, and Kyoto Encyclopedia of Genes and Genomes (KEGG) and BIOCARTA pathways, the biological functions and signaling pathways probably regulated by these miRNAs were analyzed. Results: We finally obtained five sets of miRNAs expressed in breast cancers of different HER2 status. The five sets contain 22, 32, 3, 38, and 62 miRNAs, respectively. After miRNA target prediction and data enrichment, 5,734; 22,409; 1,142; 22,293; and 43,460 target genes of the five miRNA sets were collected. Gene Ontology analysis found that these genes may be involved in transcription, protein transport, angiogenesis, and apoptosis. Moreover, certain KEGG and BIOCARTA signaling pathways related to HER2 status were found. Conclusion: Using TargetScan and PicTar for data enrichment, and the DAVID database, Gene Ontology categories, and KEGG and BIOCARTA pathways for analysis of differential miRNA expression, we developed a new method for biological interpretation of miRNA profiling data in HER2(+) versus HER2(−) breast cancers. It may improve understanding of the regulatory roles of miRNAs in different molecular subtypes of breast cancer. Therefore, it may help improve the accuracy of experimental efforts on breast cancer and potential therapeutic targets. PMID:23009584

  8. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    ERIC Educational Resources Information Center

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  9. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    PubMed

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  10. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    PubMed Central

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  11. Genome-wide identification and evolutionary analysis of algal LPAT genes involved in TAG biosynthesis using bioinformatic approaches.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar

    2014-12-01

    Lysophosphatidyl acyltransferase (LPAT) is one of the major triacylglycerol synthesis enzymes, controlling the metabolic flow of lysophosphatidic acid to phosphatidic acid. Experimental studies in Arabidopsis have shown that LPAT activity is exhibited primarily by three distinct isoforms, namely the plastid-located LPAT1, the endoplasmic reticulum-located LPAT2, and the soluble isoform of LPAT (solLPAT). In this study, 24 putative genes representing all LPAT isoforms were identified from the analysis of 11 complete genomes including green algae, red algae, diatoms and higher plants. We observed LPAT1 and solLPAT genes to be ubiquitously present in nearly all genomes examined, whereas LPAT2 genes to have evolved more recently in the plant lineage. Phylogenetic analysis indicated that LPAT1, LPAT2 and solLPAT have convergently evolved through separate evolutionary paths and belong to three different gene families, which was further evidenced by their wide divergence at gene structure and sequence level. The genome distribution supports the hypothesis that each gene encoding a LPAT is not duplicated. Mapping of exon-intron structure of LPAT genes to the domain structure of proteins across different algal and plant species indicates that exon shuffling plays no role in the evolution of LPAT genes. Besides the previously defined motifs, several conserved consensus sequences were discovered which could be useful to distinguish different LPAT isoforms. Taken together, this study will enable the generation of experimental approximations to better understand the functional role of algal LPAT in lipid accumulation.

  12. Bioinformatics analysis of thousands of TCGA tumors to determine the involvement of epigenetic regulators in human cancer

    PubMed Central

    2015-01-01

    Background: Many cancer cells show distorted epigenetic landscapes. The Cancer Genome Atlas (TCGA) project profiles thousands of tumors, allowing the discovery of somatic alterations in the epigenetic machinery and the identification of potential cancer drivers among members of epigenetic protein families. Methods: We integrated mutation, expression, and copy number data from 5943 tumors from 13 cancer types to train a classification model that predicts the likelihood of being an oncogene (OG), tumor suppressor (TSG) or neutral gene (NG). We applied this predictor to epigenetic regulator genes (ERGs), and used differential expression and correlation network analysis to identify dysregulated ERGs along with co-expressed cancer genes. Furthermore, we quantified global proteomic changes by mass spectrometry after EZH2 inhibition. Results: Mutation-based classifiers uncovered the OG-like profile of DNMT3A and TSG-like profiles for several ERGs. Differential gene expression and correlation network analyses revealed that EZH2 is the most significantly over-expressed ERG in cancer and is co-regulated with a cell cycle network. Proteomic analysis showed that EZH2 inhibition induced down-regulation of cell cycle regulators in lymphoma cells. Conclusions: Using classical driver genes to train an OG/TSG predictor, we determined the most predictive features at the gene level. Our predictor uncovered one OG and several TSGs among ERGs. Expression analyses elucidated multiple dysregulated ERGs including EZH2 as a member of a co-expressed cell cycle network. PMID:26110843

  13. RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database.

    PubMed

    Field, Helen I; Fenyö, David; Beavis, Ronald C

    2002-01-01

    RADARS, a rapid, automated, data archiving and retrieval software system for high-throughput proteomic mass spectral data processing and storage, is described. The majority of mass spectrometer data files are compatible with RADARS, for consistent processing. The system automatically takes unprocessed data files, identifies proteins via in silico database searching, then stores the processed data and search results in a relational database suitable for customized reporting. The system is robust, used in 24/7 operation, accessible to multiple users of an intranet through a web browser, may be monitored by Virtual Private Network, and is secure. RADARS is scalable for use on one or many computers, and is suited to multiple processor systems. It can incorporate any local database in FASTA format, and can search protein and DNA databases online. A key feature is a suite of visualisation tools (many available gratis), allowing facile manipulation of spectra, by hand annotation, reanalysis, and access to all procedures. We also describe the use of Sonar MS/MS, a novel, rapid search engine requiring 40 MB RAM per process for searches against a genomic or EST database translated in all six reading frames. RADARS reduces the cost of analysis by its efficient algorithms: Sonar MS/MS can identify proteins without accurate knowledge of the parent ion mass and without protein tags. Statistical scoring methods provide close-to-expert accuracy and bring robust data analysis to the non-expert user.
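
    The abstract above mentions searching a genomic or EST database "translated in all six reading frames". As an illustration of that general idea only, the following minimal Python sketch translates a DNA string in the three forward and three reverse-complement frames. It is not the RADARS or Sonar MS/MS implementation; the function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code (assumed), laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Hypothetical helper: map frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

    A search engine would then match spectra against peptides derived from each of the six translated strings; that matching step is outside the scope of this sketch.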

  14. MicroRNAs and cardiac sarcoplasmic reticulum calcium ATPase-2 in human myocardial infarction: expression and bioinformatic analysis

    PubMed Central

    2012-01-01

    Background: Cardiac sarco(endo)plasmic reticulum calcium ATPase-2 (SERCA2) plays one of the central roles in myocardial contractility. Both SERCA2 mRNA and protein are reduced in myocardial infarction (MI), but the correlation has not always been observed. MicroRNAs (miRNAs) act by targeting 3'-UTR mRNA, causing translational repression in physiological and pathological conditions, including cardiovascular diseases. One of the aims of our study was to identify miRNAs that could influence SERCA2 expression in human MI. Results: The protein SERCA2 was decreased and 43 miRNAs were deregulated in infarcted myocardium compared to corresponding remote myocardium, analyzed by western blot and microRNA microarrays, respectively. All the samples were stored as FFPE tissue and in RNAlater. miRNA binding prediction for SERCA2 using four prediction algorithms (TargetScan, PicTar, miRanda and mirTarget2) identified 213 putative miRNAs. TAM and miRNApath annotation of deregulated miRNAs identified 18 functional and 21 diseased states related to heart diseases, and association of half of the deregulated miRNAs to SERCA2. Free-energy of binding and flanking regions (RNA22, RNAfold) was calculated for 10 up-regulated miRNAs from microarray analysis (miR-122, miR-320a/b/c/d, miR-574-3p/-5p, miR-199a, miR-140, and miR-483), and nine miRNAs deregulated from microarray analysis were used for validation with qPCR (miR-21, miR-122, miR-126, miR-1, miR-133, miR-125a/b, and miR-98). Based on qPCR results, comparisons between FFPE and RNAlater stored tissue samples, between Sybr Green and TaqMan approaches, as well as between different reference genes were also performed. Conclusion: Combining all the results, we identified certain miRNAs as potential regulators of SERCA2; however, further functional studies are needed for verification. Using qPCR, we confirmed deregulation of nine miRNAs in human MI, and show that qPCR normalization strategy is important for the outcome of mi

  15. [Bioinformatics in Cancer Clinical Sequencing -- An Emerging Field of Cancer Personalized Medicine].

    PubMed

    Kato, Mamoru

    2016-04-01

    Thus far, bioinformatics has mostly been applied in basic science research. It was initially used to analyze protein sequences in unicellular organisms, aiding discoveries in basic biology. Following the completion of human genome sequencing, it has also facilitated numerous discoveries in basic medicine. Recently, several clinical applications of bioinformatics have been reported. Most relevantly, bioinformatics has been applied to clinical sequencing - an emerging field of personalized medicine, or precision medicine. In this review, I will introduce basic techniques of bioinformatics used in clinical sequencing, avoiding excessive technical details. I will also discuss future directions for data analysis using bioinformatics in the field of personalized medicine.

  16. Bioinformatics analysis of hepatitis C virus genotype 2a-induced human hepatocellular carcinoma in Huh7 cells

    PubMed Central

    Xu, Ping; Wu, Meiying; Chen, Hui; Xu, Junchi; Wu, Minjuan; Li, Ming; Qian, Feng; Xu, Junhua

    2016-01-01

    Hepatocellular carcinoma (HCC) is a liver cancer that could be induced by hepatitis C virus genotype 2a Japanese fulminant hepatitis-1 (JFH-1) strain. The aim of this study was to investigate the molecular mechanisms of HCC. The microarray data GSE20948 includes 14 JFH-1- and 14 mock (equal volume of medium [control])-infected Huh7 samples. The data were downloaded from the Gene Expression Omnibus. After data processing, soft cluster analyses were performed to identify co-regulated genes with similar temporal expression patterns. Functional and pathway enrichment analyses, as well as functional annotation analysis, were performed. Subsequently, combined networks of protein–protein interaction network, microRNA regulatory network, and transcriptional regulatory network were constructed. Hub nodes, modules, and five clusters of co-regulated genes were also identified. In total, 173 up and 207 down co-regulated genes were separately identified in JFH-1-infected Huh7 cells compared with those of control cells. Functional enrichment analysis indicated that up co-regulated genes were related to skeletal system morphogenesis and neuron differentiation and down co-regulated genes were related to steroid/cholesterol/sterol metabolisms. Hub genes (such as IRF1, GBP1, ICAM1, Foxa1, DHCR7, HMGCS2, and MSMO1) were identified. Transcription factors IRF1 and Foxa1 were the targets of miR-130a, miR-17-5p, and miR-20a. PPARGC1A was targeted by miR-29 family, and MSMO1 was the target of miR-23 family. Hub nodes (such as IRF1, GBP1, ICAM1, Foxa1, DHCR7, HMGCS2, and MSMO1) and microRNAs might be used as candidate biomarkers of JFH-1-infected HCC. PMID:26811688

  17. Bioinformatics analysis of hepatitis C virus genotype 2a-induced human hepatocellular carcinoma in Huh7 cells.

    PubMed

    Xu, Ping; Wu, Meiying; Chen, Hui; Xu, Junchi; Wu, Minjuan; Li, Ming; Qian, Feng; Xu, Junhua

    2016-01-01

    Hepatocellular carcinoma (HCC) is a liver cancer that could be induced by hepatitis C virus genotype 2a Japanese fulminant hepatitis-1 (JFH-1) strain. The aim of this study was to investigate the molecular mechanisms of HCC. The microarray data GSE20948 includes 14 JFH-1- and 14 mock (equal volume of medium [control])-infected Huh7 samples. The data were downloaded from the Gene Expression Omnibus. After data processing, soft cluster analyses were performed to identify co-regulated genes with similar temporal expression patterns. Functional and pathway enrichment analyses, as well as functional annotation analysis, were performed. Subsequently, combined networks of protein-protein interaction network, microRNA regulatory network, and transcriptional regulatory network were constructed. Hub nodes, modules, and five clusters of co-regulated genes were also identified. In total, 173 up and 207 down co-regulated genes were separately identified in JFH-1-infected Huh7 cells compared with those of control cells. Functional enrichment analysis indicated that up co-regulated genes were related to skeletal system morphogenesis and neuron differentiation and down co-regulated genes were related to steroid/cholesterol/sterol metabolisms. Hub genes (such as IRF1, GBP1, ICAM1, Foxa1, DHCR7, HMGCS2, and MSMO1) were identified. Transcription factors IRF1 and Foxa1 were the targets of miR-130a, miR-17-5p, and miR-20a. PPARGC1A was targeted by miR-29 family, and MSMO1 was the target of miR-23 family. Hub nodes (such as IRF1, GBP1, ICAM1, Foxa1, DHCR7, HMGCS2, and MSMO1) and microRNAs might be used as candidate biomarkers of JFH-1-infected HCC. PMID:26811688

  18. Bioinformatics analysis of time-series genes profiling to explore key genes affected by age in fracture healing.

    PubMed

    Wang, Wei; Shen, Hao; Xie, Jingjing; Zhou, Qiang; Chen, Yu; Lu, Hua

    2014-06-01

    The present study aimed to explore possible key genes and bioprocesses affected by age during fracture healing. GSE589, GSE592 and GSE1371 were downloaded from the Gene Expression Omnibus database. Time-series genes in rats of three age levels were first identified with the hclust function in R. Then functional and pathway enrichment analyses for the selected time-series genes were performed. Finally, the VennDiagram package of the R language was used to screen overlapping time-series genes. The expression changes of time-series genes in the rats of three age levels were classified into two types: one was more highly expressed at day 0, decreased from day 3 to week 2, and increased from week 4 to week 6; the other was the opposite. Functional and pathway enrichment analysis showed that 12 time-series genes of adult and old rats were significantly involved in the ECM-receptor interaction pathway. The expression changes of 11 genes were consistent with the time axis: 10 genes were up-regulated at 3 days after fracture and increased slowly up to week 6, while Itga2b was down-regulated. The functions of 106 overlapping genes were all associated with growth and development of bone after fracture. The key genes in the ECM-receptor interaction pathway, including Spp1, Ibsp, Tnn and Col3a1, have been reported in the literature to be related to fracture. The difference during fracture healing in rats of three age levels is mainly related to age. Spp1, Ibsp, Tnn and Col3a1 are potential age-related genes, and the ECM-receptor interaction pathway is the potential age-related process during fracture healing. PMID:24627361

  19. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  20. An active registry for bioinformatics web services

    PubMed Central

    Pettifer, S.; Thorne, D.; McDermott, P.; Attwood, T.; Baran, J.; Bryne, J. C.; Hupponen, T.; Mowbray, D.; Vriend, G.

    2009-01-01

    Summary: The EMBRACE Registry is a web portal that collects and monitors web services according to test scripts provided by their administrators. Users are able to search for, rank and annotate services, enabling them to select the most appropriate working service for inclusion in their bioinformatics analysis tasks. Availability and implementation: Web site implemented with PHP, Python, MySQL and Apache, with all major browsers supported. (www.embraceregistry.net) Contact: steve.pettifer@manchester.ac.uk PMID:19460889

  1. ANALYSIS OF MPC ACCESS REQUIREMENTS FOR ADDITION OF FILLER MATERIALS

    SciTech Connect

    W. Wallin

    1996-09-03

    This analysis is prepared by the Mined Geologic Disposal System (MGDS) Waste Package Development Department (WPDD) in response to a request received via a QAP-3-12 Design Input Data Request (Ref. 5.1) from WAST Design (formerly MRSMPC Design). The request is to provide: Specific MPC access requirements for the addition of filler materials at the MGDS (i.e., location and size of access required). The objective of this analysis is to provide a response to the foregoing request. The purpose of this analysis is to provide a documented record of the basis for the response. The response is stated in Section 8 herein. The response is based upon requirements from an MGDS perspective.

  2. Screening feature genes of astrocytoma using a combined method of microarray gene expression profiling and bioinformatics analysis

    PubMed Central

    Cai, Yong; Zhong, Xingming; Wang, Yiqi; Yang, Jianguo

    2015-01-01

    The aim of our study was to find feature genes associated with astrocytoma and correlative gene functions which can distinguish cancer tissue from adjacent non-tumor astrocyte tissues. Gene expression profile GSE15824, which included 8 astrocytoma tissues and 3 adjacent non-tumor astrocyte samples, was downloaded from the Gene Expression Omnibus database. The raw data were first transformed into probe-level data, and the differentially expressed genes (DEGs) between tissues of patients with astrocytoma and normal specimens were identified using the T-test in the samr package of R. The Database for Annotation, Visualization and Integrated Discovery (DAVID) was applied to analyze the gene ontology (GO) enrichment on gene functions and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Finally, corresponding protein-protein interaction (PPI) networks of DEGs were constructed using Cytoscape based on the data collected from STRING online datasets. A total of 3072 genes, including 1799 up-regulated genes and 1273 down-regulated genes, were filtered as DEGs, and we found that the DEGs included AQP4, PMP2, SRARCL1 and SLC1A2CAMs, among others, and that AQP4 was most significantly related to cell osmotic pressure. Three feature genes in the KEGG pathway are highly enriched in cancer specimens, while two genes are enriched in the normal tissues. The discovery of feature genes significantly related to the regulation of cell osmotic pressure has potential for clinical use in the diagnosis of astrocytoma in the future. In addition, it is of great significance for studying mechanisms, distinguishing normal and cancer tissues, and exploring new treatments for astrocytoma. However, further experiments are needed to confirm our result. PMID:26770395

  3. Screening feature genes of astrocytoma using a combined method of microarray gene expression profiling and bioinformatics analysis.

    PubMed

    Cai, Yong; Zhong, Xingming; Wang, Yiqi; Yang, Jianguo

    2015-01-01

    The aim of our study was to find feature genes associated with astrocytoma and correlative gene functions which can distinguish cancer tissue from adjacent non-tumor astrocyte tissues. Gene expression profile GSE15824, which included 8 astrocytoma tissues and 3 adjacent non-tumor astrocyte samples, was downloaded from the Gene Expression Omnibus database. The raw data were first transformed into probe-level data, and the differentially expressed genes (DEGs) between tissues of patients with astrocytoma and normal specimens were identified using the T-test in the samr package of R. The Database for Annotation, Visualization and Integrated Discovery (DAVID) was applied to analyze the gene ontology (GO) enrichment on gene functions and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Finally, corresponding protein-protein interaction (PPI) networks of DEGs were constructed using Cytoscape based on the data collected from STRING online datasets. A total of 3072 genes, including 1799 up-regulated genes and 1273 down-regulated genes, were filtered as DEGs, and we found that the DEGs included AQP4, PMP2, SRARCL1 and SLC1A2CAMs, among others, and that AQP4 was most significantly related to cell osmotic pressure. Three feature genes in the KEGG pathway are highly enriched in cancer specimens, while two genes are enriched in the normal tissues. The discovery of feature genes significantly related to the regulation of cell osmotic pressure has potential for clinical use in the diagnosis of astrocytoma in the future. In addition, it is of great significance for studying mechanisms, distinguishing normal and cancer tissues, and exploring new treatments for astrocytoma. However, further experiments are needed to confirm our result. PMID:26770395

  4. Identification of microRNAs in the Toxigenic Dinoflagellate Alexandrium catenella by High-Throughput Illumina Sequencing and Bioinformatic Analysis

    PubMed Central

    Geng, Huili; Sui, Zhenghong; Zhang, Shu; Du, Qingwei; Ren, Yuanyuan; Liu, Yuan; Kong, Fanna; Zhong, Jie; Ma, Qingxia

    2015-01-01

    Micro-ribonucleic acids (miRNAs) are a large group of endogenous, tiny, non-coding RNAs consisting of 19–25 nucleotides that regulate gene expression at either the transcriptional or post-transcriptional level by mediating gene silencing in eukaryotes. They are considered to be important regulators that affect growth, development, and response to various stresses in plants. Alexandrium catenella is an important marine toxic phytoplankton species that can cause harmful algal blooms (HABs). To date, identification and function analysis of miRNAs in A. catenella remain largely unexamined. In this study, high-throughput sequencing was performed on A. catenella to identify and quantitatively profile the repertoire of small RNAs from two different growth phases. A total of 38,092,056 and 32,969,156 raw reads were obtained from the two small RNA libraries, respectively. In total, 88 mature miRNAs belonging to 32 miRNA families were identified. Significant differences were found in the member number, expression level of various families, and expression abundance of each member within a family. A total of 15 potentially novel miRNAs were identified. Comparative profiling showed that 12 known miRNAs exhibited differential expression between the lag phase and the logarithmic phase. Real-time quantitative RT-PCR (qPCR) was performed to confirm the expression of two differentially expressed miRNAs that were one up-regulated novel miRNA (aca-miR-3p-456915), and one down-regulated conserved miRNA (tae-miR159a). The expression trend of the qPCR assay was generally consistent with the deep sequencing result. Target predictions of the 12 differentially expressed miRNAs resulted in 1813 target genes. Gene ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG) annotations revealed that some miRNAs were associated with growth and developmental processes of the alga. These results provide insights into the roles that miRNAs play in the growth of

  5. Identification of microRNAs in the Toxigenic Dinoflagellate Alexandrium catenella by High-Throughput Illumina Sequencing and Bioinformatic Analysis.

    PubMed

    Geng, Huili; Sui, Zhenghong; Zhang, Shu; Du, Qingwei; Ren, Yuanyuan; Liu, Yuan; Kong, Fanna; Zhong, Jie; Ma, Qingxia

    2015-01-01

    Micro-ribonucleic acids (miRNAs) are a large group of endogenous, tiny, non-coding RNAs consisting of 19-25 nucleotides that regulate gene expression at either the transcriptional or post-transcriptional level by mediating gene silencing in eukaryotes. They are considered to be important regulators that affect growth, development, and response to various stresses in plants. Alexandrium catenella is an important marine toxic phytoplankton species that can cause harmful algal blooms (HABs). To date, identification and function analysis of miRNAs in A. catenella remain largely unexamined. In this study, high-throughput sequencing was performed on A. catenella to identify and quantitatively profile the repertoire of small RNAs from two different growth phases. A total of 38,092,056 and 32,969,156 raw reads were obtained from the two small RNA libraries, respectively. In total, 88 mature miRNAs belonging to 32 miRNA families were identified. Significant differences were found in the member number, expression level of various families, and expression abundance of each member within a family. A total of 15 potentially novel miRNAs were identified. Comparative profiling showed that 12 known miRNAs exhibited differential expression between the lag phase and the logarithmic phase. Real-time quantitative RT-PCR (qPCR) was performed to confirm the expression of two differentially expressed miRNAs that were one up-regulated novel miRNA (aca-miR-3p-456915), and one down-regulated conserved miRNA (tae-miR159a). The expression trend of the qPCR assay was generally consistent with the deep sequencing result. Target predictions of the 12 differentially expressed miRNAs resulted in 1813 target genes. Gene ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG) annotations revealed that some miRNAs were associated with growth and developmental processes of the alga. These results provide insights into the roles that miRNAs play in the growth of A

  6. Expressional and Bioinformatic Analysis of Bovine Filia/Ecat1/Khdc3l Gene: A Comparison with Ovine Species.

    PubMed

    Zahmatkesh, Azadeh; Ansari Mahyari, Saeid; Daliri Joupari, Morteza; Rahmani, Hamidreza; Shirazi, Abolfazl; Amiri Roudbar, Mahmood; Ansari Majd, Saeid

    2016-01-01

    Maternal effect genes have highly impressive effects on pre-implantation development. Filia/Ecat1/Khdc3l is a maternal effect gene found in mouse oocytes and embryos, loss of which causes a 50% decrease in fertility. In the present study, we investigated Filia mRNA expression in bovine oviduct, 30- to 40-day fetus, liver, heart, lung, and oocytes (as a positive control), by RT-PCR and detected it only in oocytes. A 443 bp fragment was amplified only in oocytes and was sequenced as a part of bovine predicted Filia mRNA. We analyzed bovine and ovine Filia N-terminal peptide sequence in PHYRE2, and a KH domain was predicted. Protein alignment using ClustalW indicated a highly identical N-terminal extension between the 2 species. Immunohistochemical analysis using anti-bovine Filia antibody showed the expression of Filia protein in the zone surrounding the nuclear membrane, and in the subcortex of ovine oocytes of primary and antral follicles. However, in the bovine, Filia has been found throughout the oocyte cytoplasm of antral follicles, and here it is further confirmed in the primary follicles. Our data suggest a difference in Filia expression pattern between cow and sheep, although the sequence is highly conserved. PMID:27070240

  7. Integrative bioinformatics analysis of genomic and proteomic approaches to understand the transcriptional regulatory program in coronary artery disease pathways.

    PubMed

    Vangala, Rajani Kanth; Ravindran, Vandana; Ghatge, Madan; Shanker, Jayashree; Arvind, Prathima; Bindu, Hima; Shekar, Meghala; Rao, Veena S

    2013-01-01

    Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. Few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and modulation of the pathways is not yet addressed in its entirety. Our present work focuses on understanding the regulatory mechanisms at the transcriptional level by identifying the core and specific transcription factors that regulate the coronary artery disease associated pathways. Using the principles of systems biology, we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease, assayed 24 biomarkers along with gene expression studies, and built network modules which are highly regulated by 5 core regulators: PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise biomarkers from different pathways, showing that the core regulatory transcription factors may work together in differential regulation of several pathways potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies for developing multimarker module based risk predictions.

  8. Identification of a novel carbohydrate esterase from Bjerkandera adusta: structural and function predictions through bioinformatics analysis and molecular modeling.

    PubMed

    Cuervo-Soto, Laura I; Valdés-García, Gilberto; Batista-García, Ramón; del Rayo Sánchez-Carbente, María; Balcázar-López, Edgar; Lira-Ruan, Verónica; Pastor, Nina; Folch-Mallol, Jorge Luis

    2015-03-01

    A new gene from Bjerkandera adusta strain UAMH 8258 encoding a carbohydrate esterase (designated as BaCesI) was isolated and expressed in Pichia pastoris. The gene had an open reading frame of 1410 bp encoding a polypeptide of 470 amino acid residues, the first 18 serving as a secretion signal peptide. Homology and phylogenetic analyses showed that BaCesI belongs to carbohydrate esterases family 4. Three-dimensional modeling of the protein and normal mode analysis revealed a breathing mode of the active site that could be relevant for esterase activity. Furthermore, the overall negative electrostatic potential of this enzyme suggests that it degrades neutral substrates and will not act on negative substrates such as peptidoglycan or p-nitrophenol derivatives. The enzyme shows a specific activity of 1.118 U mg(-1) protein on 2-naphthyl acetate. No activity was detected on p-nitrophenol derivatives as proposed from the electrostatic potential data. The deacetylation activity of the recombinant BaCesI was confirmed by measuring the release of acetic acid from several substrates, including oat xylan, shrimp shell chitin, N-acetylglucosamine, and natural substrates such as sugar cane bagasse and grass. This makes the protein very interesting for the biofuels production industry from lignocellulosic materials and for the production of chitosan from chitin.

  9. Bioinformatics and Molecular Analysis of the Evolutionary Relationship between Bovine Rhinitis A Viruses and Foot-And-Mouth Disease Virus

    PubMed Central

    Rai, Devendra K.; Lawrence, Paul; Pauszek, Steve J.; Piccone, Maria E.; Knowles, Nick J.; Rieder, Elizabeth

    2015-01-01

    Bovine rhinitis viruses (BRVs) cause mild respiratory disease of cattle. In this study, a near full-length genome sequence of a virus named RS3X (formerly classified as bovine rhinovirus type 1), isolated from infected cattle from the UK in the 1960s, was obtained and analyzed. Compared to other closely related Aphthoviruses, major differences were detected in the leader protease (Lpro), P1, 2B, and 3A proteins. Phylogenetic analysis revealed that RS3X was a member of the species bovine rhinitis A virus (BRAV). Using different codon-based and branch-site selection models for Aphthoviruses, including BRAV RS3X and foot-and-mouth disease virus, we observed no clear evidence for genomic regions undergoing positive selection. However, within each of the BRV species, multiple sites under positive selection were detected. The results also suggest that the probability (determined by Recombination Detection Program) for recombination events between BRVs and other Aphthoviruses, including foot-and-mouth disease virus was not significant. In contrast, within BRVs, the probability of recombination increases. The data reported here provide genetic information to assist in the identification of diagnostic signatures and research tools for BRAV. PMID:27081310

  10. Identification of a novel carbohydrate esterase from Bjerkandera adusta: structural and function predictions through bioinformatics analysis and molecular modeling.

    PubMed

    Cuervo-Soto, Laura I; Valdés-García, Gilberto; Batista-García, Ramón; del Rayo Sánchez-Carbente, María; Balcázar-López, Edgar; Lira-Ruan, Verónica; Pastor, Nina; Folch-Mallol, Jorge Luis

    2015-03-01

    A new gene from Bjerkandera adusta strain UAMH 8258 encoding a carbohydrate esterase (designated as BaCesI) was isolated and expressed in Pichia pastoris. The gene had an open reading frame of 1410 bp encoding a polypeptide of 470 amino acid residues, the first 18 serving as a secretion signal peptide. Homology and phylogenetic analyses showed that BaCesI belongs to carbohydrate esterases family 4. Three-dimensional modeling of the protein and normal mode analysis revealed a breathing mode of the active site that could be relevant for esterase activity. Furthermore, the overall negative electrostatic potential of this enzyme suggests that it degrades neutral substrates and will not act on negative substrates such as peptidoglycan or p-nitrophenol derivatives. The enzyme shows a specific activity of 1.118 U mg(-1) protein on 2-naphthyl acetate. No activity was detected on p-nitrophenol derivatives as proposed from the electrostatic potential data. The deacetylation activity of the recombinant BaCesI was confirmed by measuring the release of acetic acid from several substrates, including oat xylan, shrimp shell chitin, N-acetylglucosamine, and natural substrates such as sugar cane bagasse and grass. This makes the protein very interesting for the biofuels production industry from lignocellulosic materials and for the production of chitosan from chitin. PMID:25586442

  11. microRNA expression profiling and bioinformatic analysis of dengue virus‑infected peripheral blood mononuclear cells.

    PubMed

    Qi, Yiming; Li, Ying; Zhang, Lin; Huang, Junqi

    2013-03-01

    Dengue virus (DENV) causes self‑limiting dengue fever (DF), severe dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS). It is generally considered that cytokine storm leads to the increased plasma leakage characteristic of DHF/DSS. In the present study, peripheral blood mononuclear cells (PBMCs) were isolated from blood samples of healthy volunteers and infected with DENV serotype 2 (DENV2). Culture supernatants of DENV2‑infected and -uninfected PBMCs were analyzed using a human cytokine array. Between 6 and 12 h post‑infection, levels of CCL5, IL‑6 and IL‑8 were markedly elevated, while those of TNF‑α decreased. Total RNA isolated from these PBMCs was analyzed by human miRNA microarray to identify differentially expressed microRNAs (miRNAs). Quantitative reverse transcription polymerase chain reaction was used to validate 11 upregulated and 4 downregulated miRNAs. Sanger miRBase, miRanda and TargetScan were used to identify 261 common predicted genes. Databases were used to identify homologous sequences on mRNAs of putative target genes that may be directly bound by the miRNAs identified. We found that cytokines and epigenetic regulators may be putative target genes of these miRNAs. Using ingenuity pathway analysis, we noted that canonical pathways, including biological regulation, may be modulated by these miRNAs.
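
    The abstract above reports genes predicted in common by three target-prediction resources. One simple way to express such a consensus step is a set intersection, sketched below in Python. This is only an illustration under stated assumptions: the function name and the gene identifiers are hypothetical placeholders, the tool labels are used as plain dictionary keys, and nothing here reproduces the authors' actual pipeline or data.

```python
def consensus_targets(predictions):
    """Return the genes that appear in every tool's predicted target set.

    `predictions` maps a tool label to an iterable of gene identifiers.
    """
    gene_sets = [set(genes) for genes in predictions.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Placeholder gene identifiers, invented for illustration only.
predictions = {
    "tool_1": {"GENE_A", "GENE_B", "GENE_C"},
    "tool_2": {"GENE_B", "GENE_C", "GENE_D"},
    "tool_3": {"GENE_C", "GENE_B", "GENE_E"},
}

if __name__ == "__main__":
    print(sorted(consensus_targets(predictions)))
```

    Requiring agreement across all tools trades sensitivity for specificity; a looser rule, such as agreement between any two tools, would need a counting step instead of a strict intersection.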

  12. Bioinformatics Approach in Plant Genomic Research.

    PubMed

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    Advances in genomics technology have dramatically changed plant biology research. Plant biologists can now easily access enormous genomic datasets to study high-density genetic variation in plants at the molecular level. Therefore, a thorough understanding and skilled use of bioinformatics tools to manage and analyze these data are essential in current plant genome research. Many plant genome databases have been established and continue to expand. Meanwhile, bioinformatics-based analytical methods are well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant upgrading of computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the critical future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  13. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    SciTech Connect

    Mefford, Megan E.; Kunstman, Kevin; Wolinsky, Steven M.; Gabuzda, Dana

    2015-07-15

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions.

  14. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    PubMed Central

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
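
    The programmatic access mentioned above can be illustrated with a minimal sketch. CouchDB serves each stored document as JSON at a URL of the form /{database}/{document id}; the database name "genes", the document ID "TP53", the field names and the canned response body below are all hypothetical and are not taken from geneSmash, drugBase or HapMap-CN. The sketch only builds the document URL and decodes a response, so it runs without a server.

```python
import json
from urllib.parse import quote


def document_url(base_url: str, database: str, doc_id: str) -> str:
    """Build the URL of a single CouchDB document: {base}/{db}/{docid}."""
    return f"{base_url.rstrip('/')}/{quote(database, safe='')}/{quote(doc_id, safe='')}"


def parse_document(body: str) -> dict:
    """Decode a JSON document and drop CouchDB bookkeeping fields (_id, _rev)."""
    doc = json.loads(body)
    return {key: value for key, value in doc.items() if not key.startswith("_")}


# Hypothetical server, database and document; 5984 is CouchDB's default port.
url = document_url("http://localhost:5984/", "genes", "TP53")

# Canned response standing in for the body an HTTP GET on `url` might return.
sample_body = '{"_id": "TP53", "_rev": "1-abc", "symbol": "TP53", "chromosome": "17"}'
annotation = parse_document(sample_body)
```

    Because every document is schema-free JSON, a record of this kind could gain new annotation fields without a table migration, which is the flexibility the abstract attributes to non-relational designs.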

  15. Functional and Bioinformatics Analysis of Two Campylobacter jejuni Homologs of the Thiol-Disulfide Oxidoreductase, DsbA

    PubMed Central

    Grabowska, Anna D.; Wywiał, Ewa; Dunin-Horkawicz, Stanislaw; Łasica, Anna M.; Wösten, Marc M. S. M.; Nagy-Staroń, Anna; Godlewska, Renata; Bocian-Ostrzycka, Katarzyna; Pieńkowska, Katarzyna; Łaniewski, Paweł; Bujnicki, Janusz M.; van Putten, Jos P. M.; Jagusztyn-Krynicka, E. Katarzyna

    2014-01-01

    Background Bacterial Dsb enzymes are involved in the oxidative folding of many proteins, through the formation of disulfide bonds between their cysteine residues. The Dsb protein network has been well characterized in cells of the model microorganism Escherichia coli. To gain insight into the functioning of the Dsb system in epsilon-Proteobacteria, where it plays an important role in the colonization process, we studied two homologs of the main Escherichia coli Dsb oxidase (EcDsbA) that are present in the cells of the enteric pathogen Campylobacter jejuni, the most frequently reported bacterial cause of human enteritis in the world. Methods and Results Phylogenetic analysis suggests the horizontal transfer of the epsilon-Proteobacterial DsbAs from a common ancestor to gamma-Proteobacteria, which then gave rise to the DsbL lineage. Phenotype and enzymatic assays suggest that the two C. jejuni DsbAs play different roles in bacterial cells and have divergent substrate spectra. CjDsbA1 is essential for the motility and autoagglutination phenotypes, while CjDsbA2 has no impact on those processes. CjDsbA1 plays a critical role in the oxidative folding that ensures the activity of alkaline phosphatase CjPhoX, whereas CjDsbA2 is crucial for the activity of arylsulfotransferase CjAstA, encoded within the dsbA2-dsbB-astA operon. Conclusions Our results show that CjDsbA1 is the primary thiol-oxidoreductase affecting life processes associated with bacterial spread and host colonization, as well as ensuring the oxidative folding of particular protein substrates. In contrast, CjDsbA2 activity does not affect the same processes and so far its oxidative folding activity has been demonstrated for one substrate, arylsulfotransferase CjAstA. The results suggest the cooperation between CjDsbA2 and CjDsbB. In the case of the CjDsbA1, this cooperation is not exclusive and there is probably another protein to be identified in C. jejuni cells that acts to re-oxidize CjDsbA1. Altogether

  16. Bioinformatics Analysis of the Complete Genome Sequence of the Mango Tree Pathogen Pseudomonas syringae pv. syringae UMAF0158 Reveals Traits Relevant to Virulence and Epiphytic Lifestyle

    PubMed Central

    Arrebola, Eva; Carrión, Víctor J.; Gutiérrez-Barranquero, José Antonio; Pérez-García, Alejandro; Ramos, Cayo; Cazorla, Francisco M.; de Vicente, Antonio

    2015-01-01

    The genomes of more than 100 Pseudomonas syringae strains have been sequenced to date; however, only a few of them have been fully assembled, including P. syringae pv. syringae B728a. Different strains of pv. syringae cause different diseases and have different host specificities; UMAF0158 is a P. syringae pv. syringae strain related to B728a, but instead of being a bean pathogen it causes apical necrosis of mango trees, and the two strains belong to different phylotypes of pv. syringae and clades of P. syringae. In this study we report the complete sequence and annotation of P. syringae pv. syringae UMAF0158 chromosome and plasmid pPSS158. A comparative analysis with the available sequenced genomes of 25 other P. syringae strains, both closed (the reference genomes DC3000, 1448A and B728a) and draft genomes, was performed. The 5.8 Mb UMAF0158 chromosome has 59.3% GC content and comprises 5017 predicted protein-coding genes. Bioinformatics analysis revealed the presence of genes potentially implicated in the virulence and epiphytic fitness of this strain. We identified several genetic features, which are absent in B728a, that may explain the ability of UMAF0158 to colonize and infect mango trees: the mangotoxin biosynthetic operon mbo, a gene cluster for cellulose production, two different type III and two type VI secretion systems, and a particular T3SS effector repertoire. A mutant strain defective in the rhizobial-like T3SS Rhc showed no differences compared to wild-type during its interaction with host and non-host plants and worms. Here we report the first complete sequence of the chromosome of a pv. syringae strain pathogenic to a woody plant host. Our data also shed light on the genetic factors that possibly determine the pathogenic and epiphytic lifestyle of UMAF0158. This work provides the basis for further analysis on specific mechanisms that enable this strain to infect woody plants and for the functional analysis of host specificity in the P
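
    For readers unfamiliar with the GC content statistic quoted in the record above, the following is a minimal sketch of how such a percentage is commonly computed: the number of G and C bases divided by the number of unambiguous bases. The function name and the toy sequences are illustrative assumptions; this is not the authors' annotation pipeline.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")  # ignores N and other codes
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequences, not genome data.
balanced = gc_content("ATGC")
with_ambiguity = gc_content("acgtNN")
```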

  17. Bioinformatics resource manager v2.3: an integrated software environment for systems biology with microRNA and cross-species analysis tools

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are noncoding RNAs that direct post-transcriptional regulation of protein coding genes. Recent studies have shown miRNAs are important for controlling many biological processes, including nervous system development, and are highly conserved across species. Given their importance, computational tools are necessary for analysis, interpretation and integration of high-throughput (HTP) miRNA data in an increasing number of model species. The Bioinformatics Resource Manager (BRM) v2.3 is a software environment for data management, mining, integration and functional annotation of HTP biological data. In this study, we report recent updates to BRM for miRNA data analysis and cross-species comparisons across datasets. Results BRM v2.3 has the capability to query predicted miRNA targets from multiple databases, retrieve potential regulatory miRNAs for known genes, integrate experimentally derived miRNA and mRNA datasets, perform ortholog mapping across species, and retrieve annotation and cross-reference identifiers for an expanded number of species. Here we use BRM to show that developmental exposure of zebrafish to 30 uM nicotine from 6–48 hours post fertilization (hpf) results in behavioral hyperactivity in larval zebrafish and alteration of putative miRNA gene targets in whole embryos at developmental stages that encompass early neurogenesis. We show typical workflows for using BRM to integrate experimental zebrafish miRNA and mRNA microarray datasets with example retrievals for zebrafish, including pathway annotation and mapping to human ortholog. Functional analysis of differentially regulated (p<0.05) gene targets in BRM indicates that nicotine exposure disrupts genes involved in neurogenesis, possibly through misregulation of nicotine-sensitive miRNAs. Conclusions BRM provides the ability to mine complex data for identification of candidate miRNAs or pathways that drive phenotypic outcome and, therefore, is a useful hypothesis

  18. Machine learning: an indispensable tool in bioinformatics.

    PubMed

    Inza, Iñaki; Calvo, Borja; Armañanzas, Rubén; Bengoetxea, Endika; Larrañaga, Pedro; Lozano, José A

    2010-01-01

    The increase in the number and complexity of biological databases has raised the need for modern and powerful data analysis tools and techniques. In order to fulfill these requirements, the machine learning discipline has become an everyday tool in bio-laboratories. The use of machine learning techniques has been extended to a wide spectrum of bioinformatics applications. It is broadly used to investigate the underlying mechanisms and interactions between biological molecules in many diseases, and it is an essential tool in any biomarker discovery process. In this chapter, we provide a basic taxonomy of machine learning algorithms, and the characteristics of main data preprocessing, supervised classification, and clustering techniques are shown. Feature selection, classifier evaluation, and two supervised classification topics that have a deep impact on current bioinformatics are presented. We make the interested reader aware of a set of popular web resources, open source software tools, and benchmarking data repositories that are frequently used by the machine learning community. PMID:19957143
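
    Classifier evaluation, one of the topics listed above, is often carried out by k-fold cross-validation: the data are split into k folds, and each fold is held out once for testing while a model is fitted on the rest. The sketch below assumes a deliberately simple nearest-centroid classifier on a single feature; the function names and the toy "marker gene" values are hypothetical and are not taken from the chapter.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous test folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class mean is closest to x (one feature)."""
    centroids = {
        label: mean(v for v, y in zip(train_x, train_y) if y == label)
        for label in set(train_y)
    }
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def cross_validated_accuracy(xs, ys, k):
    """Fraction of samples predicted correctly when each fold is held out once."""
    correct = 0
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        correct += sum(
            nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
            for i in test_idx
        )
    return correct / len(xs)


# Toy expression values of one hypothetical marker gene, classes interleaved.
xs = [1.0, 9.0, 1.2, 8.8, 0.9, 9.3]
ys = ["control", "case", "control", "case", "control", "case"]
accuracy = cross_validated_accuracy(xs, ys, k=3)
```

    Holding out each fold in turn keeps the test samples out of model fitting, which is why the resulting accuracy is a fairer estimate than one measured on the training data.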

  19. Bioinformatic pipelines in Python with Leaf

    PubMed Central

    2013-01-01

    Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315
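
    The idea of declaring pipeline steps together with their dependencies, and reusing results that have already been computed, can be sketched in plain Python. The example below does not use Leaf's pipeline language or API, which the abstract does not show; it relies on the standard-library graphlib module (Python 3.9+), and the step names and the run_pipeline helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(steps, dependencies, cache=None):
    """Run every step after its dependencies, skipping steps already cached.

    steps:        maps a step name to a function of {dependency name: result}
    dependencies: maps a step name to the set of step names it depends on
    cache:        optional dict of earlier results to reuse
    """
    cache = {} if cache is None else cache
    for name in TopologicalSorter(dependencies).static_order():
        if name not in cache:
            inputs = {dep: cache[dep] for dep in dependencies.get(name, ())}
            cache[name] = steps[name](inputs)
    return cache


# Hypothetical three-step analysis: load reads, filter short ones, summarize.
steps = {
    "load": lambda _: ["ACGT", "GGCC", "AT"],
    "filter": lambda r: [read for read in r["load"] if len(read) >= 4],
    "summarize": lambda r: {"n_reads": len(r["filter"])},
}
dependencies = {"load": set(), "filter": {"load"}, "summarize": {"filter"}}

results = run_pipeline(steps, dependencies)
```

    Passing a previously returned dictionary as cache re-runs only the steps that are missing from it, a simplified stand-in for the persistence features the abstract describes.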

  20. Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

    PubMed Central

    2012-01-01

    Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures, and the expression profiling were analysed to predict the

  1. Bioinformatics on the Cloud Computing Platform Azure

    PubMed Central

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  2. Bioinformatic analysis of the nucleolus.

    PubMed Central

    Leung, Anthony K L; Andersen, Jens S; Mann, Matthias; Lamond, Angus I

    2003-01-01

    The nucleolus is a plurifunctional, nuclear organelle, which is responsible for ribosome biogenesis and many other functions in eukaryotes, including RNA processing, viral replication and tumour suppression. Our knowledge of the human nucleolar proteome has been expanded dramatically by the two recent MS studies on isolated nucleoli from HeLa cells [Andersen, Lyon, Fox, Leung, Lam, Steen, Mann and Lamond (2002) Curr. Biol. 12, 1-11; Scherl, Coute, Deon, Calle, Kindbeiter, Sanchez, Greco, Hochstrasser and Diaz (2002) Mol. Biol. Cell 13, 4100-4109]. Nearly 400 proteins were identified within the nucleolar proteome so far in humans. Approx. 12% of the identified proteins were previously shown to be nucleolar in human cells and, as expected, nearly all of the known housekeeping proteins required for ribosome biogenesis were identified in these analyses. Surprisingly, approx. 30% represented either novel or uncharacterized proteins. This review focuses on how to apply the derived knowledge of this newly recognized nucleolar proteome, such as their amino acid/peptide composition and their homologies across species, to explore the function and dynamics of the nucleolus, and suggests ways to identify, in silico, possible functions of the novel/uncharacterized proteins and potential interaction networks within the human nucleolus, or between the nucleolus and other nuclear organelles, by drawing resources from the public domain. PMID:14531731

  3. Web services at the European Bioinformatics Institute-2009

    PubMed Central

    Mcwilliam, Hamish; Valentin, Franck; Goujon, Mickael; Li, Weizhong; Narayanasamy, Menaka; Martin, Jenny; Miyar, Teresa; Lopez, Rodrigo

    2009-01-01

    The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees the user from web page constraints and is ideal for the analysis of large batches of data, performing text-mining tasks and the casual or systematic evaluation of mathematical models in regulatory networks. Furthermore, these services are widespread and easy to use; they require no prior knowledge of the technology and no more than basic experience in programming. In the following we describe new and updated services and briefly outline planned developments to be made available during the course of 2009–2010. PMID:19435877

  4. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    PubMed

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes including the behavioral-based profiles will contribute to the transition from the disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors are needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talks among pathways in various brain regions involved in disorders such as Alzheimer's disease.

  5. A library-based bioinformatics services program*

    PubMed Central

    Yarfitz, Stuart; Ketchell, Debra S.

    2000-01-01

    Support for molecular biology researchers has been limited to traditional library resources and services in most academic health sciences libraries. The University of Washington Health Sciences Libraries have been providing specialized services to this user community since 1995. The library recruited a Ph.D. biologist to assess the molecular biological information needs of researchers and design strategies to enhance library resources and services. A survey of laboratory research groups identified areas of greatest need and led to the development of a three-pronged program: consultation, education, and resource development. Outcomes of this program include bioinformatics consultation services, library-based and graduate level courses, networking of sequence analysis tools, and a biological research Web site. Bioinformatics clients are drawn from diverse departments and include clinical researchers in need of tools that are not readily available outside of basic sciences laboratories. Evaluation and usage statistics indicate that researchers, regardless of departmental affiliation or position, require support to access molecular biology and genetics resources. Centralizing such services in the library is a natural synergy of interests and enhances the provision of traditional library resources. Successful implementation of a library-based bioinformatics program requires both subject-specific and library and information technology expertise. PMID:10658962

  6. Application of Bioinformatics in Chronobiology Research

    PubMed Central

    Lopes, Robson da Silva; Resende, Nathalia Maria; Honorio-França, Adenilda Cristina; França, Eduardo Luzía

    2013-01-01

    Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research. PMID:24187519

  7. Spectroscopic analysis and DFT calculations of a food additive Carmoisine

    NASA Astrophysics Data System (ADS)

    Snehalatha, M.; Ravikumar, C.; Hubert Joe, I.; Sekar, N.; Jayakumar, V. S.

    2009-04-01

    FT-IR and Raman techniques were employed for the vibrational characterization of the food additive Carmoisine (E122). The equilibrium geometry, various bonding features, and harmonic vibrational wavenumbers have been investigated with the help of density functional theory (DFT) calculations. A good correlation was found between the computed and experimental wavenumbers. Azo stretching wavenumbers have been lowered due to conjugation and π-electron delocalization. Predicted electronic absorption spectra from TD-DFT calculation have been analysed comparing with the UV-vis spectrum. The first hyperpolarizability of the molecule is calculated. Intramolecular charge transfer (ICT) responsible for the optical nonlinearity of the dye molecule has been discussed theoretically and experimentally. Stability of the molecule arising from hyperconjugative interactions, charge delocalization and C-H⋯O, improper, blue shifted hydrogen bonds have been analysed using natural bond orbital (NBO) analysis.

  8. The 2015 Bioinformatics Open Source Conference (BOSC 2015)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J. A.; Lapp, Hilmar

    2016-01-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included “Data Science;” “Standards and Interoperability;” “Open Science and Reproducibility;” “Translational Bioinformatics;” “Visualization;” and “Bioinformatics Open Source Project Updates”. In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled “Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community,” that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  9. [Analysis of constituents in urushi wax, a natural food additive].

    PubMed

    Jin, Zhe-Long; Tada, Atsuko; Sugimoto, Naoki; Sato, Kyoko; Masuda, Aino; Yamagata, Kazuo; Yamazaki, Takeshi; Tanamoto, Kenichi

    2006-08-01

    Urushi wax is a natural gum base used as a food additive. In order to evaluate the quality of urushi wax as a food additive and to obtain information useful for setting official standards, we investigated the constituents and their concentrations in urushi wax, using the same sample as scheduled for toxicity testing. After methanolysis of urushi wax, the composition of fatty acids was analyzed by GC/MS. The results indicated that the main fatty acids were palmitic acid, oleic acid and stearic acid. LC/MS analysis of urushi wax provided molecular-related ions of the main constituents. The main constituents were identified as triglycerides, namely glyceryl tripalmitate (30.7%), glyceryl dipalmitate monooleate (21.2%), glyceryl dioleate monopalmitate (2.1%), glyceryl monooleate monopalmitate monostearate (2.6%), glyceryl dipalmitate monostearate (5.6%), glyceryl distearate monopalmitate (1.4%). Glyceryl dipalmitate monooleate isomers differing in the binding sites of each constituent fatty acid could be separately determined by LC/MS/MS. PMID:16984037

  10. Decreasing Cloudiness Over China: An Updated Analysis Examining Additional Variables

    SciTech Connect

    Kaiser, D.P.

    2000-01-14

    As preparation of the IPCC's Third Assessment Report takes place, one of the many observed climate variables of key interest is cloud amount. For several nations of the world, there exist records of surface-observed cloud amount dating back to the middle of the 20th Century or earlier, offering valuable information on variations and trends. Studies using such databases include Sun and Groisman (1999) and Kaiser and Razuvaev (1995) for the former Soviet Union, Angell et al. (1984) for the United States, Henderson-Sellers (1986) for Europe, Jones and Henderson-Sellers (1992) for Australia, and Kaiser (1998) for China. The findings of Kaiser (1998) differ from the other studies in that much of China appears to have experienced decreased cloudiness over recent decades (1954-1994), whereas the other land regions for the most part show evidence of increasing cloud cover. This paper expands on Kaiser (1998) by analyzing trends in additional meteorological variables for China [station pressure (p), water vapor pressure (e), and relative humidity (rh)] and extending the total cloud amount (N) analysis an additional two years (through 1996).

  11. Generations of interdisciplinarity in bioinformatics

    PubMed Central

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life sciences, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  12. Sensitivity analysis of geometric errors in additive manufacturing medical models.

    PubMed

    Pinto, Jose Miguel; Arrieta, Cristobal; Andia, Marcelo E; Uribe, Sergio; Ramos-Grez, Jorge; Vargas, Alex; Irarrazaval, Pablo; Tejos, Cristian

    2015-03-01

    Additive manufacturing (AM) models are used in medical applications for surgical planning, prosthesis design and teaching. For these applications, the accuracy of the AM models is essential. Unfortunately, this accuracy is compromised due to errors introduced by each of the building steps: image acquisition, segmentation, triangulation, printing and infiltration. However, the contribution of each step to the final error remains unclear. We performed a sensitivity analysis comparing errors obtained from a reference with those obtained modifying parameters of each building step. Our analysis considered global indexes to evaluate the overall error, and local indexes to show how this error is distributed along the surface of the AM models. Our results show that the standard building process tends to overestimate the AM models, i.e. models are larger than the original structures. They also show that the triangulation resolution and the segmentation threshold are critical factors, and that the errors are concentrated at regions with high curvatures. Errors could be reduced choosing better triangulation and printing resolutions, but there is an important need for modifying some of the standard building processes, particularly the segmentation algorithms.

  13. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    NASA Technical Reports Server (NTRS)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, which uses a new sigmoid-logarithmic transfer function and is based on the error backpropagation (EBP) algorithm, has been invented. Our results show that the trained new ANN can recognize low-fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  14. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    PubMed Central

    Obom, Kristina M.; Cummings, Patrick J.

    2007-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation. PMID:23653816

  15. Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).

    PubMed

    Sparapani, Rodney A; Logan, Brent R; McCulloch, Robert E; Laud, Purushottam W

    2016-07-20

    Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, and exceeding that of its competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26854022

  16. Computational intelligence techniques in bioinformatics.

    PubMed

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state-of-the-art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machine, deep belief network, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines, could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function prediction and its structure. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included. PMID:23891719

  17. Reproducible Bioinformatics Research for Biologists

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  18. Precessing rotating flows with additional shear: Stability analysis

    NASA Astrophysics Data System (ADS)

    Salhi, A.; Cambon, C.

    2009-03-01

    We consider unbounded precessing rotating flows in which vertical or horizontal shear is induced by the interaction between the solid-body rotation (with angular velocity Ω0 ) and the additional “precessing” Coriolis force (with angular velocity -ɛΩ0 ), normal to it. A “weak” shear flow, with rate 2ɛ of the same order of the Poincaré “small” ratio ɛ , is needed for balancing the gyroscopic torque, so that the whole flow satisfies Euler’s equations in the precessing frame (the so-called admissibility conditions). The base flow case with vertical shear (its cross-gradient direction is aligned with the main angular velocity) corresponds to Mahalov’s [Phys. Fluids A 5, 891 (1993)] precessing infinite cylinder base flow (ignoring boundary conditions), while the base flow case with horizontal shear (its cross-gradient direction is normal to both main and precessing angular velocities) corresponds to the unbounded precessing rotating shear flow considered by Kerswell [Geophys. Astrophys. Fluid Dyn. 72, 107 (1993)]. We show that both these base flows satisfy the admissibility conditions and can support disturbances in terms of advected Fourier modes. Because the admissibility conditions cannot select one case with respect to the other, a more physical derivation is sought: Both flows are deduced from Poincaré’s [Bull. Astron. 27, 321 (1910)] basic state of a precessing spheroidal container, in the limit of small ɛ . A Rapid distortion theory (RDT) type of stability analysis is then performed for the previously mentioned disturbances, for both base flows. The stability analysis of the Kerswell base flow, using Floquet’s theory, is recovered, and its counterpart for the Mahalov base flow is presented. Typical growth rates are found to be the same for both flows at very small ɛ , but significant differences are obtained regarding growth rates and widths of instability bands, if larger ɛ values, up to 0.2, are considered. Finally, both flow cases

  19. Hybrid Additive Manufacturing Technologies - An Analysis Regarding Potentials and Applications

    NASA Astrophysics Data System (ADS)

    Merklein, Marion; Junker, Daniel; Schaub, Adam; Neubauer, Franziska

    Imposing the trend of mass customization of lightweight construction in industry, conventional manufacturing processes like forming technology and chipping production are pushed to their limits for economical manufacturing. More flexible processes are needed which were developed by the additive manufacturing technology. This toolless production principle offers a high geometrical freedom and an optimized utilization of the used material. Thus load adjusted lightweight components can be produced in small lot sizes in an economical way. To compensate disadvantages like inadequate accuracy and surface roughness hybrid machines combining additive and subtractive manufacturing are developed. Within this paper the principles of mainly used additive manufacturing processes of metals and their possibility to be integrated into a hybrid production machine are summarized. It is pointed out that in particular the integration of deposition processes into a CNC milling center supposes high potential for manufacturing larger parts with high accuracy. Furthermore the combination of additive and subtractive manufacturing allows the production of ready to use products within one single machine. Additionally actual research for the integration of additive manufacturing processes into the production chain will be analyzed. For the long manufacturing time of additive production processes the combination with conventional manufacturing processes like sheet or bulk metal forming seems an effective solution. Especially large volumes can be produced by conventional processes. In an additional production step active elements can be applied by additive manufacturing. This principle is also investigated for tool production to reduce chipping of the high strength material used for forming tools. The aim is the addition of active elements onto a geometrical simple basis by using Laser Metal Deposition. That process allows the utilization of several powder materials during one process what

  1. [Bioinformatics-based Design of Peptide Vaccine Candidates Targeting Spike Protein of MERS-CoV and Immunity analysis in Mice].

    PubMed

    Lan, Jiaming; Lu, Shuai; Deng, Yao; Wen, Bo; Chen, Hong; Wang, Wen; Tan, Wenjie

    2016-01-01

    Middle East respiratory syndrome coronavirus (MERS-CoV) was identified as a novel human coronavirus and poses a great threat to public health worldwide, which urgently calls for the development of an effective and safe vaccine. In this study, peptide epitopes targeting the spike antigen were predicted based on bioinformatics methods. Nine polypeptides with high scores were synthesized and linked to keyhole limpet hemocyanin (KLH). Female BALB/c mice were immunized with individual polypeptide-KLH conjugates; the total IgG was detected by ELISA, and the cellular mediated immunity (CMI) was analyzed using an ELISpot assay. The results showed that an individual peptide of YVDVGPDSVKSACIEVDIQQTFFDKTWPRPIDVSKADGI could induce the highest level of total IgG as well as CMI (high frequency of IFN-γ secretion) against MERS-CoV antigen in mice. Our study identified a promising peptide vaccine candidate against MERS-CoV and provided experimental support for bioinformatics-based design of peptide vaccines. PMID:27295887
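
    As a small illustration of the kind of sequence-level check that usually precedes epitope work, the sketch below summarizes the peptide quoted in this abstract. It is not the authors' prediction method, which the abstract does not describe; the helper name `summarize_peptide` and the validation against the 20 standard amino-acid letters are assumptions made for this example.

```python
from collections import Counter

# Peptide sequence quoted in the abstract above.
PEPTIDE = "YVDVGPDSVKSACIEVDIQQTFFDKTWPRPIDVSKADGI"

# One-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def summarize_peptide(sequence):
    """Hypothetical helper: validate a peptide and report length and composition."""
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"non-standard residue codes: {invalid}")
    return {"length": len(sequence), "composition": Counter(sequence)}


summary = summarize_peptide(PEPTIDE)
print("length:", summary["length"])
print("most common residues:", summary["composition"].most_common(3))
```

    A summary like this only confirms that the sequence is well formed; scoring candidate epitopes would require a dedicated prediction tool, which the abstract does not name.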

  2. Receptor-binding sites: bioinformatic approaches.

    PubMed

    Flower, Darren R

    2006-01-01

    It is increasingly clear that both transient and long-lasting interactions between biomacromolecules and their molecular partners are the most fundamental of all biological mechanisms and lie at the conceptual heart of protein function. In particular, the protein-binding site is the most fascinating and important mechanistic arbiter of protein function. In this review, I examine the nature of protein-binding sites found in both ligand-binding receptors and substrate-binding enzymes. I highlight two important concepts underlying the identification and analysis of binding sites. The first is based on knowledge: when one knows the location of a binding site in one protein, one can "inherit" the site from one protein to another. The second approach involves the a priori prediction of a binding site from a sequence or a structure. The full and complete analysis of binding sites will necessarily involve the full range of informatic techniques ranging from sequence-based bioinformatic analysis through structural bioinformatics to computational chemistry and molecular physics. Integration of both diverse experimental and diverse theoretical approaches is thus a mandatory requirement in the evaluation of binding sites and the binding events that occur within them. PMID:16671408

  3. An agent-based multilayer architecture for bioinformatics grids.

    PubMed

    Bartocci, Ezio; Cacciagrano, Diletta; Cannata, Nicola; Corradini, Flavio; Merelli, Emanuela; Milanesi, Luciano; Romano, Paolo

    2007-06-01

    Due to the huge volume and complexity of biological data available today, a fundamental component of biomedical research is now in silico analysis. This includes modelling and simulation of biological systems and processes, as well as automated bioinformatics analysis of high-throughput data. The quest for bioinformatics resources (including databases, tools, and knowledge) becomes therefore of extreme importance. Bioinformatics itself is in rapid evolution and dedicated Grid cyberinfrastructures already offer easier access and sharing of resources. Furthermore, the concept of the Grid is progressively interleaving with those of Web Services, semantics, and software agents. Agent-based systems can play a key role in learning, planning, interaction, and coordination. Agents constitute also a natural paradigm to engineer simulations of complex systems like the molecular ones. We present here an agent-based, multilayer architecture for bioinformatics Grids. It is intended to support both the execution of complex in silico experiments and the simulation of biological systems. In the architecture a pivotal role is assigned to an "alive" semantic index of resources, which is also expected to facilitate users' awareness of the bioinformatics domain.

  4. GISH analysis of disomic Brassica napus-Crambe abyssinica chromosome addition lines produced by microspore culture from monosomic addition lines.

    PubMed

    Wang, Youping; Sonntag, Karin; Rudloff, Eicke; Wehling, Peter; Snowdon, Rod J

    2006-02-01

    Two Brassica napus-Crambe abyssinica monosomic addition lines (2n=39, AACC plus a single chromosome from C. abyssinica) were obtained from the F(2) progeny of the asymmetric somatic hybrid. The alien chromosome from C. abyssinica in the addition line was clearly distinguished by genomic in situ hybridization (GISH). Twenty-seven microspore-derived plants from the addition lines were obtained. Fourteen seedlings were determined to be diploid plants (2n=38) arising from spontaneous chromosome doubling, while 13 seedlings were confirmed as haploid plants. Doubled haploid plants produced after treatment with colchicine and two disomic chromosome addition lines (2n=40, AACC plus a single pair of homologous chromosomes from C. abyssinica) could again be identified by GISH analysis. The lines are potentially useful for molecular genetic analysis of novel C. abyssinica genes or alleles contributing to traits relevant for oilseed rape (B. napus) breeding.
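
    The chromosome numbers in this abstract follow from simple arithmetic, sketched below. The base count of 2n=38 for B. napus and the 39 and 40 counts come from the abstract; the assumption that a microspore from a monosomic addition line carries either 19 chromosomes or 19 plus the alien chromosome is an illustrative simplification, and the function names are hypothetical.

```python
B_NAPUS_2N = 38  # AACC complement, as stated in the abstract


def somatic_count(alien_copies):
    """Somatic chromosome number of a B. napus line with added alien chromosomes."""
    return B_NAPUS_2N + alien_copies


def doubled_haploid_count(microspore_n):
    """Chromosome number after doubling a microspore-derived haploid."""
    return 2 * microspore_n


monosomic = somatic_count(1)  # one alien chromosome
disomic = somatic_count(2)    # one homologous pair of alien chromosomes

# Assumption: the single alien chromosome is either absent from or present in
# a given microspore, giving n = 19 or n = 20.
base_n = B_NAPUS_2N // 2
for n in (base_n, base_n + 1):
    print(f"microspore n={n} -> doubled haploid 2n={doubled_haploid_count(n)}")
print(f"monosomic addition 2n={monosomic}, disomic addition 2n={disomic}")
```

    Under that assumption, doubling a microspore without the alien chromosome gives the 2n=38 plants, and doubling one that carries it gives the 2n=40 disomic addition lines.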

  5. USDA Stakeholder Workshop on Animal Bioinformatics: Summary and Recommendations.

    PubMed

    Hamernik, Debora L; Adelson, David L

    2003-01-01

    An electronic workshop was conducted on 4 November-13 December 2002 to discuss current issues and needs in animal bioinformatics. The electronic (e-mail listserver) format was chosen to provide a relatively speedy process that is broad in scope, cost-efficient and easily accessible to all participants. Approximately 40 panelists with diverse species and discipline expertise communicated through the panel e-mail listserver. The panel included scientists from academia, industry and government, in the USA, Australia and the UK. A second 'stakeholder' e-mail listserver was used to obtain input from a broad audience with general interests in animal genomics. The objectives of the electronic workshop were: (a) to define priorities for animal genome database development; and (b) to recommend ways in which the USDA could provide leadership in the area of animal genome database development. E-mail messages from panelists and stakeholders are archived at http://genome.cvm.umn.edu/bioinfo/. Priorities defined for animal genome database development included: (a) data repository; (b) tools for genome analysis; (c) annotation; (d) practical application of genomic data; and (e) a biological framework for DNA sequence. A stable source of funding, such as the USDA Agricultural Research Service (ARS), was recommended to support maintenance of data repositories and data curation. Continued support for competitive grants programs within the USDA Cooperative State Research, Education and Extension Service (CSREES) was recommended for tool development and hypothesis-driven research projects in genome analysis. Additional stakeholder input will be required to continuously refine priorities and maximize the use of limited resources for animal bioinformatics within the USDA. PMID:18629125

  6. Embracing the Future: Bioinformatics for High School Women

    NASA Astrophysics Data System (ADS)

    Zales, Charlotte Rappe; Cronin, Susan J.

    Sixteen high school women participated in a 5-week residential summer program designed to encourage female and minority students to choose careers in scientific fields. Students gained expertise in bioinformatics through problem-based learning in a complex learning environment of content instruction, speakers, labs, and trips. Innovative hands-on activities filled the program. Students learned biological principles in context and sophisticated bioinformatics tools for processing data. Students additionally mastered a variety of information-searching techniques. Students completed creative individual and group projects, demonstrating the successful integration of biology, information technology, and bioinformatics. Discussions with female scientists allowed students to see themselves in similar roles. Summer residential aspects fostered an atmosphere in which students matured in interacting with others and in their views of diversity.

  7. Protein bioinformatics applied to virology.

    PubMed

    Mohabatkar, Hassan; Keyhanfar, Mehrnaz; Behbahani, Mandana

    2012-09-01

    Scientists have united in a common search to sequence, store and analyze genes and proteins. In this regard, rapidly evolving bioinformatics methods are providing valuable information on these newly-discovered molecules. Understanding what has been done and what we can do in silico is essential in designing new experiments. The imbalance between sequence-known proteins and attribute-known proteins has called for the development of computational methods or high-throughput automated tools for fast and reliable prediction or identification of various characteristics of uncharacterized proteins. Taking into consideration the role of viruses in causing diseases and their use in biotechnology, the present review describes the application of protein bioinformatics in virology. Therefore, a number of important features of viral proteins like epitope prediction, protein docking, subcellular localization, viral protease cleavage sites and computer-based comparison of their aspects have been discussed. This paper also describes several tools, principally developed for viral bioinformatics. Prediction of viral protein features and learning the advances in this field can help basic understanding of the relationship between a virus and its host.

  8. ExPASy: SIB bioinformatics resource portal.

    PubMed

    Artimo, Panu; Jonnalagedda, Manohar; Arnold, Konstantin; Baratin, Delphine; Csardi, Gabor; de Castro, Edouard; Duvaud, Séverine; Flegel, Volker; Fortier, Arnaud; Gasteiger, Elisabeth; Grosdidier, Aurélien; Hernandez, Céline; Ioannidis, Vassilios; Kuznetsov, Dmitry; Liechti, Robin; Moretti, Sébastien; Mostaguir, Khaled; Redaschi, Nicole; Rossier, Grégoire; Xenarios, Ioannis; Stockinger, Heinz

    2012-07-01

    ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth seamlessly access a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a 'decentralized' way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across 'selected' resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy.

  9. Prokaryotic Expression, Identification and Bioinformatics Analysis of the Mycobacterium tuberculosis Rv3807c Gene Encoding the Putative Enzyme Committed to Decaprenylphosphoryl-d-arabinose Synthesis.

    PubMed

    Cai, Lina; Zhao, Xiaojiao; Jiang, Tao; Qiu, Juanjuan; Owusu, Lawrence; Ma, Yufang; Wang, Bo; Xin, Yi

    2014-03-01

    Decaprenylphosphoryl-d-arabinofuranosyl (DPA), the immediate donor for the polymerized d-Araf residues of mycobacterial arabinan, is synthesized from 5-phosphoribose-1-diphosphate (PRPP) in three-step reactions. (i) PRPP is transferred to decaprenyl-phosphate (DP) to form decaprenylphosphoryl-d-5-phosphoribose (DPPR). (ii) DPPR is dephosphorylated to form decaprenylphosphoryl-d-ribose (DPR). (iii) DPR is converted to DPA by the epimerase. Mycobacterium tuberculosis Rv3806c and heteromeric Rv3790/Rv3791 have been identified as the PRPP: decaprenyl-phosphate 5-phosphoribosyltransferase and the epimerase respectively. Rv3807c, however, as the candidate phospholipid phosphatase catalyzing the biosynthesis of decaprenylphosphoryl-ribose (DPR) from decaprenylphosphoryl-β-d-5-phosphoribose by dephosphorylation, has no direct experimental evidence of its essentiality in any species of mycobacterium. In this study, the Rv3807c gene was amplified from the genome of M. tuberculosis H37Rv by PCR, and was successfully expressed in Escherichia coli BL21 (DE3) via the recombinant plasmid pColdII-Rv3807c. The resulting protein with the 6× His-tag was identified by SDS-PAGE and Western blotting. The protein was predicted through bioinformatics to contain three transmembrane domains, the N-terminal peptide, and a core structure with phosphatidic acid phosphatase type 2/haloperoxidase. This study provides biochemical and bioinformatics evidence for the importance of Rv3807c in mycobacteria, and further functional studies will be conducted for validating Rv3807c as a promising phospholipid phosphatase in the synthetic pathway of DPA.
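
    The three-step route described in this abstract can be written down as a small ordered data structure, as sketched below. The substrate, product and enzyme labels are taken from the abstract; the `Step` type and the `trace` function are hypothetical names introduced only for this illustration, and Rv3807c is marked as putative because the abstract describes it as a candidate.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str


# Ordered steps (i)-(iii) from the abstract. Step (i) also consumes
# decaprenyl-phosphate (DP), which is omitted here to keep the chain linear.
DPA_PATHWAY = [
    Step("PRPP", "DPPR", "Rv3806c"),
    Step("DPPR", "DPR", "Rv3807c (putative)"),
    Step("DPR", "DPA", "Rv3790/Rv3791"),
]


def trace(pathway, start):
    """Follow the pathway from `start`, checking that each step chains onto the last."""
    current = start
    route = [current]
    for step in pathway:
        if step.substrate != current:
            raise ValueError(f"{step.enzyme} expects {step.substrate}, got {current}")
        current = step.product
        route.append(current)
    return route


print(" -> ".join(trace(DPA_PATHWAY, "PRPP")))
```

    Keeping the steps in one ordered list makes it easy to see where Rv3807c sits: between the phosphoribosyltransferase and the epimerase.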

  10. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    NASA Astrophysics Data System (ADS)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty with regard to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinformatics educational needs and goals as expressed by different authorities, five undergraduate bioinformatics degree case studies, educational implications of bioinformatics as a technoscience and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  11. Expanding roles in a library-based bioinformatics service program: a case study

    PubMed Central

    Li, Meng; Chen, Yi-Bu; Clintworth, William A

    2013-01-01

    Question: How can a library-based bioinformatics support program be implemented and expanded to continuously support the growing and changing needs of the research community? Setting: A program at a health sciences library serving a large academic medical center with a strong research focus is described. Methods: The bioinformatics service program was established at the Norris Medical Library in 2005. As part of program development, the library assessed users' bioinformatics needs, acquired additional funds, established and expanded service offerings, and explored additional roles in promoting on-campus collaboration. Results: Personnel and software have increased along with the number of registered software users and use of the provided services. Conclusion: With strategic efforts and persistent advocacy within the broader university environment, library-based bioinformatics service programs can become a key part of an institution's comprehensive solution to researchers' ever-increasing bioinformatics needs. PMID:24163602

  12. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    PubMed

    Attwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  13. Genomics and bioinformatics resources for translational science in Rosaceae.

    PubMed

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  14. Porosity Measurements and Analysis for Metal Additive Manufacturing Process Control.

    PubMed

    Slotwinski, John A; Garboczi, Edward J; Hebenstreit, Keith M

    2014-01-01

    Additive manufacturing techniques can produce complex, high-value metal parts, with potential applications as critical metal components such as those found in aerospace engines and as customized biomedical implants. Material porosity in these parts is undesirable for aerospace parts - since porosity could lead to premature failure - and desirable for some biomedical implants - since surface-breaking pores allow for better integration with biological tissue. Changes in a part's porosity during an additive manufacturing build may also be an indication of an undesired change in the build process. Here, we present efforts to develop an ultrasonic sensor for monitoring changes in the porosity in metal parts during fabrication on a metal powder bed fusion system. The development of well-characterized reference samples, measurements of the porosity of these samples with multiple techniques, and correlation of ultrasonic measurements with the degree of porosity are presented. A proposed sensor design, measurement strategy, and future experimental plans on a metal powder bed fusion system are also presented.

  15. Porosity Measurements and Analysis for Metal Additive Manufacturing Process Control.

    PubMed

    Slotwinski, John A; Garboczi, Edward J; Hebenstreit, Keith M

    2014-01-01

    Additive manufacturing techniques can produce complex, high-value metal parts, with potential applications as critical metal components such as those found in aerospace engines and as customized biomedical implants. Material porosity in these parts is undesirable for aerospace parts - since porosity could lead to premature failure - and desirable for some biomedical implants - since surface-breaking pores allow for better integration with biological tissue. Changes in a part's porosity during an additive manufacturing build may also be an indication of an undesired change in the build process. Here, we present efforts to develop an ultrasonic sensor for monitoring changes in the porosity in metal parts during fabrication on a metal powder bed fusion system. The development of well-characterized reference samples, measurements of the porosity of these samples with multiple techniques, and correlation of ultrasonic measurements with the degree of porosity are presented. A proposed sensor design, measurement strategy, and future experimental plans on a metal powder bed fusion system are also presented. PMID:26601041

  16. Porosity Measurements and Analysis for Metal Additive Manufacturing Process Control

    PubMed Central

    Slotwinski, John A; Garboczi, Edward J; Hebenstreit, Keith M

    2014-01-01

    Additive manufacturing techniques can produce complex, high-value metal parts, with potential applications as critical metal components such as those found in aerospace engines and as customized biomedical implants. Material porosity in these parts is undesirable for aerospace parts - since porosity could lead to premature failure - and desirable for some biomedical implants - since surface-breaking pores allow for better integration with biological tissue. Changes in a part’s porosity during an additive manufacturing build may also be an indication of an undesired change in the build process. Here, we present efforts to develop an ultrasonic sensor for monitoring changes in the porosity in metal parts during fabrication on a metal powder bed fusion system. The development of well-characterized reference samples, measurements of the porosity of these samples with multiple techniques, and correlation of ultrasonic measurements with the degree of porosity are presented. A proposed sensor design, measurement strategy, and future experimental plans on a metal powder bed fusion system are also presented. PMID:26601041

  17. Additional EIPC Study Analysis: Interim Report on High Priority Topics

    SciTech Connect

    Hadley, Stanton W

    2013-11-01

    Between 2010 and 2012 the Eastern Interconnection Planning Collaborative (EIPC) conducted a major long-term resource and transmission study of the Eastern Interconnection (EI). With guidance from a Stakeholder Steering Committee (SSC) that included representatives from the Eastern Interconnection States Planning Council (EISPC) among others, the project was conducted in two phases. Phase 1 involved a long-term capacity expansion analysis that involved creation of eight major futures plus 72 sensitivities. Three scenarios were selected for more extensive transmission-focused evaluation in Phase 2. Five power flow analyses, nine production cost model runs (including six sensitivities), and three capital cost estimations were developed during this second phase. The results from Phase 1 and 2 provided a wealth of data that could be examined further to address energy-related questions. A list of 13 topics was developed for further analysis; this paper discusses the first five.

  18. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    ERIC Educational Resources Information Center

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  19. Disclosure of hydraulic fracturing fluid chemical additives: analysis of regulations.

    PubMed

    Maule, Alexis L; Makey, Colleen M; Benson, Eugene B; Burrows, Isaac J; Scammell, Madeleine K

    2013-01-01

    Hydraulic fracturing is used to extract natural gas from shale formations. The process involves injecting into the ground fracturing fluids that contain thousands of gallons of chemical additives. Companies are not mandated by federal regulations to disclose the identities or quantities of chemicals used during hydraulic fracturing operations on private or public lands. States have begun to regulate hydraulic fracturing fluids by mandating chemical disclosure. These laws have shortcomings including nondisclosure of proprietary or "trade secret" mixtures, insufficient penalties for reporting inaccurate or incomplete information, and timelines that allow for after-the-fact reporting. These limitations leave lawmakers, regulators, public safety officers, and the public uninformed and ill-prepared to anticipate and respond to possible environmental and human health hazards associated with hydraulic fracturing fluids. We explore hydraulic fracturing exemptions from federal regulations, as well as current and future efforts to mandate chemical disclosure at the federal and state level.

  20. Disclosure of hydraulic fracturing fluid chemical additives: analysis of regulations.

    PubMed

    Maule, Alexis L; Makey, Colleen M; Benson, Eugene B; Burrows, Isaac J; Scammell, Madeleine K

    2013-01-01

    Hydraulic fracturing is used to extract natural gas from shale formations. The process involves injecting into the ground fracturing fluids that contain thousands of gallons of chemical additives. Companies are not mandated by federal regulations to disclose the identities or quantities of chemicals used during hydraulic fracturing operations on private or public lands. States have begun to regulate hydraulic fracturing fluids by mandating chemical disclosure. These laws have shortcomings including nondisclosure of proprietary or "trade secret" mixtures, insufficient penalties for reporting inaccurate or incomplete information, and timelines that allow for after-the-fact reporting. These limitations leave lawmakers, regulators, public safety officers, and the public uninformed and ill-prepared to anticipate and respond to possible environmental and human health hazards associated with hydraulic fracturing fluids. We explore hydraulic fracturing exemptions from federal regulations, as well as current and future efforts to mandate chemical disclosure at the federal and state level. PMID:23552653

  1. Risk analysis of sulfites used as food additives in China.

    PubMed

    Zhang, Jian Bo; Zhang, Hong; Wang, Hua Li; Zhang, Ji Yue; Luo, Peng Jie; Zhu, Lei; Wang, Zhu Tian

    2014-02-01

    This study aimed to analyze the risk of sulfites in food consumed by the Chinese people and to assess the health protection capability of the maximum-permitted level (MPL) of sulfites in GB 2760-2011. Sulfites as food additives are overused or abused in many food categories. When the MPL in GB 2760-2011 was used as the sulfite content in food, the intake of sulfites in most surveyed populations was lower than the acceptable daily intake (ADI). Excess intake of sulfites was found in all the surveyed groups when food with a high percentile of sulfite content was consumed. Moreover, children aged 1-6 years are at high risk of excess sulfite intake. The primary cause of the excess intake of sulfites in Chinese people is the overuse and abuse of sulfites by the food industry. The current MPL of sulfites in GB 2760-2011 protects the health of most populations.

  2. Bioinformatics in Africa: The Rise of Ghana?

    PubMed

    Karikari, Thomas K

    2015-09-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  3. Bioinformatics in Africa: The Rise of Ghana?

    PubMed Central

    Karikari, Thomas K.

    2015-01-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  4. Bioinformatics by Example: From Sequence to Target

    NASA Astrophysics Data System (ADS)

    Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

    2002-12-01

    With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience is set to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example, the reader will experience this fascinating new discipline.

  5. Rapid Bioinformatic Identification of Thermostabilizing Mutations

    PubMed Central

    Sauer, David B.; Karpowich, Nathan K.; Song, Jin Mei; Wang, Da-Neng

    2015-01-01

    Ex vivo stability is a valuable protein characteristic but is laborious to improve experimentally. In addition to biopharmaceutical and industrial applications, stable protein is important for biochemical and structural studies. Taking advantage of the large number of available genomic sequences and growth temperature data, we present two bioinformatic methods to identify a limited set of amino acids or positions that likely underlie thermostability. Because these methods allow thousands of homologs to be examined in silico, they have the advantage of providing both speed and statistical power. Using these methods, we introduced, via mutation, amino acids from thermoadapted homologs into an exemplar mesophilic membrane protein, and demonstrated significantly increased thermostability while preserving protein activity. PMID:26445442

  6. Technical phosphoproteomic and bioinformatic tools useful in cancer research

    PubMed Central

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights into the operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments and mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites and for advancing such clinically relevant research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to support cancer research by detailing proteomics and bioinformatic tools. PMID:21967744

  7. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    PubMed

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights into the operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments and mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites and for advancing such clinically relevant research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to support cancer research by detailing proteomics and bioinformatic tools. PMID:21967744

  8. Bioinformatics and the allergy assessment of agricultural biotechnology products: industry practices and recommendations.

    PubMed

    Ladics, Gregory S; Cressman, Robert F; Herouet-Guicheney, Corinne; Herman, Rod A; Privalle, Laura; Song, Ping; Ward, Jason M; McClain, Scott

    2011-06-01

    Bioinformatic tools are being increasingly utilized to evaluate the degree of similarity between a novel protein and known allergens within the context of a larger allergy safety assessment process. Importantly, bioinformatics is not a predictive analysis that can determine if a novel protein will "become" an allergen, but rather a tool to assess whether the protein is a known allergen or is potentially cross-reactive with an existing allergen. Bioinformatic tools are key components of the 2009 Codex Alimentarius Commission's weight-of-evidence approach, which encompasses a variety of experimental approaches for an overall assessment of the allergenic potential of a novel protein. Bioinformatic search comparisons between novel protein sequences, as well as potential novel fusion sequences derived from the genome and transgene, and known allergens are required by all regulatory agencies that assess the safety of genetically modified (GM) products. The objective of this paper is to identify opportunities for consensus in the methods of applying bioinformatics and to outline differences that impact a consistent and reliable allergy safety assessment. The bioinformatic comparison process has some critical features, which are outlined in this paper. One of them is a curated, publicly available and well-managed database with known allergenic sequences. In this paper, the best practices, scientific value, and food safety implications of bioinformatic analyses, as they are applied to GM food crops, are discussed. Recommendations for conducting bioinformatic analysis on novel food proteins for potential cross-reactivity to known allergens are also put forth.

  9. [Bioinformatics: a key role in oncology].

    PubMed

    Olivier, Timothée; Chappuis, Pierre; Tsantoulis, Petros

    2016-05-18

    Bioinformatics is essential in clinical oncology and research. Combining biology, computer science and mathematics, bioinformatics aims to derive useful information from clinical and biological data, often poorly structured, at a large scale. Bioinformatics approaches have reclassified certain cancers based on their molecular and biological presentation, improving treatment selection. Many molecular signatures have been developed and, after validation, some are now usable in clinical practice. Other applications could facilitate daily practice, reduce the risk of error and increase the precision of medical decision-making. Bioinformatics must evolve in accordance with ethical considerations and requires multidisciplinary collaboration. Its application depends on a sound technical foundation that meets strict quality requirements.

  10. [Bioinformatics: a key role in oncology].

    PubMed

    Olivier, Timothée; Chappuis, Pierre; Tsantoulis, Petros

    2016-05-18

    Bioinformatics is essential in clinical oncology and research. Combining biology, computer science and mathematics, bioinformatics aims to derive useful information from clinical and biological data, often poorly structured, at a large scale. Bioinformatics approaches have reclassified certain cancers based on their molecular and biological presentation, improving treatment selection. Many molecular signatures have been developed and, after validation, some are now usable in clinical practice. Other applications could facilitate daily practice, reduce the risk of error and increase the precision of medical decision-making. Bioinformatics must evolve in accordance with ethical considerations and requires multidisciplinary collaboration. Its application depends on a sound technical foundation that meets strict quality requirements. PMID:27424424

  11. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches.

    PubMed

    Ozyigit, Ibrahim I; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y; Koc, Ibrahim; Öztürk, Münir X; Anjum, Naser A

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can cause damage to cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidases (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels and could inform future studies in this direction. PMID:27047498

  12. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches

    PubMed Central

    Ozyigit, Ibrahim I.; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y.; Koc, Ibrahim; Öztürk, Münir X.; Anjum, Naser A.

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can cause damage to cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidases (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels and could inform future studies in this direction. PMID:27047498

  13. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    PubMed

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-01

    Glycan structures attached to proteins are composed of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures composed of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.

  14. Additional challenges for uncertainty analysis in river engineering

    NASA Astrophysics Data System (ADS)

    Berends, Koen; Warmink, Jord; Hulscher, Suzanne

    2016-04-01

    the proposed intervention. The implicit assumption underlying such analysis is that both models are commensurable. We hypothesize that they are commensurable only to a certain extent. In an idealised study we have demonstrated that prediction performance loss should be expected with increasingly large engineering works. When accounting for parametric uncertainty of floodplain roughness in model identification, we see uncertainty bounds for predicted effects of interventions increase with increasing intervention scale. Calibration of these types of models therefore seems to have a shelf-life, beyond which calibration no longer improves prediction. Therefore a qualification scheme for model use is required that can be linked to model validity. In this study, we characterize model use along three dimensions: extrapolation (using the model with different external drivers), extension (using the model for different output or indicators) and modification (using modified models). Such use of models is expected to have implications for the applicability of surrogate modelling for efficient uncertainty analysis as well, which is recommended for future research. Warmink, J. J.; Straatsma, M. W.; Huthoff, F.; Booij, M. J. & Hulscher, S. J. M. H. 2013. Uncertainty of design water levels due to combined bed form and vegetation roughness in the Dutch river Waal. Journal of Flood Risk Management 6, 302-318. DOI: 10.1111/jfr3.12014

  15. Kinetic analysis of microbial respiratory response to substrate addition

    NASA Astrophysics Data System (ADS)

    Blagodatskaya, Evgenia; Blagodatsky, Sergey; Yuyukina, Tatayna; Kuzyakov, Yakov

    2010-05-01

    The heterotrophic component of CO2 emitted from soil is mainly due to the respiratory activity of soil microorganisms. Field measurements of microbial respiration can be used for estimation of the C-budget in soil, while laboratory estimation of respiration kinetics allows the elucidation of mechanisms of soil C sequestration. Physiological approaches based on 1) time-dependent or 2) substrate-dependent respiratory response of soil microorganisms decomposing the organic substrates make it possible to relate the functional properties of the soil microbial community to decomposition rates of soil organic matter (SOM). We used a novel methodology combining (i) microbial growth kinetics and (ii) enzyme affinity to the substrate to show the shift in functional properties of the soil microbial community after amendments with substrates of contrasting availability. We combined the application of 14C labeled glucose as an easily available C source to soil with natural isotope labeling of old and young SOM. The possible contribution of two processes, isotopic fractionation and preferential substrate utilization, to the shifts in δ13C during SOM decomposition in soil after C3-C4 vegetation change was evaluated. The specific growth rate (µ) of soil microorganisms was estimated by fitting the parameters of the equation v(t) = A + B * exp(µ*t) to the measured CO2 evolution rate (v(t)) after glucose addition, where A is the initial rate of non-growth respiration and B is the initial rate of the growing fraction of total respiration. Maximal mineralization rate (Vmax), substrate affinity of microbial enzymes (Ks) and substrate availability (Sn) were determined by Michaelis-Menten kinetics. To study the effect of plant originated C on the δ13C signature of SOM we compared the changes in isotopic composition of different C pools in C3 soil under grassland with C3-C4 soil where the C4 plant Miscanthus giganteus was grown for 12 years on the plot after grassland. The shift in δ13C caused by planting of M. giganteus
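
    As an illustration of the growth equation quoted in the record above, v(t) = A + B * exp(µ*t), the following minimal Python sketch recovers µ and B from respiration-rate data. It is not the authors' fitting procedure: it assumes that A is already known (for example from a separate baseline measurement), which lets the problem be linearised as ln(v - A) = ln(B) + µ*t, and all function names and numbers are hypothetical. The basic Michaelis-Menten form v = Vmax * S / (Ks + S) is included for reference; how the abstract's Sn term enters the authors' model is not stated in the record, so it is left out.

```python
import math


def respiration_rate(t, A, B, mu):
    """CO2 evolution rate v(t) = A + B * exp(mu * t), as quoted in the abstract."""
    return A + B * math.exp(mu * t)


def fit_growth_given_A(times, rates, A):
    """Estimate (B, mu) by least squares on ln(v - A) = ln(B) + mu * t.

    Assumption for this sketch: A is known and every rate exceeds A.
    A full three-parameter nonlinear fit would be needed otherwise.
    """
    xs = list(times)
    ys = [math.log(v - A) for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    mu = sxy / sxx                      # slope = specific growth rate
    B = math.exp(mean_y - mu * mean_x)  # intercept = ln(B)
    return B, mu


def michaelis_menten(S, vmax, ks):
    """Basic Michaelis-Menten rate v = Vmax * S / (Ks + S)."""
    return vmax * S / (ks + S)


if __name__ == "__main__":
    # Synthetic, noise-free data generated from the model itself (hypothetical values).
    hours = [0, 2, 4, 6, 8]
    observed = [respiration_rate(t, A=1.0, B=0.5, mu=0.2) for t in hours]
    B_hat, mu_hat = fit_growth_given_A(hours, observed, A=1.0)
    print(f"B = {B_hat:.3f}, mu = {mu_hat:.3f} per hour")
```

    With real, noisy measurements the log transform weights small rates more heavily, so a direct nonlinear least-squares fit of all three parameters is usually the more appropriate choice.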

  16. Evolving Strategies for the Incorporation of Bioinformatics Within the Undergraduate Cell Biology Curriculum

    PubMed Central

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489

  17. Bioinformatics Analysis Reveals Distinct Molecular Characteristics of Hepatitis B-Related Hepatocellular Carcinomas from Very Early to Advanced Barcelona Clinic Liver Cancer Stages.

    PubMed

    Kong, Fan-Yun; Wei, Xiao; Zhou, Kai; Hu, Wei; Kou, Yan-Bo; You, Hong-Juan; Liu, Xiao-Mei; Zheng, Kui-Yang; Tang, Ren-Xian

    2016-01-01

    Hepatocellular carcinoma (HCC) is the fifth most common malignancy associated with high mortality. One of the risk factors for HCC is chronic hepatitis B virus (HBV) infection. The treatment strategy for the disease is dependent on the stage of HCC, and the Barcelona Clinic Liver Cancer (BCLC) staging system is used in most HCC cases. However, the molecular characteristics of HBV-related HCC in different BCLC stages are still unknown. Using GSE14520 microarray data from HBV-related HCC cases with BCLC stages from 0 (very early stage) to C (advanced stage) in the Gene Expression Omnibus (GEO) database, differentially expressed genes (DEGs), including common DEGs and unique DEGs in different BCLC stages, were identified. These DEGs were located on different chromosomes. The molecular functions and biological pathways of DEGs were identified by Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and the interactome networks of DEGs were constructed using the NetVenn online tool. The results revealed that both common DEGs and stage-specific DEGs were associated with various molecular functions and were involved in special biological pathways. In addition, several hub genes were found in the interactome networks of DEGs. The identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of HBV-related HCC through the different BCLC stages, and might be used as staging biomarkers or molecular targets for the treatment of HCC with HBV infection. PMID:27454179

  18. Bioinformatics Analysis Reveals Distinct Molecular Characteristics of Hepatitis B-Related Hepatocellular Carcinomas from Very Early to Advanced Barcelona Clinic Liver Cancer Stages

    PubMed Central

    Hu, Wei; Kou, Yan-Bo; You, Hong-Juan; Liu, Xiao-Mei; Zheng, Kui-Yang; Tang, Ren-Xian

    2016-01-01

    Hepatocellular carcinoma (HCC) is the fifth most common malignancy associated with high mortality. One of the risk factors for HCC is chronic hepatitis B virus (HBV) infection. The treatment strategy for the disease is dependent on the stage of HCC, and the Barcelona Clinic Liver Cancer (BCLC) staging system is used in most HCC cases. However, the molecular characteristics of HBV-related HCC in different BCLC stages are still unknown. Using GSE14520 microarray data from HBV-related HCC cases with BCLC stages from 0 (very early stage) to C (advanced stage) in the Gene Expression Omnibus (GEO) database, differentially expressed genes (DEGs), including common DEGs and unique DEGs in different BCLC stages, were identified. These DEGs were located on different chromosomes. The molecular functions and biological pathways of DEGs were identified by gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and the interactome networks of DEGs were constructed using the NetVenn online tool. The results revealed that both common DEGs and stage-specific DEGs were associated with various molecular functions and were involved in special biological pathways. In addition, several hub genes were found in the interactome networks of DEGs. The identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of HBV-related HCC through the different BCLC stages, and might be used as staging biomarkers or molecular targets for the treatment of HCC with HBV infection. PMID:27454179

  19. Distribution of cold adaptation proteins in microbial mats in Lake Joyce, Antarctica: Analysis of metagenomic data by using two bioinformatics tools.

    PubMed

    Koo, Hyunmin; Hakim, Joseph A; Fisher, Phillip R E; Grueneberg, Alexander; Andersen, Dale T; Bej, Asim K

    2016-01-01

    In this study, we report the distribution and abundance of cold-adaptation proteins in microbial mat communities in the perennially ice-covered Lake Joyce, located in the McMurdo Dry Valleys, Antarctica. We have used MG-RAST and R code bioinformatics tools on Illumina HiSeq2000 shotgun metagenomic data and compared the filtering efficacy of these two methods on cold-adaptation proteins. Overall, the abundance of cold-shock DEAD-box protein A (CSDA), antifreeze proteins (AFPs), fatty acid desaturase (FAD), trehalose synthase (TS), and cold-shock family of proteins (CSPs) were present in all mat samples at high, moderate, or low levels, whereas the ice nucleation protein (INP) was present only in the ice and bulbous mat samples at insignificant levels. Considering the near homogeneous temperature profile of Lake Joyce (0.08-0.29 °C), the distribution and abundance of these proteins across various mat samples predictively correlated with known functional attributes necessary for microbial communities to thrive in this ecosystem. The comparison of the MG-RAST and the R code methods showed dissimilar occurrences of the cold-adaptation protein sequences, though with insignificant ANOSIM (R = 0.357; p-value = 0.012), ADONIS (R² = 0.274; p-value = 0.03) and STAMP (p-values = 0.521-0.984) statistical analyses. Furthermore, filtering targeted sequences using the R code accounted for taxonomic groups by avoiding sequence redundancies, whereas the MG-RAST provided total counts resulting in a higher sequence output. The results from this study revealed for the first time the distribution of cold-adaptation proteins in six different types of microbial mats in Lake Joyce, while suggesting a simpler and more manageable user-defined method of R code, as compared to a web-based MG-RAST pipeline. PMID:26578243

  1. An evaluation of ontology exchange languages for bioinformatics.

    PubMed

    McEntire, R; Karp, P; Abernethy, N; Benton, D; Helt, G; DeJongh, M; Kent, R; Kosky, A; Lewis, S; Hodnett, D; Neumann, E; Olken, F; Pathak, D; Tarczy-Hornoch, P; Toldo, L; Topaloglou, T

    2000-01-01

    Ontologies are specifications of the concepts in a given field, and of the relationships among those concepts. The development of ontologies for molecular-biology information and the sharing of those ontologies within the bioinformatics community are central problems in bioinformatics. If the bioinformatics community is to share ontologies effectively, ontologies must be exchanged in a form that uses standardized syntax and semantics. This paper reports on an effort among the authors to evaluate alternative ontology-exchange languages, and to recommend one or more languages for use within the larger bioinformatics community. The study selected a set of candidate languages, and defined a set of capabilities that the ideal ontology-exchange language should satisfy. The study scored the languages according to the degree to which they satisfied each capability. In addition, the authors performed several ontology-exchange experiments with the two languages that received the highest scores: OML and Ontolingua. The result of those experiments, and the main conclusion of this study, was that the frame-based semantic model of Ontolingua is preferable to the conceptual graph model of OML, but that the XML-based syntax of OML is preferable to the Lisp-based syntax of Ontolingua. PMID:10977085

  2. Effective epitope identification employing phylogenetic, mutational variability, sequence entropy, and correlated mutation analysis targeting NS5B protein of hepatitis C virus: from bioinformatics to therapeutics.

    PubMed

    Meshram, Rohan J; Gacche, Rajesh N

    2015-08-01

    Hepatitis C virus (HCV) is considered as a foremost cause affecting numerous human liver-related disorders. An effective immuno-prophylactic measure (like stable vaccine) is still unavailable for HCV. We perform an in silico analysis of nonstructural protein 5B (NS5B) based CD4 and CD8 epitopes that might be implicated in improvement of treatment strategies for efficient vaccine development programs against HCV. Here, we report on effective utilization of knowledge obtained from multiple sequence alignment and phylogenetic analysis for investigation and evaluation of candidate epitopes that have enormous potential to be used in formulating proficient vaccine, embracing multiple strains prevalent among major geographical locations. Mutational variability data discussed herein focus on discriminating the region under active evolutionary pressure from those having lower mutational potential in existing experimentally verified epitopes, thus, providing a concrete framework for designing an effective peptide-based vaccine against HCV. Additionally, we measured entropy distribution in NS5B residues and pinpoint the positions in epitopes that are more susceptible to mutations and, thus, account for virus strategy to evade the host immune system. Findings from this study are expected to add more details on the sequence and structural aspects of NS5B protein, ultimately facilitating our understanding about the pathophysiology of HCV and assisting advance studies on the function of NS5B antigen on the epitope level. We also report on the mutational crosstalk between functionally important coevolving residues, using correlated mutation analysis, and identify networks of coupled mutations that represent pathways of allosteric communication inside and among NS5B thumb, finger, and palm domains. PMID:25727409
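
    The per-position entropy measurement mentioned in this abstract can be illustrated with a minimal sketch. It assumes Shannon entropy (in bits) computed column by column over a multiple sequence alignment, which is a common choice but is not confirmed by the abstract as the authors' exact procedure. The function name and the toy sequences are hypothetical, and gap characters are simply counted as one more symbol.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Return the Shannon entropy (bits) of each column in an alignment.

    `alignment` is a list of equal-length strings, one per aligned sequence.
    Assumption: every character, including '-', counts as a distinct symbol.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for column in zip(*alignment):  # iterate over alignment columns
        counts = Counter(column)
        total = len(column)
        # H = sum over symbols of p * log2(1 / p); 0 for a conserved column
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


# Toy alignment (hypothetical data, not NS5B sequences)
toy = ["AC", "AG", "AC", "AG"]
print(column_entropies(toy))
```

    Columns with entropy near zero are conserved; higher values mark positions that vary more across the aligned sequences.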

  3. Rapid Development of Bioinformatics Education in China

    ERIC Educational Resources Information Center

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  4. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  5. Biology in 'silico': The Bioinformatics Revolution.

    ERIC Educational Resources Information Center

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  6. A Mathematical Optimization Problem in Bioinformatics

    ERIC Educational Resources Information Center

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
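
    As an illustration of the dynamic-programming formulation described in this abstract, the following is a minimal sketch of global alignment in the Needleman-Wunsch style. The scoring values (match = 1, mismatch = -1, gap = -2) and the function name are assumptions chosen for the example and are not taken from the article.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment.

    Scoring parameters are illustrative assumptions (linear gap penalty).
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(global_alignment("ACGT", "AGT"))
```

    The table has (n+1) x (m+1) cells and each is filled in constant time, so the sketch runs in O(nm) time and space.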

  7. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  8. Analyzing gene expression profiles in dilated cardiomyopathy via bioinformatics methods

    PubMed Central

    Wang, Liming; Zhu, L.; Luan, R.; Wang, L.; Fu, J.; Wang, X.; Sui, L.

    2016-01-01

    Dilated cardiomyopathy (DCM) is characterized by ventricular dilatation, and it is a common cause of heart failure and cardiac transplantation. This study aimed to explore potential DCM-related genes and their underlying regulatory mechanism using methods of bioinformatics. The gene expression profiles of GSE3586 were downloaded from Gene Expression Omnibus database, including 15 normal samples and 13 DCM samples. The differentially expressed genes (DEGs) were identified between normal and DCM samples using Limma package in R language. Pathway enrichment analysis of DEGs was then performed. Meanwhile, the potential transcription factors (TFs) and microRNAs (miRNAs) of these DEGs were predicted based on their binding sequences. In addition, DEGs were mapped to the cMap database to find the potential small molecule drugs. A total of 4777 genes were identified as DEGs by comparing gene expression profiles between DCM and control samples. DEGs were significantly enriched in 26 pathways, such as lymphocyte TarBase pathway and androgen receptor signaling pathway. Furthermore, potential TFs (SP1, LEF1, and NFAT) were identified, as well as potential miRNAs (miR-9, miR-200 family, and miR-30 family). Additionally, small molecules like isoflupredone and trihexyphenidyl were found to be potential therapeutic drugs for DCM. The identified DEGs (PRSS12 and FOXG1), potential TFs, as well as potential miRNAs, might be involved in DCM. PMID:27737314

  9. The 2016 Bioinformatics Open Source Conference (BOSC)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science. PMID:27781083

  10. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    SciTech Connect

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    courses or independent research projects requires infrastructure for organizing and assessing student work. Here, we present a new platform for faculty to keep current with the rapidly changing field of bioinformatics, the Integrated Microbial Genomes Annotation Collaboration Toolkit (IMG-ACT). It was developed by instructors from both research-intensive and predominately undergraduate institutions in collaboration with the Department of Energy-Joint Genome Institute (DOE-JGI) as a means to innovate and update undergraduate education and faculty development. The IMG-ACT program provides a cadre of tools, including access to a clearinghouse of genome sequences, bioinformatics databases, data storage, instructor course management, and student notebooks for organizing the results of their bioinformatic investigations. In the process, IMG-ACT makes it feasible to provide undergraduate research opportunities to a greater number and diversity of students, in contrast to the traditional mentor-to-student apprenticeship model for undergraduate research, which can be too expensive and time-consuming to provide for every undergraduate. The IMG-ACT serves as the hub for the network of faculty and students that use the system for microbial genome analysis. Open access of the IMG-ACT infrastructure to participating schools ensures that all types of higher education institutions can utilize it. With the infrastructure in place, faculty can focus their efforts on the pedagogy of bioinformatics, involvement of students in research, and use of this tool for their own research agenda. What the original faculty members of the IMG-ACT development team present here is an overview of how the IMG-ACT program has affected our development in terms of teaching and research with the hopes that it will inspire more faculty to get involved.

  11. Computational and Bioinformatics Frameworks for Next-Generation Whole Exome and Genome Sequencing

    PubMed Central

    Dolled-Filhart, Marisa P.; Lee, Michael; Ou-yang, Chih-wen; Haraksingh, Rajini Rani; Lin, Jimmy Cheng-Ho

    2013-01-01

    It has become increasingly apparent that one of the major hurdles in the genomic age will be the bioinformatics challenges of next-generation sequencing. We provide an overview of a general framework of bioinformatics analysis. For each of the three stages of (1) alignment, (2) variant calling, and (3) filtering and annotation, we describe the analysis required and survey the different software packages that are used. Furthermore, we discuss possible future developments as data sources grow and highlight opportunities for new bioinformatics tools to be developed. PMID:23365548

  12. Bioinformatics Education—Perspectives and Challenges out of Africa

    PubMed Central

    Adebiyi, Ezekiel F.; Alzohairy, Ahmed M.; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J.; Panji, Sumir; Patterton, Hugh-G.

    2015-01-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  13. Broad issues to consider for library involvement in bioinformatics

    PubMed Central

    Geer, Renata C.

    2006-01-01

    Background: The information landscape in biological and medical research has grown far beyond literature to include a wide variety of databases generated by research fields such as molecular biology and genomics. The traditional role of libraries to collect, organize, and provide access to information can expand naturally to encompass these new data domains. Methods: This paper discusses the current and potential role of libraries in bioinformatics using empirical evidence and experience from eleven years of work in user services at the National Center for Biotechnology Information. Findings: Medical and science libraries over the last decade have begun to establish educational and support programs to address the challenges users face in the effective and efficient use of a plethora of molecular biology databases and retrieval and analysis tools. As more libraries begin to establish a role in this area, the issues they face include assessment of user needs and skills, identification of existing services, development of plans for new services, recruitment and training of specialized staff, and establishment of collaborations with bioinformatics centers at their institutions. Conclusions: Increasing library involvement in bioinformatics can help address information needs of a broad range of students, researchers, and clinicians and ultimately help realize the power of bioinformatics resources in making new biological discoveries. PMID:16888662

  14. Robust enzyme design: bioinformatic tools for improved protein stability.

    PubMed

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.

  15. Bioinformatic Analysis of Chlamydia trachomatis Polymorphic Membrane Proteins PmpE, PmpF, PmpG and PmpH as Potential Vaccine Antigens

    PubMed Central

    Nunes, Alexandra; Gomes, João P.; Karunakaran, Karuna P.; Brunham, Robert C.

    2015-01-01

    Chlamydia trachomatis is the most important infectious cause of infertility in women with important implications in public health and for which a vaccine is urgently needed. Recent immunoproteomic vaccine studies found that four polymorphic membrane proteins (PmpE, PmpF, PmpG and PmpH) are immunodominant, recognized by various MHC class II haplotypes and protective in mouse models. In the present study, we aimed to evaluate genetic and protein features of Pmps (focusing on the N-terminal 600 amino acids where MHC class II epitopes were mapped) in order to understand antigen variation that may emerge following vaccine induced immune selection. We used several bioinformatics platforms to study: i) Pmps’ phylogeny and genetic polymorphism; ii) the location and distribution of protein features (GGA(I, L)/FxxN motifs and cysteine residues) that may impact pathogen-host interactions and protein conformation; and iii) the existence of phase variation mechanisms that may impact Pmps’ expression. We used a well-characterized collection of 53 fully-sequenced strains that represent the C. trachomatis serovars associated with the three disease groups: ocular (N=8), epithelial-genital (N=25) and lymphogranuloma venereum (LGV) (N=20). We observed that PmpF and PmpE are highly polymorphic between LGV and epithelial-genital strains, and also within populations of the latter. We also found heterogeneous representation among strains for GGA(I, L)/FxxN motifs and cysteine residues, suggesting possible alterations in adhesion properties, tissue specificity and immunogenicity. PmpG and, to a lesser extent, PmpH revealed low polymorphism and high conservation of protein features among the genital strains (including the LGV group). Uniquely among the four Pmps, pmpG has regulatory sequences suggestive of phase variation. In aggregate, the results suggest that PmpG may be the lead vaccine candidate because of sequence conservation but may need to be paired with another protective

  16. Structural Bioinformatics and Protein Docking Analysis of the Molecular Chaperone-Kinase Interactions: Towards Allosteric Inhibition of Protein Kinases by Targeting the Hsp90-Cdc37 Chaperone Machinery

    PubMed Central

    Lawless, Nathan; Blacklock, Kristin; Berrigan, Elizabeth; Verkhivker, Gennady

    2013-01-01

    A fundamental role of the Hsp90-Cdc37 chaperone system in mediating maturation of protein kinase clients and supporting kinase functional activity is essential for the integrity and viability of signaling pathways involved in cell cycle control and organism development. Despite significant advances in understanding structure and function of molecular chaperones, the molecular mechanisms and guiding principles of kinase recruitment to the chaperone system are lacking quantitative characterization. Structural and thermodynamic characterization of Hsp90-Cdc37 binding with protein kinase clients by modern experimental techniques is highly challenging, owing to a transient nature of chaperone-mediated interactions. In this work, we used experimentally-guided protein docking to probe the allosteric nature of the Hsp90-Cdc37 binding with the cyclin-dependent kinase 4 (Cdk4) kinase clients. The results of docking simulations suggest that the kinase recognition and recruitment to the chaperone system may be primarily determined by Cdc37 targeting of the N-terminal kinase lobe. The interactions of Hsp90 with the C-terminal kinase lobe may provide additional “molecular brakes” that can lock (or unlock) kinase from the system during client loading (release) stages. The results of this study support a central role of the Cdc37 chaperone in recognition and recruitment of the kinase clients. Structural analysis may have useful implications in developing strategies for allosteric inhibition of protein kinases by targeting the Hsp90-Cdc37 chaperone machinery. PMID:24287464

  17. NGS for the Masses: Empowering Biologists to Improve Bioinformatics Productivity (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Qaadri, Kashef [Biomatters]

    2016-07-12

    Kashef Qaadri on "NGS for the Masses: Empowering biologists to improve bioinformatic productivity" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  18. NGS for the Masses: Empowering Biologists to Improve Bioinformatics Productivity (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    Qaadri, Kashef

    2012-06-01

    Kashef Qaadri on "NGS for the Masses: Empowering biologists to improve bioinformatic productivity" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  19. Bioinformatics in the secondary science classroom: A study of state content standards and students' perceptions of, and performance in, bioinformatics lessons

    NASA Astrophysics Data System (ADS)

    Wefer, Stephen H.

    The proliferation of bioinformatics in modern biology marks a new revolution in science, which promises to influence science education at all levels. This thesis examined state standards for content that articulated bioinformatics, and explored secondary students' affective and cognitive perceptions of, and performance in, a bioinformatics mini-unit. The results are presented as three studies. The first study analyzed the secondary science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics at the introductory high school biology level. The bioinformatics content of each state's biology standards was categorized into nine areas and the prevalence of each area documented. The nine areas were: The Human Genome Project, Forensics, Evolution, Classification, Nucleotide Variations, Medicine, Computer Use, Agriculture/Food Technology, and Science Technology and Society/Socioscientific Issues (STS/SSI). Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas. Recommendations are made for reworking existing standards to incorporate bioinformatics and to facilitate the goal of promoting science literacy in this emerging new field among secondary school students. The second study examined thirty-two students' affective responses to, and content mastery of, a two-week bioinformatics mini-unit. The findings indicate that the students generally were positive relative to their interest level, the usefulness of the lessons, the difficulty level of the lessons, likeliness to engage in additional bioinformatics, and were overall successful on the assessments. A discussion of the results and significance is followed by suggestions for future research and implementation for transferability. The third study presents a case study of individual differences among ten secondary school students, whose cognitive and affective percepts were

  20. Computational biology of genome expression and regulation--a review of microarray bioinformatics.

    PubMed

    Wang, Junbai

    2008-01-01

    Microarray technology is widely used in various biomedical research areas; the corresponding microarray data analysis is an essential step toward making the best use of array technologies. Here we review two components of microarray data analysis: low-level analysis, which emphasizes the design, quality control, and preprocessing of microarray experiments, and high-level analysis, which focuses on domain-specific microarray applications such as tumor classification, biomarker prediction, analysis of array CGH experiments, and reverse engineering of gene expression networks. Additionally, we review recent developments in building predictive models in genome expression and regulation studies. This review may help biologists grasp a basic knowledge of microarray bioinformatics as well as its potential impact on the future evolution of biomedical research fields.

  1. Computational biology and bioinformatics in Nigeria.

    PubMed

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  2. Computational Biology and Bioinformatics in Nigeria

    PubMed Central

    Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-01-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310

  3. Large-scale bioinformatic analysis of the regulation of the disease resistance NBS gene family by microRNAs in Poaceae.

    PubMed

    Habachi-Houimli, Yosra; Khalfallah, Yosra; Makni, Hanem; Makni, Mohamed; Bouktila, Dhia

    2016-01-01

    In the present study, we have screened 71, 713, 525, 119 and 241 mature miRNA variants from Hordeum vulgare, Oryza sativa, Brachypodium distachyon, Triticum aestivum, and Sorghum bicolor, respectively, and classified them with respect to their conservation status and expression levels. These Poaceae non-redundant miRNA species (1,669) were distributed over a total of 625 MIR families, among which only 54 were conserved across two or more plant species, confirming the relatively recent evolutionary differentiation of miRNAs in grasses. On the other hand, we have used 257 H. vulgare, 286 T. aestivum, 119 B. distachyon, 269 O. sativa, and 139 S. bicolor NBS domains, which were either mined directly from the annotated proteomes, or predicted from whole genome sequence assemblies. The hybridization potential between miRNAs and their putative NBS gene targets was analyzed, revealing that at least 454 NBS genes from all five Poaceae were potentially regulated by 265 distinct miRNA species, most of them expressed in leaves and predominantly co-expressed in additional tissues. Based on gene ontology, we could assign these probable miRNA target genes to 16 functional groups, among which three conferring resistance to bacteria (Rpm1, Xa1 and Rps2), and 13 groups of resistance to fungi (Rpp8,13, Rp3, Tsn1, Lr10, Rps1-k-1, Pm3, Rpg5, and MLA1,6,10,12,13). The results of the present analysis provide a large-scale platform for a better understanding of biological control strategies of disease resistance genes in Poaceae, and will serve as an important starting point for enhancing crop disease resistance improvement by means of transgenic lines with artificial miRNAs.

  4. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  5. When cloud computing meets bioinformatics: a review.

    PubMed

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
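
    The MapReduce pattern named in this abstract can be illustrated with a minimal single-process sketch in Python. It counts k-mers across sequencing reads: a map step emits (k-mer, 1) pairs for each read, a shuffle step groups the pairs by key, and a reduce step sums each group. The function names (`map_kmers`, `shuffle`, `reduce_counts`, `count_kmers`) and the toy reads are illustrative assumptions and are not taken from the reviewed paper; a real deployment would distribute these steps with a framework rather than run them in one process.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group the emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence over all reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGT"]  # hypothetical reads, for illustration only
    print(count_kmers(toy_reads, 3))
```

    Because each read is mapped independently and each key is reduced independently, both steps could in principle be spread over many workers, which is the property that makes the pattern attractive for large sequence collections.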

  6. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  7. Data Mining for Grammatical Inference with Bioinformatics Criteria

    NASA Astrophysics Data System (ADS)

    López, Vivian F.; Aguilar, Ramiro; Alonso, Luis; Moreno, María N.; Corchado, Juan M.

    In this paper we describe both theoretical and practical results of a novel data mining process that combines hybrid techniques of association analysis and classical sequentiation algorithms of genomics to generate grammatical structures of a specific language. We used an application of a compiler generator system that allows the development of a practical application within the area of grammarware, where the concepts of language analysis are applied to other disciplines, such as bioinformatics. The tool allows the complexity of the obtained grammar to be measured automatically from textual data. A technique of incremental discovery of sequential patterns is presented to obtain simplified production rules, which are compacted with bioinformatics criteria to make up a grammar.

  8. Bioinformatics tools for small genomes, such as hepatitis B virus.

    PubMed

    Bell, Trevor G; Kramvis, Anna

    2015-02-01

    DNA sequence analysis is undertaken in many biological research laboratories. The workflow consists of several steps involving the bioinformatic processing of biological data. We have developed a suite of web-based online bioinformatic tools to assist with processing, analysis and curation of DNA sequence data. Most of these tools are genome-agnostic, with two tools specifically designed for hepatitis B virus sequence data. Tools in the suite are able to process sequence data from Sanger sequencing, ultra-deep amplicon resequencing (pyrosequencing) and chromatograph (trace files), as appropriate. The tools are available online at no cost and are aimed at researchers without specialist technical computer knowledge. The tools can be accessed at http://hvdr.bioinf.wits.ac.za/SmallGenomeTools, and the source code is available online at https://github.com/DrTrevorBell/SmallGenomeTools. PMID:25690798

  9. Bioinformatic challenges in targeted proteomics.

    PubMed

    Reker, Daniel; Malmström, Lars

    2012-09-01

    Selected reaction monitoring mass spectrometry is an emerging targeted proteomics technology that allows for the investigation of complex protein samples with high sensitivity and efficiency. It requires extensive knowledge about the sample for the many parameters needed to carry out the experiment to be set appropriately. Most studies today rely on parameter estimation from prior studies, public databases, or from measuring synthetic peptides. This is efficient and sound, but in absence of prior data, de novo parameter estimation is necessary. Computational methods can be used to create an automated framework to address this problem. However, the number of available applications is still small. This review aims at giving an orientation on the various bioinformatical challenges. To this end, we state the problems in classical machine learning and data mining terms, give examples of implemented solutions and provide some room for alternatives. This will hopefully lead to an increased momentum for the development of algorithms and serve the needs of the community for computational methods. We note that the combination of such methods in an assisted workflow will ease both the usage of targeted proteomics in experimental studies as well as the further development of computational approaches. PMID:22866949

  10. Evolution in bioinformatic resources: 2009 update on the Bioinformatics Links Directory.

    PubMed

    Brazas, Michelle D; Yamada, Joseph Tadashi; Ouellette, B F Francis

    2009-07-01

    All of the life science research web servers published in this and previous issues of Nucleic Acids Research, together with other useful tools, databases and resources for bioinformatics and molecular biology research are freely accessible online through the Bioinformatics Links Directory, http://bioinformatics.ca/links_directory/. Entirely dependent on user feedback and community input, the Bioinformatics Links Directory exemplifies an open access research tool and resource. With 112 websites featured in the July 2009 Web Server Issue of Nucleic Acids Research, the 2009 update brings the total number of servers listed in the Bioinformatics Links Directory close to an impressive 1400 links. A complete list of all links listed in this Nucleic Acids Research 2009 Web Server Issue can be accessed online at http://bioinformatics.ca/links_directory/narweb2009/. The 2009 update of the Bioinformatics Links Directory, which includes the Web Server list and summaries, is also available online at the Nucleic Acids Research website, http://nar.oxfordjournals.org/.

  11. Dried blood spot analysis of creatinine with LC-MS/MS in addition to immunosuppressants analysis.

    PubMed

    Koster, Remco A; Greijdanus, Ben; Alffenaar, Jan-Willem C; Touw, Daan J

    2015-02-01

    In order to monitor creatinine levels or to adjust the dosage of renally excreted or nephrotoxic drugs, the analysis of creatinine in dried blood spots (DBS) could be a useful addition to DBS analysis. We developed an LC-MS/MS method for the analysis of creatinine in the same DBS extract that was used for the analysis of tacrolimus, sirolimus, everolimus, and cyclosporine A in transplant patients with the use of Whatman FTA DMPK-C cards. The method was validated using three different strategies: a seven-point calibration curve using the intercept of the calibration to correct for the natural presence of creatinine in reference samples, a one-point calibration curve at an extremely high concentration in order to diminish the contribution of the natural presence of creatinine, and the use of creatinine-[(2)H3] with an eight-point calibration curve. The validated range for creatinine was 120 to 480 μmol/L (seven-point calibration curve), 116 to 7000 μmol/L (one-point calibration curve), and 1.00 to 400.0 μmol/L for creatinine-[(2)H3] (eight-point calibration curve). The precision and accuracy results for all three validations showed a maximum CV of 14.0% and a maximum bias of -5.9%. Creatinine in DBS was found stable at ambient temperature and 32 °C for 1 week and at -20 °C for 29 weeks. Good correlations were observed between patient DBS samples and routine enzymatic plasma analysis and showed the capability of the DBS method to be used as an alternative for creatinine plasma measurement.

  12. Bioinformatics in Italy: BITS2011, the Eighth Annual Meeting of the Italian Society of Bioinformatics

    PubMed Central

    2012-01-01

    The BITS2011 meeting, held in Pisa on June 20-22, 2011, brought together more than 120 Italian researchers working in the field of Bioinformatics, as well as students in Bioinformatics, Computational Biology, Biology, Computer Sciences, and Engineering, representing a landscape of Italian bioinformatics research. This preface provides a brief overview of the meeting and introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement. PMID:22536954

  13. Managing, Analysing, and Integrating Big Data in Medical Bioinformatics: Open Problems and Future Perspectives

    PubMed Central

    Merelli, Ivan; Pérez-Sánchez, Horacio; Gesing, Sandra; D'Agostino, Daniele

    2014-01-01

    The explosion of the data both in the biomedical research and in the healthcare systems demands urgent solutions. In particular, the research in omics sciences is moving from a hypothesis-driven to a data-driven approach. Healthcare is additionally always asking for a tighter integration with biomedical data in order to promote personalized medicine and to provide better treatments. Efficient analysis and interpretation of Big Data opens new avenues to explore molecular biology, new questions to ask about physiological and pathological states, and new ways to answer these open issues. Such analyses lead to better understanding of diseases and development of better and personalized diagnostics and therapeutics. However, such progresses are directly related to the availability of new solutions to deal with this huge amount of information. New paradigms are needed to store and access data, for its annotation and integration and finally for inferring knowledge and making it available to researchers. Bioinformatics can be viewed as the “glue” for all these processes. A clear awareness of present high performance computing (HPC) solutions in bioinformatics, Big Data analysis paradigms for computational biology, and the issues that are still open in the biomedical and healthcare fields represent the starting point to win this challenge. PMID:25254202

  14. Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

    PubMed Central

    Roche-Lima, Abiel; Thulasiram, Ruppa K.

    2016-01-01

    Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignments of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation, which relies on techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically costly to compute, and even more so when applied to bioinformatics, because the database sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were carried out using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). The results indicate that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. In another experiment, the precision parameter was changed; in this case, we obtain smaller execution times using the parallel algorithm. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster was changed. In this last experiment, speedup increases considerably when more threads are used; however, it converges once the number of threads is equal to or greater than 16.
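
    The opening definition in this abstract (a finite automaton whose transitions carry an output label as well as an input label) can be made concrete with a small Python sketch. It is deliberately simpler than what the paper describes: the transducer here is deterministic and unweighted, whereas the abstract concerns weighted, conditional transducers learned with Expectation-Maximization. The example task (replacing the G of every CG dinucleotide with `*`) and the names `build_cpg_marker` and `transduce` are assumptions made for illustration.

```python
def build_cpg_marker():
    """Build a transition table mapping (state, input) to (next_state, output).

    Two states are used: "start", and "after_c" (the previous symbol was C).
    """
    table = {}
    for state in ("start", "after_c"):
        for symbol in "ACGT":
            next_state = "after_c" if symbol == "C" else "start"
            # Emit "*" for a G that directly follows a C; otherwise copy the input.
            output = "*" if (state == "after_c" and symbol == "G") else symbol
            table[(state, symbol)] = (next_state, output)
    return table


def transduce(table, sequence, state="start"):
    """Feed a sequence through the transducer and return the emitted string."""
    emitted = []
    for symbol in sequence:
        state, output = table[(state, symbol)]
        emitted.append(output)
    return "".join(emitted)


if __name__ == "__main__":
    print(transduce(build_cpg_marker(), "ACGCCGT"))
```

    A weighted variant would attach a probability or cost to each table entry and combine the weights along a path, which is where the normalization and parameter optimization steps mentioned in the abstract would come in.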

  15. Bioinformatics: Current practice and future challenges for life science education.

    PubMed

    Hack, Catherine; Kendall, Gary

    2005-03-01

    It is widely predicted that the application of high-throughput technologies to the quantification and identification of biological molecules will cause a paradigm shift in the life sciences. However, if the biosciences are to evolve from a predominantly descriptive discipline to an information science, practitioners will require enhanced skills in mathematics, computing, and statistical analysis. Universities have responded to the widely perceived skills gap primarily by developing masters programs in bioinformatics, resulting in a rapid expansion in the provision of postgraduate bioinformatics education. There is, however, a clear need to improve the quantitative and analytical skills of life science undergraduates. This article reviews the response of academia in the United Kingdom and proposes the learning outcomes that graduates should achieve to cope with the new biology. While the analysis discussed here uses the development of bioinformatics education in the United Kingdom as an illustrative example, it is hoped that the issues raised will resonate with all those involved in curriculum development in the life sciences.

  16. Experiences with workflows for automating data-intensive bioinformatics.

    PubMed

    Spjuth, Ola; Bongcam-Rudloff, Erik; Hernández, Guillermo Carrasco; Forer, Lukas; Giovacchini, Mario; Guimera, Roman Valls; Kallio, Aleksi; Korpelainen, Eija; Kańduła, Maciej M; Krachunov, Milko; Kreil, David P; Kulev, Ognyan; Łabaj, Paweł P; Lampa, Samuel; Pireddu, Luca; Schönherr, Sebastian; Siretskiy, Alexey; Vassilev, Dimitar

    2015-01-01

    High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution. PMID:26282399

  17. Data capture in bioinformatics: requirements and experiences with Pedro

    PubMed Central

    Jameson, Daniel; Garwood, Kevin; Garwood, Chris; Booth, Tim; Alper, Pinar; Oliver, Stephen G; Paton, Norman W

    2008-01-01

    Background The systematic capture of appropriately annotated experimental data is a prerequisite for most bioinformatics analyses. Data capture is required not only for submission of data to public repositories, but also to underpin integrated analysis, archiving, and sharing – both within laboratories and in collaborative projects. The widespread requirement to capture data means that data capture and annotation are taking place at many sites, but the small scale of the literature on tools, techniques and experiences suggests that there is work to be done to identify good practice and reduce duplication of effort. Results This paper reports on experience gained in the deployment of the Pedro data capture tool in a range of representative bioinformatics applications. The paper makes explicit the requirements that have recurred when capturing data in different contexts, indicates how these requirements are addressed in Pedro, and describes case studies that illustrate where the requirements have arisen in practice. Conclusion Data capture is a fundamental activity for bioinformatics; all biological data resources build on some form of data capture activity, and many require a blend of import, analysis and annotation. Recurring requirements in data capture suggest that model-driven architectures can be used to construct data capture infrastructures that can be rapidly configured to meet the needs of individual use cases. We have described how one such model-driven infrastructure, namely Pedro, has been deployed in representative case studies, and discussed the extent to which the model-driven approach has been effective in practice. PMID:18402673

  18. Identifying human MHC supertypes using bioinformatic methods.

    PubMed

    Doytchinova, Irini A; Guan, Pingping; Flower, Darren R

    2004-04-01

    Classification of MHC molecules into supertypes in terms of peptide-binding specificities is an important issue, with direct implications for the development of epitope-based vaccines with wide population coverage. In view of extremely high MHC polymorphism (948 class I and 633 class II HLA alleles) the experimental solution of this task is presently impossible. In this study, we describe a bioinformatics strategy for classifying MHC molecules into supertypes using information drawn solely from three-dimensional protein structure. Two chemometric techniques, hierarchical clustering and principal component analysis, were used independently on a set of 783 HLA class I molecules to identify supertypes based on structural similarities and molecular interaction fields calculated for the peptide binding site. Eight supertypes were defined: A2, A3, A24, B7, B27, B44, C1, and C4. The two techniques gave 77% consensus, i.e., 605 HLA class I alleles were classified in the same supertype by both methods. The proposed strategy allowed "supertype fingerprints" to be identified. Thus, the A2 supertype fingerprint is Tyr(9)/Phe(9), Arg(97), and His(114) or Tyr(116); the A3: Tyr(9)/Phe(9)/Ser(9), Ile(97)/Met(97) and Glu(114) or Asp(116); the A24: Ser(9) and Met(97); the B7: Asn(63) and Leu(81); the B27: Glu(63) and Leu(81); the B44: Ala(81); the C1: Ser(77); and the C4: Asn(77). PMID:15034046
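
    The consensus figure in this abstract is the share of alleles that both methods place in the same supertype: 605 of 783 is about 77%. A minimal Python sketch of that calculation follows. It assumes each method's result has already been expressed as a mapping from allele name to supertype label; the function name `consensus` and the three toy allele assignments are hypothetical and are not data from the study.

```python
def consensus(labels_a, labels_b):
    """Return (count, fraction) of items given the same label by both methods."""
    if not labels_a or labels_a.keys() != labels_b.keys():
        raise ValueError("both methods must label the same non-empty set of items")
    agree = sum(1 for item in labels_a if labels_a[item] == labels_b[item])
    return agree, agree / len(labels_a)


if __name__ == "__main__":
    # Hypothetical assignments, for illustration only.
    clustering = {"allele_1": "A2", "allele_2": "A3", "allele_3": "B7"}
    pca = {"allele_1": "A2", "allele_2": "A24", "allele_3": "B7"}
    print(consensus(clustering, pca))

    # The proportion reported in the abstract: 605 of 783 alleles.
    print(round(100 * 605 / 783))
```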

  19. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    PubMed

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progress made by Chinese bioinformaticians in the field of PTM bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. By comparing similar studies in China and other countries, we demonstrate both advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China.

  20. Quantum Bio-Informatics IV

    NASA Astrophysics Data System (ADS)

    Accardi, Luigi; Freudenberg, Wolfgang; Ohya, Masanori

    2011-01-01

    The QP-DYN algorithms / L. Accardi, M. Regoli and M. Ohya -- Study of transcriptional regulatory network based on Cis module database / S. Akasaka ... [et al.] -- On Lie group-Lie algebra correspondences of unitary groups in finite von Neumann algebras / H. Ando, I. Ojima and Y. Matsuzawa -- On a general form of time operators of a Hamiltonian with purely discrete spectrum / A. Arai -- Quantum uncertainty and decision-making in game theory / M. Asano ... [et al.] -- New types of quantum entropies and additive information capacities / V. P. Belavkin -- Non-Markovian dynamics of quantum systems / D. Chruscinski and A. Kossakowski -- Self-collapses of quantum systems and brain activities / K.-H. Fichtner ... [et al.] -- Statistical analysis of random number generators / L. Accardi and M. Gabler -- Entangled effects of two consecutive pairs in residues and its use in alignment / T. Ham, K. Sato and M. Ohya -- The passage from digital to analogue in white noise analysis and applications / T. Hida -- Remarks on the degree of entanglement / D. Chruscinski ... [et al.] -- A completely discrete particle model derived from a stochastic partial differential equation by point systems / K.-H. Fichtner, K. Inoue and M. Ohya -- On quantum algorithm for exptime problem / S. Iriyama and M. Ohya -- On sufficient algebraic conditions for identification of quantum states / A. Jamiolkowski -- Concurrence and its estimations by entanglement witnesses / J. Jurkowski -- Classical wave model of quantum-like processing in brain / A. Khrennikov -- Entanglement mapping vs. quantum conditional probability operator / D. Chruscinski ... [et al.] -- Constructing multipartite entanglement witnesses / M. Michalski -- On Kadison-Schwarz property of quantum quadratic operators on M[symbol](C) / F. Mukhamedov and A. Abduganiev -- On phase transitions in quantum Markov chains on Cayley Tree / L. Accardi, F. Mukhamedov and M. Saburov -- Space(-time) emergence as symmetry breaking effect / I. Ojima

  1. GITIRBio: A Semantic and Distributed Service Oriented-Architecture for Bioinformatics Pipeline.

    PubMed

    Castillo, Luis F; López-Gartner, Germán; Isaza, Gustavo A; Sánchez, Mariana; Arango, Jeferson; Agudelo-Valencia, Daniel; Castaño, Sergio

    2015-05-20

    The need to process large quantities of data generated from genomic sequencing has resulted in a difficult task for life scientists who are not familiar with the use of command-line operations or developments in high performance computing and parallelization. This knowledge gap, along with unfamiliarity with necessary processes, can hinder the execution of data processing tasks. Furthermore, many of the commonly used bioinformatics tools for the scientific community are presented as isolated, unrelated entities that do not provide an integrated, guided, and assisted interaction with the scheduling facilities of computational resources or distribution, processing and mapping with runtime analysis. This paper presents the first approximation of a Web Services platform-based architecture (GITIRBio) that acts as a distributed front-end system for autonomous and assisted processing of parallel bioinformatics pipelines that has been validated using multiple sequences. Additionally, this platform allows integration with semantic repositories of genes for search annotations. GITIRBio is available at: http://c-head.ucaldas.edu.co:8080/gitirbio.

  2. Wnt-signalling pathways and microRNAs network in carcinogenesis: experimental and bioinformatics approaches.

    PubMed

    Onyido, Emenike K; Sweeney, Eloise; Nateri, Abdolrahman Shams

    2016-01-01

    Over the past few years, microRNAs (miRNAs) have not only emerged as integral regulators of gene expression at the post-transcriptional level but have also been shown to respond to signalling molecules to affect cell function(s). miRNAs crosstalk with a variety of key cellular signalling networks, such as Wnt, transforming growth factor-β and Notch, and control stem cell activity in maintaining tissue homeostasis, while their dysregulation contributes to the initiation and progression of cancer. Herein, we overview the molecular mechanism(s) underlying the crosstalk between Wnt-signalling components (canonical and non-canonical) and miRNAs, as well as changes in the miRNA/Wnt-signalling components observed in the different forms of cancer. Furthermore, the fundamental understanding of miRNA-mediated regulation of the Wnt-signalling pathway and vice versa has been significantly improved by high-throughput genomics and bioinformatics technologies. Whilst these approaches have identified a number of specific miRNA(s) that function as oncogenes or tumour suppressors, additional analyses will be necessary to fully unravel the links among conserved cellular signalling pathways and miRNAs and their potential associated components in cancer, thereby creating therapeutic avenues against tumours. Hence, we also discuss the current challenges associated with the Wnt-signalling/miRNA complex and its analysis using biomedical experimental and bioinformatics approaches. PMID:27590724

  3. Analysis methods for the determination of anthropogenic additions of P to agricultural soils

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Phosphorus additions and measurement in soil is of concern on lands where biosolids have been applied. Colorimetric analysis for plant-available P may be inadequate for the accurate assessment of soil P. Phosphate additions in a regulatory environment need to be accurately assessed as the reported...

  4. Identification of microRNA-regulated pathways using an integration of microRNA-mRNA microarray and bioinformatics analysis in CD34+ cells of myelodysplastic syndromes.

    PubMed

    Xu, Feng; Zhu, Yang; He, Qi; Wu, Ling-Yun; Zhang, Zheng; Shi, Wen-Hui; Liu, Li; Chang, Chun-Kang; Li, Xiao

    2016-01-01

    The effect of microRNA (miRNA) and targeted mRNA on signal transduction is not fully understood in myelodysplastic syndromes (MDS). Here, we sought to identify miRNA-regulated pathways by combining miRNA and mRNA microarrays in CD34+ cells from MDS patients. We identified 34 differentially expressed miRNAs and 1783 mRNAs in MDS. Twenty-five dysregulated miRNAs and 394 targeted mRNAs were screened by a combination of Pearson's correlation analysis and software prediction. Pathway analysis showed that several pathways, such as Notch and PI3K/Akt, might be regulated by those miRNA-mRNA pairs. Through a combination of Pathway and miRNA-Gene or GO-Network analysis, miRNA-regulated pathways, such as the miR-195-5p/DLL1/Notch signaling pathway, were identified. Further qRT-PCR showed that miR-195-5p was up-regulated while DLL1 was down-regulated in patients with low-grade MDS compared with normal controls. Luciferase assay showed that DLL1 was a direct target of miR-195-5p. Overexpression of miR-195-5p led to increased cell apoptosis and reduced cell growth through inhibition of the Notch signaling pathway. In conclusion, altered expression of miRNAs and targeted mRNAs might have an important impact on cancer-related cellular pathways in MDS. Inhibition of the Notch signaling pathway by the miR-195-5p-DLL1 axis contributes to the excess apoptosis in low-grade MDS. PMID:27571714
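
The screening step mentioned in this abstract (Pearson's correlation combined with software target prediction) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the expression values, the names `miR-X`, `GENE_A` and `GENE_B`, the function `screen_pairs`, and the cutoff of r <= -0.5 are all hypothetical, and keeping only inversely correlated pairs is an assumption made here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def screen_pairs(mirna_expr, mrna_expr, predicted_targets, r_cutoff=-0.5):
    """Keep predicted miRNA-mRNA pairs whose expression is inversely correlated.

    r_cutoff is an illustrative threshold, not a value taken from the abstract.
    """
    kept = []
    for mirna, genes in predicted_targets.items():
        for gene in genes:
            if mirna in mirna_expr and gene in mrna_expr:
                r = pearson(mirna_expr[mirna], mrna_expr[gene])
                if r <= r_cutoff:
                    kept.append((mirna, gene, r))
    return kept


# Hypothetical toy data: one value per sample, same sample order in both tables.
mirna_expr = {"miR-X": [1.0, 2.0, 3.0, 4.0, 5.0]}
mrna_expr = {
    "GENE_A": [10.0, 8.0, 6.0, 4.0, 2.0],  # falls as miR-X rises
    "GENE_B": [1.0, 2.0, 3.0, 4.0, 5.0],   # rises with miR-X
}
predicted_targets = {"miR-X": ["GENE_A", "GENE_B"]}  # stand-in for prediction software

pairs = screen_pairs(mirna_expr, mrna_expr, predicted_targets)
```

With these toy inputs only the `miR-X`/`GENE_A` pair survives, because `GENE_B` is positively correlated with the miRNA.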

  5. Identification of microRNA-regulated pathways using an integration of microRNA-mRNA microarray and bioinformatics analysis in CD34+ cells of myelodysplastic syndromes

    PubMed Central

    Xu, Feng; Zhu, Yang; He, Qi; Wu, Ling-Yun; Zhang, Zheng; Shi, Wen-Hui; Liu, Li; Chang, Chun-Kang; Li, Xiao

    2016-01-01

    The effect of microRNA (miRNA) and targeted mRNA on signal transduction is not fully understood in myelodysplastic syndromes (MDS). Here, we sought to identify miRNA-regulated pathways by combining miRNA and mRNA microarrays in CD34+ cells from MDS patients. We identified 34 differentially expressed miRNAs and 1783 mRNAs in MDS. Twenty-five dysregulated miRNAs and 394 targeted mRNAs were screened by a combination of Pearson’s correlation analysis and software prediction. Pathway analysis showed that several pathways, such as Notch and PI3K/Akt, might be regulated by those miRNA-mRNA pairs. Through a combination of Pathway and miRNA-Gene or GO-Network analysis, miRNA-regulated pathways, such as the miR-195-5p/DLL1/Notch signaling pathway, were identified. Further qRT-PCR showed that miR-195-5p was up-regulated while DLL1 was down-regulated in patients with low-grade MDS compared with normal controls. Luciferase assay showed that DLL1 was a direct target of miR-195-5p. Overexpression of miR-195-5p led to increased cell apoptosis and reduced cell growth through inhibition of the Notch signaling pathway. In conclusion, altered expression of miRNAs and targeted mRNAs might have an important impact on cancer-related cellular pathways in MDS. Inhibition of the Notch signaling pathway by the miR-195-5p-DLL1 axis contributes to the excess apoptosis in low-grade MDS. PMID:27571714

  6. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    PubMed Central

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  7. Best practices in bioinformatics training for life scientists.

    PubMed

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  9. Best practices in bioinformatics training for life scientists

    PubMed Central

    Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D.; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L.; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C.; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K.

    2013-01-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  10. A Guide to Bioinformatics for Immunologists

    PubMed Central

    Whelan, Fiona J.; Yap, Nicholas V. L.; Surette, Michael G.; Golding, G. Brian; Bowdish, Dawn M. E.

    2013-01-01

    Bioinformatics includes a suite of methods that are cheap and approachable, many of which are easily accessible without any specialized bioinformatic training. Yet, despite this, bioinformatic tools are under-utilized by immunologists. Herein, we review a representative set of publicly available, easy-to-use bioinformatic tools using our own research on an under-annotated human gene, SCARA3, as an example. SCARA3 shares an evolutionary relationship with the class A scavenger receptors, but preliminary research showed that it was divergent enough that its function remained unclear. In our quest for more information about this gene – did it share gene sequence similarities with other scavenger receptors? Did it contain conserved protein domains? Where was it expressed in the human body? – we discovered the power and informative potential of publicly available bioinformatic tools designed with the novice in mind, which allowed us to hypothesize on the regulation, structure, and function of this protein. We argue that these tools are largely applicable to many facets of immunology research. PMID:24363654

  11. Isolation, characterization, and bioinformatic analysis of calmodulin-binding protein cmbB reveals a novel tandem IP22 repeat common to many Dictyostelium and Mimivirus proteins.

    PubMed

    O'Day, Danton H; Suhre, Karsten; Myre, Michael A; Chatterjee-Chakraborty, Munmun; Chavez, Sara E

    2006-08-01

    A novel calmodulin-binding protein cmbB from Dictyostelium discoideum is encoded in a single gene. Northern analysis reveals two cmbB transcripts first detectable at 4 h during multicellular development. Western blotting detects an approximately 46.6 kDa protein. Sequence analysis and calmodulin-agarose binding studies identified a "classic" calcium-dependent calmodulin-binding domain (179IPKSLRSLFLGKGYNQPLEF198) but structural analyses suggest binding may not involve classic alpha-helical calmodulin-binding. The cmbB protein is comprised of tandem repeats of a newly identified IP22 motif ([I,L]Pxxhxxhxhxxxhxxxhxxxx; where h = any hydrophobic amino acid) that is highly conserved and a more precise representation of the FNIP repeat. At least eight Acanthamoeba polyphaga Mimivirus proteins and over 100 Dictyostelium proteins contain tandem arrays of the IP22 motif and its variants. cmbB also shares structural homology to YopM, from the plague bacterium Yersinia pestis. PMID:16777069
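
The IP22 consensus quoted in this abstract maps naturally onto a regular expression. The sketch below is only an illustration of how such a motif scan could be written: the choice of residues treated as hydrophobic (`HYDROPHOBIC`), the helper names, and the synthetic test sequence are assumptions, not details from the paper.

```python
import re

# Assumption: the abstract does not say which residues count as hydrophobic
# ("h"); this set is an illustrative choice and can be changed.
HYDROPHOBIC = "AVILMFWC"

# Motif as quoted in the abstract, with "[I,L]" written as the regex class [IL].
IP22_MOTIF = "[IL]Pxxhxxhxhxxxhxxxhxxxx"


def motif_to_regex(motif: str) -> str:
    """Translate the consensus notation (x = any residue, h = hydrophobic)."""
    parts = []
    for ch in motif:
        if ch == "x":
            parts.append(".")
        elif ch == "h":
            parts.append(f"[{HYDROPHOBIC}]")
        else:
            parts.append(ch)  # literal residues and the [IL] class pass through
    return "".join(parts)


def find_ip22(sequence: str):
    """Return (start, matched_text) for every hit, including overlapping ones."""
    # A lookahead with a capture group lets finditer report overlapping matches.
    pattern = re.compile(f"(?=({motif_to_regex(IP22_MOTIF)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Synthetic 22-residue repeat built to satisfy the consensus (not a real protein).
repeat = "LPGGLGGLGLGGGLGGGLGGGG"
hits = find_ip22(repeat * 2)  # two tandem copies
```

On the synthetic tandem sequence the scan reports one hit per repeat, at offsets 0 and 22; variants of the motif would need a looser pattern.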

  12. Bioinformatics and Microarray Analysis of miRNAs in Aged Female Mice Model Implied New Molecular Mechanisms for Impaired Fracture Healing

    PubMed Central

    He, Bing; Zhang, Zong-Kang; Liu, Jin; He, Yi-Xin; Tang, Tao; Li, Jie; Guo, Bao-Sheng; Lu, Ai-Ping; Zhang, Bao-Ting; Zhang, Ge

    2016-01-01

    Impaired fracture healing in aged females is still a challenge in clinics. MicroRNAs (miRNAs) play important roles in fracture healing. This study aims to identify the miRNAs that potentially contribute to the impaired fracture healing in aged females. Transverse femoral shaft fractures were created in adult and aged female mice. At 0, 2 and 4 weeks post-fracture, the fracture sites were scanned by micro computed tomography to confirm that the fracture healing was impaired in aged female mice and the fracture calluses were collected for miRNA microarray analysis. A total of 53 significantly differentially expressed miRNAs and 5438 miRNA-target gene interactions involved in bone fracture healing were identified. A novel scoring system was designed to analyze the miRNA contribution to impaired fracture healing (RCIFH). Using this method, 11 novel miRNAs were identified to impair fracture healing at 2 or 4 weeks post-fracture. Thereafter, function analysis of target genes was performed for miRNAs with high RCIFH values. The results showed that high RCIFH miRNAs in aged female mice might impair fracture healing not only by down-regulating angiogenesis-, chondrogenesis-, and osteogenesis-related pathways, but also by up-regulating the osteoclastogenesis-related pathway, which implied the essential roles of these high RCIFH miRNAs in impaired fracture healing in aged females, and might promote the discovery of novel therapeutic strategies. PMID:27527150

  13. Quantitative Analysis of Polymer Additives with MALDI-TOF MS Using an Internal Standard Approach

    NASA Astrophysics Data System (ADS)

    Schwarzinger, Clemens; Gabriel, Stefan; Beißmann, Susanne; Buchberger, Wolfgang

    2012-06-01

    MALDI-TOF MS is used for the qualitative analysis of seven different polymer additives directly from the polymer without tedious sample pretreatment. Additionally, by using a solid sample preparation technique, which avoids the concentration gradient problems known to occur with dried droplets and by adding tetraphenylporphyrine as an internal standard to the matrix, it is possible to perform quantitative analysis of additives directly from the polymer sample. Calibration curves for Tinuvin 770, Tinuvin 622, Irganox 1024, Irganox 1010, Irgafos 168, and Chimassorb 944 are presented, showing coefficients of determination between 0.911 and 0.990.
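
An internal-standard calibration of the kind this abstract describes is commonly evaluated by regressing the analyte/standard signal ratio on concentration and reporting the coefficient of determination. The sketch below assumes an ordinary least-squares line; the function name `fit_calibration` and all numbers are hypothetical and are not the published measurements.

```python
def fit_calibration(conc, analyte_signal, standard_signal):
    """Fit ratio = slope * conc + intercept and return (slope, intercept, r2).

    The ratio is analyte signal divided by internal-standard signal, which
    compensates for spot-to-spot intensity variation.
    """
    ratios = [a / s for a, s in zip(analyte_signal, standard_signal)]
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, ratios))
    ss_tot = sum((y - mean_y) ** 2 for y in ratios)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


def quantify(ratio, slope, intercept):
    """Invert the calibration line to estimate concentration from a ratio."""
    return (ratio - intercept) / slope


# Hypothetical calibration series (arbitrary units), not data from the paper.
conc = [1.0, 2.0, 3.0, 4.0]
analyte = [2.0, 4.0, 6.0, 8.0]
standard = [2.0, 2.0, 2.0, 2.0]
slope, intercept, r2 = fit_calibration(conc, analyte, standard)
```

With real spectra the ratios scatter around the line, so the coefficient of determination falls below 1; the idealized series here is exactly linear.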

  14. Quantitative analysis of polymer additives with MALDI-TOF MS using an internal standard approach.

    PubMed

    Schwarzinger, Clemens; Gabriel, Stefan; Beißmann, Susanne; Buchberger, Wolfgang

    2012-06-01

    MALDI-TOF MS is used for the qualitative analysis of seven different polymer additives directly from the polymer without tedious sample pretreatment. Additionally, by using a solid sample preparation technique, which avoids the concentration gradient problems known to occur with dried droplets and by adding tetraphenylporphyrine as an internal standard to the matrix, it is possible to perform quantitative analysis of additives directly from the polymer sample. Calibration curves for Tinuvin 770, Tinuvin 622, Irganox 1024, Irganox 1010, Irgafos 168, and Chimassorb 944 are presented, showing coefficients of determination between 0.911 and 0.990.

  15. Adapting bioinformatics curricula for big data.

    PubMed

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs.

  16. Evolutionary and bioinformatic analysis of the spike glycoprotein gene of H120 vaccine strain protectotype of infectious bronchitis virus from India.

    PubMed

    Kamble, Nitin Machindra; Pillai, Aravind S; Gaikwad, Satish S; Shukla, Sanjeev Kumar; Khulape, Sagar Aashok; Dey, Sohini; Mohan, C Madhan

    2016-01-01

    Infectious bronchitis virus (IBV) is the causative agent of avian infectious bronchitis (AIB), an important disease that causes severe economic losses to the poultry industry worldwide. Recent AIB outbreaks in India have been associated with poor growth in broilers, a drop in egg production, and thin egg shells in layers. The complete spike gene of the Indian AIB vaccine strain was amplified and sequenced using conventional reverse transcription polymerase chain reaction and submitted to GenBank (accession no. KF188436). Phylogenetic analysis revealed that the vaccine strain currently used belongs to the H120 genotype, an attenuated strain of the Massachusetts (Mass) serotype. Nucleotide and amino acid sequence comparisons have shown that the reported spike genes from Indian isolates have 71.8%-99% and 71.4%-96.9% genetic similarity, respectively, with the sequenced H120 strain. The study identifies the live attenuated IBV vaccine strain, which is routinely used for vaccination, for the first time. Based on nucleotide and amino acid relatedness studies of the vaccine strain with reported IBV sequences from India, it is shown that the current vaccine strain is efficient in controlling the IBV infection. Continuous monitoring of IBV outbreaks by sequencing for genotyping and in vivo cross protection studies for serotyping is not only important for epidemiological investigation but also for evaluation of efficacy of the current vaccine. PMID:25311758

  17. The GMOD Drupal Bioinformatic Server Framework

    PubMed Central

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  18. Bioinformatic Analysis of Plasma Apolipoproteins A-I and A-II Revealed Unique Features of A-I/A-II HDL Particles in Human Plasma.

    PubMed

    Kido, Toshimi; Kurata, Hideaki; Kondo, Kazuo; Itakura, Hiroshige; Okazaki, Mitsuyo; Urata, Takeyoshi; Yokoyama, Shinji

    2016-01-01

    Plasma concentration of apoA-I, apoA-II and apoA-II-unassociated apoA-I was analyzed in 314 Japanese subjects (177 males and 137 females), including one (male) homozygote and 37 (20 males and 17 females) heterozygotes of genetic CETP deficiency. ApoA-I unassociated with apoA-II markedly and linearly increased with HDL-cholesterol, while apoA-II increased only very slightly and the ratio of apoA-II-associated apoA-I to apoA-II stayed constant at 2 in molar ratio throughout the increase of HDL-cholesterol, among the wild type and heterozygous CETP deficiency. Thus, overall HDL concentration almost exclusively depends on HDL with apoA-I without apoA-II (LpAI) while concentration of HDL containing apoA-I and apoA-II (LpAI:AII) is constant having a fixed molar ratio of 2 : 1 regardless of total HDL and apoA-I concentration. Distribution of apoA-I between LpAI and LpAI:AII is consistent with a model of statistical partitioning regardless of sex and CETP genotype. The analysis also indicated that LpA-I accommodates on average 4 apoA-I molecules and has a clearance rate indistinguishable from LpAI:AII. Independent evidence indicated LpAI:A-II has a diameter 20% smaller than LpAI, consistent with a model having two apoA-I and one apoA-II. The functional contribution of these particles is to be investigated. PMID:27526664

  19. Bioinformatic Analysis of Plasma Apolipoproteins A-I and A-II Revealed Unique Features of A-I/A-II HDL Particles in Human Plasma

    PubMed Central

    Kido, Toshimi; Kurata, Hideaki; Kondo, Kazuo; Itakura, Hiroshige; Okazaki, Mitsuyo; Urata, Takeyoshi; Yokoyama, Shinji

    2016-01-01

    Plasma concentration of apoA-I, apoA-II and apoA-II-unassociated apoA-I was analyzed in 314 Japanese subjects (177 males and 137 females), including one (male) homozygote and 37 (20 males and 17 females) heterozygotes of genetic CETP deficiency. ApoA-I unassociated with apoA-II markedly and linearly increased with HDL-cholesterol, while apoA-II increased only very slightly and the ratio of apoA-II-associated apoA-I to apoA-II stayed constant at 2 in molar ratio throughout the increase of HDL-cholesterol, among the wild type and heterozygous CETP deficiency. Thus, overall HDL concentration almost exclusively depends on HDL with apoA-I without apoA-II (LpAI) while concentration of HDL containing apoA-I and apoA-II (LpAI:AII) is constant having a fixed molar ratio of 2 : 1 regardless of total HDL and apoA-I concentration. Distribution of apoA-I between LpAI and LpAI:AII is consistent with a model of statistical partitioning regardless of sex and CETP genotype. The analysis also indicated that LpA-I accommodates on average 4 apoA-I molecules and has a clearance rate indistinguishable from LpAI:AII. Independent evidence indicated LpAI:A-II has a diameter 20% smaller than LpAI, consistent with a model having two apoA-I and one apoA-II. The functional contribution of these particles is to be investigated. PMID:27526664

  20. Structural and bioinformatic analysis of the kiwifruit allergen Act d 11, a member of the family of ripening-related proteins.

    PubMed

    Chruszcz, Maksymilian; Ciardiello, Maria Antonietta; Osinski, Tomasz; Majorek, Karolina A; Giangrieco, Ivana; Font, Jose; Breiteneder, Heimo; Thalassinos, Konstantinos; Minor, Wladek

    2013-12-01

    The allergen Act d 11, also known as kirola, is a 17 kDa protein expressed in large amounts in ripe green and yellow-fleshed kiwifruit. Ten percent of all kiwifruit-allergic individuals produce IgE specific for the protein. Using X-ray crystallography, we determined the first three-dimensional structures of Act d 11, produced from both recombinant expression in Escherichia coli and from the natural source (kiwifruit). While Act d 11 is immunologically correlated with the birch pollen allergen Bet v 1 and other members of the pathogenesis-related protein family 10 (PR-10), it has low sequence similarity to PR-10 proteins. By sequence Act d 11 appears instead to belong to the major latex/ripening-related (MLP/RRP) family, but analysis of the crystal structures shows that Act d 11 has a fold very similar to that of Bet v 1 and other PR-10 related allergens regardless of the low sequence identity. The structures of both the natural and recombinant protein include an unidentified ligand, which is relatively small (about 250 Da by mass spectrometry experiments) and most likely contains an aromatic ring. The ligand-binding cavity in Act d 11 is also significantly smaller than those in PR-10 proteins. The binding of the ligand, which we were not able to unambiguously identify, results in conformational changes in the protein that may have physiological and immunological implications. Interestingly, the residue corresponding to Glu45 in Bet v 1 (Glu46), which is important for IgE binding to the birch pollen allergen, is conserved in Act d 11, even though it is not in other allergens with significantly higher sequence identity to Bet v 1. We suggest that the so-called Gly-rich loop (or P-loop), which is conserved in all PR-10 allergens, may be responsible for IgE cross-reactivity between Bet v 1 and Act d 11.

  1. Structural bioinformatics analysis of enzymes involved in the biosynthesis pathway of the hypermodified nucleoside ms(2)io(6)A37 in tRNA.

    PubMed

    Kaminska, Katarzyna H; Baraniak, Urszula; Boniecki, Michal; Nowaczyk, Katarzyna; Czerwoniec, Anna; Bujnicki, Janusz M

    2008-01-01

    tRNAs from all organisms contain posttranscriptionally modified nucleosides, which are derived from the four canonical nucleosides. In most tRNAs that read codons beginning with U, adenosine in the position 37 adjacent to the 3' position of the anticodon is modified to N(6)-(Delta(2)-isopentenyl) adenosine (i(6)A). In many bacteria, such as Escherichia coli, this residue is typically hypermodified to N(6)-isopentenyl-2-thiomethyladenosine (ms(2)i(6)A). In a few bacteria, such as Salmonella typhimurium, ms(2)i(6)A can be further hydroxylated to N(6)-(cis-4-hydroxyisopentenyl)-2-thiomethyladenosine (ms(2)io(6)A). Although the enzymes that introduce the respective modifications (prenyltransferase MiaA, methylthiotransferase MiaB, and hydroxylase MiaE) have been identified, their structures remain unknown and sequence-function relationships remain obscure. We carried out sequence analysis and structure prediction of MiaA, MiaB, and MiaE, using the protein fold-recognition approach. Three-dimensional models of all three proteins were then built using a new modeling protocol designed to overcome uncertainties in the alignments and divergence between the templates. For MiaA and MiaB, the catalytic core was built based on the templates from the P-loop NTPase and Radical-SAM superfamilies, respectively. For MiaB, we have also modeled the C-terminal TRAM domain and the newly predicted N-terminal flavodoxin-fold domain. For MiaE, we confidently predict that it shares the three-dimensional fold with the ferritin-like four-helix bundle proteins and that it has a similar active site and mechanism of action to diiron carboxylate enzymes, in particular, methane monooxygenase (E.C.1.14.13.25) that catalyses the biological hydroxylation of alkanes. Our models provide the first structural platform for enzymes involved in the biosynthesis of i(6)A, ms(2)i(6)A, and ms(2)io(6)A, explain the data available from the literature and will help to design further experiments and interpret

  2. Macrobrachium rosenbergii mannose binding lectin: synthesis of MrMBL-N20 and MrMBL-C16 peptides and their antimicrobial characterization, bioinformatics and relative gene expression analysis.

    PubMed

    Arockiaraj, Jesu; Chaurasia, Mukesh Kumar; Kumaresan, Venkatesh; Palanisamy, Rajesh; Harikrishnan, Ramasamy; Pasupuleti, Mukesh; Kasi, Marimuthu

    2015-04-01

    Mannose-binding lectin (MBL), an antimicrobial protein, is an important component of the innate immune system which recognizes repetitive sugar groups on the surface of bacteria and viruses, leading to activation of the complement system. In this study, we report a complete molecular characterization of the cDNA encoding MBL from the freshwater prawn Macrobrachium rosenbergii (Mr). Two short peptides (MrMBL-N20: (20)AWNTYDYMKREHSLVKPYQG(39) and MrMBL-C16: (307)GGLFYVKHKEQQRKRF(322)) were synthesized from the MrMBL polypeptide. The purity of the MrMBL-N20 (89%) and MrMBL-C16 (93%) peptides was confirmed by MS analysis (MALDI-ToF). The purified peptides were used for further antimicrobial characterization including minimum inhibitory concentration (MIC) assay, kinetics of bactericidal efficiency and analysis of hemolytic capacity. The peptides exhibited antimicrobial activity towards all the Gram-negative bacteria taken for analysis, whereas they showed activity towards only a few selected Gram-positive bacteria. The MrMBL-C16 peptide produced the highest inhibition towards both the Gram-negative and Gram-positive bacteria compared to MrMBL-N20. Neither peptide produced any inhibition against Bacillus spp. The kinetics of bactericidal efficiency showed that the peptides drastically reduced the number of surviving bacterial colonies after 24 h incubation. The results of hemolytic activity showed that both peptides produced strong activity at higher concentration. However, the MrMBL-C16 peptide produced the highest activity compared to the MrMBL-N20 peptide. Overall, the results indicated that the peptides can be used as bactericidal agents. The MrMBL protein sequence was characterized using various bioinformatics tools including phylogenetic analysis and structure prediction. We also reported the MrMBL gene expression pattern upon viral and bacterial infection in M. rosenbergii gills. It could be concluded that the prawn MBL may be one of the important molecule which

  4. ebTrack: an environmental bioinformatics system built upon ArrayTrack™

    PubMed Central

    Chen, Minjun; Martin, Jackson; Fang, Hong; Isukapalli, Sastry; Georgopoulos, Panos G; Welsh, William J; Tong, Weida

    2009-01-01

    ebTrack is being developed as an integrated bioinformatics system for environmental research and analysis by addressing the issues of integration, curation, management, first-level analysis and interpretation of environmental and toxicological data from diverse sources. It is based on enhancements to the US FDA-developed ArrayTrack™ system through additional analysis modules for gene expression data as well as through incorporation of, and linkages to, modules for analysis of proteomic and metabonomic datasets that include tandem mass spectra. ebTrack uses a client-server architecture with the free and open source PostgreSQL as its database engine, and Java tools for user interface, analysis, visualization, and web-based deployment. Several predictive tools that are critical for environmental health research are currently supported in ebTrack, including Significance Analysis of Microarrays (SAM). Furthermore, new tools are under continuous integration, and interfaces to environmental health risk analysis tools are being developed in order to make ebTrack widely usable. These health risk analysis tools include the Modeling ENvironment for TOtal Risk studies (MENTOR) for source-to-dose exposure modeling and the DOse Response Information ANalysis system (DORIAN) for health outcome modeling. The design of ebTrack is presented in detail and steps involved in its application are summarized through an illustrative application. PMID:19278561

  5. Bioinformatics: A History of Evolution "In Silico"

    ERIC Educational Resources Information Center

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  6. Bioinformatics in Undergraduate Education: Practical Examples

    ERIC Educational Resources Information Center

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  8. Bioboxes: standardised containers for interchangeable bioinformatics software.

    PubMed

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable. PMID:26473029

  9. Implementing bioinformatic workflows within the bioextract server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  10. "Extreme Programming" in a Bioinformatics Class

    ERIC Educational Resources Information Center

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  11. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  12. 2010 Translational bioinformatics year in review

    PubMed Central

    Miller, Katharine S

    2011-01-01

    A review of 2010 research in translational bioinformatics provides much to marvel at. We have seen notable advances in personal genomics, pharmacogenetics, and sequencing. At the same time, the infrastructure for the field has burgeoned. While acknowledging that, according to researchers, the members of this field tend to be overly optimistic, the authors predict a bright future. PMID:21672905

  13. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    EPA Science Inventory

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  14. Navigating the changing learning landscape: perspective from bioinformatics.ca

    PubMed Central

    Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  15. [Highly efficient and rapid capillary electrophoretic analysis of seven organic acid additives in beverages using polymeric ionic liquid as additive].

    PubMed

    Han, Haifeng; Wang, Qing; Liu, Xi; Jiang, Shengxiang

    2012-05-01

    A new capillary electrophoretic method for the rapid and direct separation of seven organic acids in beverages was developed, with poly (1-vinyl-3-butylimidazolium bromide) as a reliable background electrolyte modifier to strongly reverse the direction of the anodic electroosmotic flow (EOF). Several factors that affected the separation efficiency were investigated in detail. The optimal running buffer consisted of 125 mmol/L sodium dihydrogen phosphate (pH 6.5) and 0.01 g/L poly (1-vinyl-3-butylimidazolium bromide). Highly efficient separation (105,000 to 636,000 plates/m) was achieved within 4 min, and standard deviations of the migration times (n=3) were lower than 0.0213 min under optimal conditions. The limits of detection (S/N = 3) ranged from 0.001 to 0.05 g/L. The present method was applied to a beverage sample (Mirinda) to determine sodium citrate, benzoic acid and sorbic acid, which were found at concentrations of 2.64, 0.10 and 0.08 g/L, respectively. The recoveries of the three analytes in the sample were 100.3%, 100.7% and 131.7%, respectively. The method is simple, rapid and inexpensive, and can be applied to determine organic acids as additives in beverages.

  16. Biochemical, Transcriptional, and Bioinformatic Analysis of Lipid Droplets from Seeds of Date Palm (Phoenix dactylifera L.) and Their Use as Potent Sequestration Agents against the Toxic Pollutant, 2,3,7,8-Tetrachlorinated Dibenzo-p-Dioxin.

    PubMed

    Hanano, Abdulsamie; Almousally, Ibrahem; Shaban, Mouhnad; Rahman, Farzana; Blee, Elizabeth; Murphy, Denis J

    2016-01-01

    Contamination of aquatic environments with dioxins, the most toxic group of persistent organic pollutants (POPs), is a major ecological issue. Dioxins are highly lipophilic and bioaccumulate in fatty tissues of marine organisms used for seafood, where they constitute a potential risk for human health. Lipid droplets (LDs) purified from date palm, Phoenix dactylifera, seeds were characterized and their capacity to extract dioxins from aquatic systems was assessed. The bioaffinity of date palm LDs toward 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), the most toxic congener of dioxins, was determined. Fractionated LDs were spheroidal with mean diameters of 2.5 µm, enclosing an oil-rich core of 392.5 mg mL⁻¹. Isolated LDs did not aggregate and/or coalesce unless placed in acidic media and were strongly associated with three major groups of polypeptides of relative mass 32-37, 20-24, and 16-18 kDa. These masses correspond to the LD-associated proteins, oleosins, caleosins, and steroleosins, respectively. Efficient partitioning of TCDD into LDs occurred with a coefficient of log K_LB/w,TCDD = 7.528 ± 0.024; it was optimal at neutral pH and was dependent on the presence of the oil-rich core, but was independent of the presence of LD-associated proteins. Bioinformatic analysis of the date palm genome revealed nine oleosin-like, five caleosin-like, and five steroleosin-like sequences, with predicted structures having putative lipid-binding domains that match their LD stabilizing roles and use as bio-based encapsulation systems. Transcriptomic analysis of date palm seedlings exposed to TCDD showed strong up-regulation of several caleosin and steroleosin genes, consistent with increased LD formation. The results suggest that the plant LDs could be used in ecological remediation strategies to remove POPs from aquatic environments. Recent reports suggest that several fungal and algal species also use LDs to sequester both external and internally derived hydrophobic toxins, which

  18. Biochemical, Transcriptional, and Bioinformatic Analysis of Lipid Droplets from Seeds of Date Palm (Phoenix dactylifera L.) and Their Use as Potent Sequestration Agents against the Toxic Pollutant, 2,3,7,8-Tetrachlorinated Dibenzo-p-Dioxin

    PubMed Central

    Hanano, Abdulsamie; Almousally, Ibrahem; Shaban, Mouhnad; Rahman, Farzana; Blee, Elizabeth; Murphy, Denis J.

    2016-01-01

    Contamination of aquatic environments with dioxins, the most toxic group of persistent organic pollutants (POPs), is a major ecological issue. Dioxins are highly lipophilic and bioaccumulate in fatty tissues of marine organisms used for seafood, where they constitute a potential risk for human health. Lipid droplets (LDs) purified from date palm, Phoenix dactylifera, seeds were characterized and their capacity to extract dioxins from aquatic systems was assessed. The bioaffinity of date palm LDs toward 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), the most toxic congener of dioxins, was determined. Fractionated LDs were spheroidal with mean diameters of 2.5 µm, enclosing an oil-rich core of 392.5 mg mL⁻¹. Isolated LDs did not aggregate and/or coalesce unless placed in acidic media and were strongly associated with three major groups of polypeptides of relative mass 32–37, 20–24, and 16–18 kDa. These masses correspond to the LD-associated proteins, oleosins, caleosins, and steroleosins, respectively. Efficient partitioning of TCDD into LDs occurred with a coefficient of log K_LB/w,TCDD = 7.528 ± 0.024; it was optimal at neutral pH and was dependent on the presence of the oil-rich core, but was independent of the presence of LD-associated proteins. Bioinformatic analysis of the date palm genome revealed nine oleosin-like, five caleosin-like, and five steroleosin-like sequences, with predicted structures having putative lipid-binding domains that match their LD stabilizing roles and use as bio-based encapsulation systems. Transcriptomic analysis of date palm seedlings exposed to TCDD showed strong up-regulation of several caleosin and steroleosin genes, consistent with increased LD formation. The results suggest that the plant LDs could be used in ecological remediation strategies to remove POPs from aquatic environments. Recent reports suggest that several fungal and algal species also use LDs to sequester both external and internally derived hydrophobic toxins

  19. SOAP-based services provided by the European Bioinformatics Institute

    PubMed Central

    Pillai, S.; Silventoinen, V.; Kallio, K.; Senger, M.; Sobhany, S.; Tate, J.; Velankar, S.; Golovin, A.; Henrick, K.; Rice, P.; Stoehr, P.; Lopez, R.

    2005-01-01

    SOAP (Simple Object Access Protocol) based Web Services technology has gained much attention as an open standard enabling interoperability among applications across heterogeneous architectures and different networks. The European Bioinformatics Institute (EBI) is using this technology to provide robust data retrieval and data analysis mechanisms to the scientific community and to enhance utilization of the biological resources it already provides [N. Harte, V. Silventoinen, E. Quevillon, S. Robinson, K. Kallio, X. Fustero, P. Patel, P. Jokinen and R. Lopez (2004) Nucleic Acids Res., 32, 3–9]. These services are available free to all users. PMID:15980463

  20. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    NASA Astrophysics Data System (ADS)

    Djordjevic, Marko

    2005-11-01

    Due to the rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid the design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding the regulatory circuits that control expression of genes. We propose a novel, biophysics-based algorithm for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information-theory-based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance, and determination, of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10, which infects the rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely
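
    The saturation effect described in this abstract can be illustrated with a minimal sketch. It assumes the standard statistical-physics occupancy formula p = 1 / (1 + exp((E - mu) / kT)), where E is the binding energy of a site and mu is the chemical potential of the TF; the abstract does not give the formula, so this form is an assumption. The energy matrix, the example sequence and all function names are hypothetical and are not taken from the thesis.

```python
import math

# Hypothetical 4-position energy matrix (units of kT): the preferred base at
# each position contributes 0, any mismatch contributes 2. Illustrative only.
ENERGY_MATRIX = [
    {"A": 2.0, "C": 2.0, "G": 2.0, "T": 0.0},
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 2.0, "C": 0.0, "G": 2.0, "T": 2.0},
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
]


def site_energy(site, matrix):
    """Sum the per-position energy contributions of one candidate site."""
    return sum(column[base] for column, base in zip(matrix, site))


def binding_probability(energy, mu, kT=1.0):
    """Occupancy with saturation; always stays strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp((energy - mu) / kT))


def boltzmann_weight(energy, mu, kT=1.0):
    """Unsaturated approximation; exceeds 1 when the energy is below mu."""
    return math.exp(-(energy - mu) / kT)


def scan(sequence, matrix, mu, threshold=0.5):
    """Return (position, site, probability) for windows at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        p = binding_probability(site_energy(site, matrix), mu)
        if p >= threshold:
            hits.append((start, site, p))
    return hits


hits = scan("GGTACAGG", ENERGY_MATRIX, mu=1.0)
```

    In this sketch a probability of 0.5 corresponds exactly to E = mu, which is one way to read the "binding threshold" mentioned in the abstract: sites with energy below the chemical potential are more likely bound than not.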

  1. Component-Based Approach for Educating Students in Bioinformatics

    ERIC Educational Resources Information Center

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  2. Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy

    NASA Astrophysics Data System (ADS)

    He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

    2015-02-01

    A novel strategy that combines an iterative cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectra of banned food additives, such as Sudan I dye and Rhodamine B in food and Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, combining spectral preprocessing by iterative cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification, respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, DPLS classification can discriminate the class assignment of unknown banned additives using differences in relative intensities. The results demonstrate that SERS spectroscopy, combined with the ICSF baseline correction method and the exploratory analysis methodology of DPLS classification, can potentially be used for distinguishing banned food additives in the field of food safety.

  4. Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy.

    PubMed

    He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

    2015-02-25

    A novel strategy that combines an iterative cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectra of banned food additives, such as Sudan I dye and Rhodamine B in food and Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, combining spectral preprocessing by iterative cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification, respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, DPLS classification can discriminate the class assignment of unknown banned additives using differences in relative intensities. The results demonstrate that SERS spectroscopy, combined with the ICSF baseline correction method and the exploratory analysis methodology of DPLS classification, can potentially be used for distinguishing banned food additives in the field of food safety. PMID:25300041

  5. Stimulation of terrestrial ecosystem carbon storage by nitrogen addition: a meta-analysis

    NASA Astrophysics Data System (ADS)

    Yue, Kai; Peng, Yan; Peng, Changhui; Yang, Wanqin; Peng, Xin; Wu, Fuzhong

    2016-01-01

    Elevated nitrogen (N) deposition alters the terrestrial carbon (C) cycle, which is likely to feed back to further climate change. However, how the overall terrestrial ecosystem C pools and fluxes respond to N addition remains unclear. By synthesizing data from multiple terrestrial ecosystems, we quantified the response of C pools and fluxes to experimental N addition using a comprehensive meta-analysis method. Our results showed that N addition significantly stimulated soil total C storage by 5.82% ([2.47%, 9.27%], 95% CI, the same below) and increased the C contents of the above- and below-ground parts of plants by 25.65% [11.07%, 42.12%] and 15.93% [6.80%, 25.85%], respectively. Furthermore, N addition significantly increased aboveground net primary production by 52.38% [40.58%, 65.19%] and litterfall by 14.67% [9.24%, 20.38%] at a global scale. However, the C influx from the plant litter to the soil through litter decomposition and the efflux from the soil due to microbial respiration and soil respiration showed no significant response to N addition. Overall, our meta-analysis suggested that N addition will increase soil C storage and plant C in both above- and below-ground parts, indicating that terrestrial ecosystems might strengthen as a C sink under increasing N deposition.

  6. Stimulation of terrestrial ecosystem carbon storage by nitrogen addition: a meta-analysis

    PubMed Central

    Yue, Kai; Peng, Yan; Peng, Changhui; Yang, Wanqin; Peng, Xin; Wu, Fuzhong

    2016-01-01

    Elevated nitrogen (N) deposition alters the terrestrial carbon (C) cycle, which is likely to feed back to further climate change. However, how the overall terrestrial ecosystem C pools and fluxes respond to N addition remains unclear. By synthesizing data from multiple terrestrial ecosystems, we quantified the response of C pools and fluxes to experimental N addition using a comprehensive meta-analysis method. Our results showed that N addition significantly stimulated soil total C storage by 5.82% ([2.47%, 9.27%], 95% CI, the same below) and increased the C contents of the above- and below-ground parts of plants by 25.65% [11.07%, 42.12%] and 15.93% [6.80%, 25.85%], respectively. Furthermore, N addition significantly increased aboveground net primary production by 52.38% [40.58%, 65.19%] and litterfall by 14.67% [9.24%, 20.38%] at a global scale. However, the C influx from the plant litter to the soil through litter decomposition and the efflux from the soil due to microbial respiration and soil respiration showed no significant response to N addition. Overall, our meta-analysis suggested that N addition will increase soil C storage and plant C in both above- and below-ground parts, indicating that terrestrial ecosystems might strengthen as a C sink under increasing N deposition. PMID:26813078
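
    The percent changes and asymmetric confidence intervals quoted in this abstract can be related to a log-scale effect size with a short sketch. It assumes the log response ratio, ln(treatment / control), a metric commonly used in ecological meta-analyses; the abstract does not name the metric, so this is an assumption, and the function names are hypothetical. Only the three numbers for soil total C storage are taken from the abstract.

```python
import math


def pct_to_log_ratio(percent_change):
    """Convert a percent change into a log response ratio ln(treatment/control)."""
    return math.log(1.0 + percent_change / 100.0)


def log_ratio_to_pct(log_ratio):
    """Back-transform a log response ratio into a percent change."""
    return (math.exp(log_ratio) - 1.0) * 100.0


# Values reported in the abstract for soil total C storage (percent, 95% CI).
estimate, lower, upper = 5.82, 2.47, 9.27

ln_estimate = pct_to_log_ratio(estimate)
lower_half_width = ln_estimate - pct_to_log_ratio(lower)
upper_half_width = pct_to_log_ratio(upper) - ln_estimate

print(f"log-scale estimate: {ln_estimate:.4f}")
print(f"half-widths: {lower_half_width:.4f} (lower), {upper_half_width:.4f} (upper)")
```

    On the percent scale the interval is slightly wider above the estimate than below it, while on the log scale the two half-widths are nearly equal (about 0.032 each). That pattern is consistent with, but does not prove, an analysis carried out on log response ratios and back-transformed for reporting.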

  7. YPED: An Integrated Bioinformatics Suite and Database for Mass Spectrometry-based Proteomics Research

    PubMed Central

    Colangelo, Christopher M.; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L.; Carriero, Nicholas J.; Gulcicek, Erol E.; Lam, TuKiet T.; Wu, Terence; Bjornson, Robert D.; Bruce, Can; Nairn, Angus C.; Rinehart, Jesse; Miller, Perry L.; Williams, Kenneth R.

    2015-01-01

    We report a significantly enhanced bioinformatics suite and database for proteomics research called the Yale Protein Expression Database (YPED), which is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of high-throughput mass spectrometry-based proteomics research, ranging from a single laboratory, to groups of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED’s database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. PMID:25712262

  8. On an Additive Semigraphoid Model for Statistical Networks With Application to Pathway Analysis

    PubMed Central

    Li, Bing; Chun, Hyonho; Zhao, Hongyu

    2014-01-01

    We introduce a nonparametric method for estimating non-Gaussian graphical models based on a new statistical relation called additive conditional independence, which is a three-way relation among random vectors that resembles the logical structure of conditional independence. Additive conditional independence allows us to use a one-dimensional kernel regardless of the dimension of the graph, which not only avoids the curse of dimensionality but also simplifies computation. It also gives rise to a parallel structure to the Gaussian graphical model that replaces the precision matrix by an additive precision operator. The estimators derived from additive conditional independence cover the recently introduced nonparanormal graphical model as a special case, but outperform it when the Gaussian copula assumption is violated. We compare the new method with existing ones by simulations and in genetic pathway analysis. PMID:26401064

  9. Analysis of occupational accidents: prevention through the use of additional technical safety measures for machinery

    PubMed Central

    Dźwiarek, Marek; Latała, Agata

    2016-01-01

    This article presents an analysis of results of 1035 serious and 341 minor accidents recorded by Poland's National Labour Inspectorate (PIP) in 2005–2011, in view of their prevention by means of additional safety measures applied by machinery users. Since the analysis aimed at formulating principles for the application of technical safety measures, the analysed accidents should bear additional attributes: the type of machine operation, technical safety measures and the type of events causing injuries. The analysis proved that the executed tasks and injury-causing events were closely connected and there was a relation between casualty events and technical safety measures. In the case of tasks consisting of manual feeding and collecting materials, the injuries usually occur because of the rotating motion of tools or crushing due to a closing motion. Numerous accidents also happened in the course of supporting actions, like removing pollutants, correcting material position, cleaning, etc. PMID:26652689

  10. Can bioinformatics help in the identification of moonlighting proteins?

    PubMed

    Hernández, Sergio; Calvo, Alejandra; Ferragut, Gabriela; Franco, Luís; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2014-12-01

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large amounts of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and it is a good complement for the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place. PMID:25399591

  11. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    SciTech Connect

    Taylor, Ronald C.

    2010-12-21

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
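
    As a rough illustration of the MapReduce programming style named in this abstract, the single-process Python sketch below counts k-mers in sequencing reads through separate map, shuffle and reduce steps. It does not use Hadoop or any Hadoop API; the function names and the sample reads are assumptions made for this example, and a real framework would distribute the map and reduce calls across cluster nodes.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key, as a framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence on an in-memory list of reads."""
    mapped = chain.from_iterable(map_read(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGT"]  # hypothetical reads
    print(count_kmers(reads, 3))
```

    Because each map call sees one read and each reduce call sees one key, neither step depends on shared state, which is the property that lets this style parallelize.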

  12. Teaching the bioinformatics of signaling networks: an integrated approach to facilitate multi-disciplinary learning.

    PubMed

    Korcsmaros, Tamas; Dunai, Zsuzsanna A; Vellai, Tibor; Csermely, Peter

    2013-09-01

    The number of bioinformatics tools and resources that support molecular and cell biology approaches is continuously expanding. Moreover, systems and network biology analyses are accompanied more and more by integrated bioinformatics methods. Traditional information-centered university teaching methods often fail, as (1) it is impossible to cover all existing approaches in the frame of a single course, and (2) a large segment of the current bioinformation can become obsolete in a few years. Signaling networks offer an excellent example for teaching bioinformatics resources and tools, as they are both focused and complex at the same time. Here, we present an outline of a university bioinformatics course with four sample practices to demonstrate how signaling network studies can integrate biochemistry, genetics, cell biology and network sciences. We show that several bioinformatics resources and tools, as well as important concepts and current trends, can also be integrated into signaling network studies. The research-type hands-on experiences we show enable the students to improve key competences such as teamwork, creative and critical thinking and problem solving. Our classroom course curriculum can be re-formulated as e-learning material or applied as a part of a specific training course. The multi-disciplinary approach and the mosaic setup of the course have the additional benefit of supporting the advanced teaching of talented students.

  13. Bioinformatics approaches to cancer gene discovery.

    PubMed

    Narayanan, Ramaswamy

    2007-01-01

    The Cancer Gene Anatomy Project (CGAP) database of the National Cancer Institute has thousands of known and novel expressed sequence tags (ESTs). These ESTs, derived from diverse normal and tumor cDNA libraries, offer an attractive starting point for cancer gene discovery. Data-mining the CGAP database led to the identification of ESTs that were predicted to be specific to select solid tumors. Two genes from these efforts were taken to proof of concept for diagnostic and therapeutics indications of cancer. Microarray technology was used in conjunction with bioinformatics to understand the mechanism of one of the targets discovered. These efforts provide an example of gene discovery by using bioinformatics approaches. The strengths and weaknesses of this approach are discussed in this review.

  14. A toolbox for developing bioinformatics software

    PubMed Central

    Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.

    2012-01-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  15. Discovery and Classification of Bioinformatics Web Services

    SciTech Connect

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  16. Analysis of zinc in biological samples by flame atomic absorption spectrometry: use of addition calibration technique.

    PubMed

    Dutra, Rosilene L; Cantos, Geny A; Carasek, Eduardo

    2006-01-01

    The quantification of target analytes in complex matrices requires special calibration approaches to compensate for additional capacity or activity in the matrix samples. The standard addition is one of the most important calibration procedures for quantification of analytes in such matrices. However, this technique requires a great number of reagents and material, and it consumes a considerable amount of time throughout the analysis. In this work, a new calibration procedure to analyze biological samples is proposed. The proposed calibration, called the addition calibration technique, was used for the determination of zinc (Zn) in blood serum and erythrocyte samples. The results obtained were compared with those obtained using conventional calibration techniques (standard addition and standard calibration). The proposed addition calibration was validated by recovery tests using blood samples spiked with Zn. The recovery ranges for blood serum and erythrocyte samples were 90-132% and 76-112%, respectively. Statistical comparison of the results obtained by the addition technique and by conventional techniques, using a paired two-tailed Student's t-test and linear regression, demonstrated good agreement among them. PMID:16943611
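
    For readers unfamiliar with the conventional standard addition procedure that this abstract uses as a reference, the sketch below fits a straight line to signal versus added analyte and reads the sample concentration from the ratio of intercept to slope. It does not reproduce the addition calibration technique proposed by the authors, whose details are not given here. The function names and all numbers are hypothetical, and dilution from spiking is assumed to be negligible.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def standard_addition(added, signal):
    """Estimate the analyte concentration in the unspiked sample.

    With signal = slope * (c0 + added), the fitted intercept equals
    slope * c0, so c0 = intercept / slope.
    """
    slope, intercept = fit_line(added, signal)
    return intercept / slope


def recovery_percent(found_spiked, found_unspiked, spike_added):
    """Recovery of a spike: the found increase as a percentage of the amount added."""
    return 100.0 * (found_spiked - found_unspiked) / spike_added


if __name__ == "__main__":
    # Hypothetical data: Zn added (mg/L) and the absorbance measured after each addition.
    added = [0.0, 0.5, 1.0, 1.5]
    signal = [0.16, 0.26, 0.36, 0.46]
    print(standard_addition(added, signal))
    print(recovery_percent(1.7, 0.8, 1.0))
```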

  17. [Applied problems of mathematical biology and bioinformatics].

    PubMed

    Lakhno, V D

    2011-01-01

    Mathematical biology and bioinformatics represent a new and rapidly progressing line of investigations which emerged in the course of work on the project "Human genome". The main applied problems of these sciences are drug design, patient-specific medicine and nanobioelectronics. It is shown that progress in the technology of mass sequencing of the human genome has set the stage for starting the national program on patient-specific medicine.

  18. Broader incorporation of bioinformatics in education: opportunities and challenges.

    PubMed

    Cummings, Michael P; Temple, Glena G

    2010-11-01

    The major opportunities for broader incorporation of bioinformatics in education can be placed into three general categories: general applicability of bioinformatics in life science and related curricula; inherent fit of bioinformatics for promoting student learning in most biology programs; and the general experience and associated comfort students have with computers and technology. Conversely, the major challenges for broader incorporation of bioinformatics in education can be placed into three general categories: required infrastructure and logistics; instructor knowledge of bioinformatics and continuing education; and the breadth of bioinformatics, and the diversity of students and educational objectives. Broader incorporation of bioinformatics at all education levels requires overcoming the challenges to using transformative computer-requiring learning activities, assisting faculty in collecting assessment data on mastery of student learning outcomes, as well as creating more faculty development opportunities that span diverse skill levels, with an emphasis placed on providing resource materials that are kept up-to-date as the field and tools change.

  19. A linked series of laboratory exercises in molecular biology utilizing bioinformatics and GFP.

    PubMed

    Medin, Carey L; Nolin, Katie L

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce the importance of bioinformatics in molecular biology. Students employed multiprimer, site-directed mutagenesis to create variant colors from a plasmid expressing green fluorescent protein (GFP). Isolated mutant plasmid from Escherichia coli showing changes in fluorescence were sequenced. Students used sequence alignment tools, protein translator tools, protein modeling, and visualization to analyze the potential effect of their mutations within the protein structure. This laboratory linked molecular techniques and bioinformatics to promote and expand the understanding of experimental results in an upper-level undergraduate laboratory course. PMID:22081550

  20. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  1. Adapting bioinformatics curricula for big data

    PubMed Central

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  2. Chapter 16: Text Mining for Translational Bioinformatics

    PubMed Central

    Cohen, K. Bretonnel; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research—translating basic science results into new interventions—and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944

  3. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    PubMed Central

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  4. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa.

    PubMed

    Mulder, Nicola J; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-02-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  6. 10 years for the Journal of Bioinformatics and Computational Biology (2003-2013) -- a retrospective.

    PubMed

    Eisenhaber, Frank; Sherman, Westley Arthur

    2014-06-01

    The Journal of Bioinformatics and Computational Biology (JBCB) started publishing scientific articles in 2003. It has established itself as a home for solid research articles in the field (~60 per year) that are surprisingly well cited. JBCB has an important function as an alternative publishing channel in addition to other, bigger journals.

  7. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    ERIC Educational Resources Information Center

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  8. Advancing standards for bioinformatics activities: persistence, reproducibility, disambiguation and Minimum Information About a Bioinformatics investigation (MIABi).

    PubMed

    Tan, Tin Wee; Tong, Joo Chuan; Khan, Asif M; de Silva, Mark; Lim, Kuan Siong; Ranganathan, Shoba

    2010-12-02

    The 2010 International Conference on Bioinformatics, InCoB2010, which is the annual conference of the Asia-Pacific Bioinformatics Network (APBioNet), has agreed to publish conference papers in compliance with the Minimum Information about a Bioinformatics investigation (MIABi), proposed in June 2009. Authors of the conference supplements in BMC Bioinformatics, BMC Genomics and Immunome Research have consented to cooperate in this process, which will include the procedures described herein, where appropriate, to ensure data and software persistence and perpetuity, database and resource re-instantiability and reproducibility of results, author and contributor identity disambiguation and MIABi-compliance. Wherever possible, datasets and databases will be submitted to depositories with standardized terminologies. As standards are evolving, this process is intended as a prelude to the 100 BioDatabases (BioDB100) initiative whereby APBioNet collaborators will contribute exemplar databases to demonstrate the feasibility of standards-compliance and participate in refining the process for peer-review of such publications and validation of scientific claims and standards compliance. This testbed represents another step in advancing standards-based processes in the bioinformatics community, which is essential to the growing interoperability of biological data, information, knowledge and computational resources.

  9. MOWServ: a web client for integration of bioinformatic resources

    PubMed Central

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try