Science.gov

Sample records for comparative structural bioinformatics

  1. A Comparative Structural Bioinformatics Analysis of the Insulin Receptor Family Ectodomain Based on Phylogenetic Information

    PubMed Central

    Rentería, Miguel E.; Gandhi, Neha S.; Vinuesa, Pablo; Helmerhorst, Erik; Mancera, Ricardo L.

    2008-01-01

    The insulin receptor (IR), the insulin-like growth factor 1 receptor (IGF1R) and the insulin receptor-related receptor (IRR) are covalently-linked homodimers made up of several structural domains. The molecular mechanism of ligand binding to the ectodomain of these receptors and the resulting activation of their tyrosine kinase domain is still not well understood. We have carried out an amino acid residue conservation analysis in order to reconstruct the phylogeny of the IR Family. We have confirmed the location of ligand binding site 1 of the IGF1R and IR. Importantly, we have also predicted the likely location of the insulin binding site 2 on the surface of the fibronectin type III domains of the IR. An evolutionarily conserved surface on the second leucine-rich domain that may interact with the ligand could not be detected. We suggest a possible mechanical trigger of the activation of the IR that involves a slight ‘twist’ rotation of the last two fibronectin type III domains in order to face the likely location of insulin. Finally, a strong selective pressure was found amongst the IRR orthologous sequences, suggesting that this orphan receptor has a yet unknown physiological role which may be conserved from amphibians to mammals. PMID:18989367

  2. NMR structure improvement: A structural bioinformatics & visualization approach

    NASA Astrophysics Data System (ADS)

    Block, Jeremy N.

    The overall goal of this project is to enhance the physical accuracy of individual models in macromolecular NMR (Nuclear Magnetic Resonance) structures and the realism of variation within NMR ensembles of models, while improving agreement with the experimental data. A secondary overall goal is to combine synergistically the best aspects of NMR and crystallographic methodologies to better illuminate the underlying joint molecular reality. This is accomplished by using the powerful method of all-atom contact analysis (describing detailed sterics between atoms, including hydrogens); new graphical representations and interactive tools in 3D and virtual reality; and structural bioinformatics approaches to the expanded and enhanced data now available. The resulting better descriptions of macromolecular structure and its dynamic variation enhance the effectiveness of the many biomedical applications that depend on detailed molecular structure, such as mutational analysis, homology modeling, molecular simulations, protein design, and drug design.

  3. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation, performed on the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to the reference annotation.

  4. Unraveling microalgal molecular interactions using evolutionary and structural bioinformatics.

    PubMed

    Vlachakis, Dimitrios; Pavlopoulou, Athanasia; Kazazi, Dorothea; Kossida, Sophia

    2013-10-10

    Microalgae are unicellular microorganisms indispensable for environmental stability and life on earth, because they produce approximately half of the atmospheric oxygen while simultaneously feeding on the harmful greenhouse gas carbon dioxide. Using gene fusion analysis, a series of five fusion/fission events was identified that provided the basis for critical insights into their evolutionary history. Moreover, the three-dimensional structures of both the fused and the component proteins were predicted, allowing us to envisage putative protein-protein interactions that are invaluable for the efficient usage, handling and exploitation of microalgae. Collectively, our proposed approach to the five fusion/fission algal protein events contributes towards the expansion of the microalgae knowledgebase, bridging protein evolution of the ancient microalgal species and the rapidly evolving, modern bioinformatics field.

  5. Achievements and challenges in structural bioinformatics and computational biophysics

    PubMed Central

    Samish, Ilan; Bourne, Philip E.; Najmanovich, Rafael J.

    2015-01-01

    Motivation: The field of structural bioinformatics and computational biophysics has undergone a revolution in the last 10 years. These developments are captured annually through the 3DSIG meeting, upon which this article reflects. Results: An increase in the accessible data, computational resources and methodology has resulted in an increase in the size and resolution of studied systems and the complexity of the questions amenable to research. Concomitantly, the parameterization and efficiency of the methods have markedly improved along with their cross-validation with other computational and experimental results. Conclusion: The field exhibits an ever-increasing integration with biochemistry, biophysics and other disciplines. In this article, we discuss recent achievements along with current challenges within the field. Contact: Rafael.Najmanovich@USherbrooke.ca PMID:25488929

  6. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    ERIC Educational Resources Information Center

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  7. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    PubMed

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements found in a variety of archaea and bacteria; they are indicative of bacterial ecological adaptation and confer acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through a bioinformatics approach. Among 107 Shigella strains, 434 CRISPR structure loci were identified, with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. Deletions of cas genes were discovered in 88 strains, and one strain did not contain any cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget score, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Type I spacers were those linked with one gene segment, Type II spacers were linked with two or more different gene segments, and Type III spacers were undefined. This study examined the diversity of the CRISPR/cas system in Shigella strains and demonstrated the main features of CRISPR structure and spacer classification, providing critical information for elucidation of the mechanisms of spacer formation and exploration of the role the spacers play in the function of the CRISPR/cas system.
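
    The three-way spacer rule stated in this abstract can be sketched in a few lines. This is only an illustration of the rule as worded above, not the authors' pipeline: the function name, the input format and the example identifiers are hypothetical, and the step that decides which gene segments a spacer is linked with (sequence signature, BLAST and CRISPRTarget scores) is not modeled.

```python
def classify_spacer(linked_gene_segments):
    """Classify one spacer from the gene segments it is linked with.

    linked_gene_segments: iterable of gene-segment identifiers (hypothetical
    input; how the links are established is outside this sketch).
    """
    distinct = set(linked_gene_segments)  # count each gene segment once
    if len(distinct) == 1:
        return "Type I"    # linked with exactly one gene segment
    if len(distinct) >= 2:
        return "Type II"   # linked with two or more different gene segments
    return "Type III"      # no link found: undefined


# Made-up example spacers, for illustration only.
spacers = {
    "spacer_a": ["segment_1"],
    "spacer_b": ["segment_1", "segment_2"],
    "spacer_c": [],
}
labels = {name: classify_spacer(hits) for name, hits in spacers.items()}
```

    Using a set means a spacer that matches the same gene segment several times is still counted as Type I, which is one reading of "two or more different gene segments".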

  8. Comparative modeling of proteins: a method for engaging students' interest in bioinformatics tools.

    PubMed

    Badotti, Fernanda; Barbosa, Alan Sales; Reis, André Luiz Martins; do Valle, Italo Faria; Ambrósio, Lara; Bitar, Mainá

    2014-01-01

    The huge increase in data being produced in the genomic era has produced a need to incorporate computers into the research process. Sequence generation, its subsequent storage, interpretation, and analysis are now entirely computer-dependent tasks. Universities from all over the world have been challenged to seek a way of encouraging students to acquire computational and bioinformatics skills from the undergraduate level onward in order to understand biological processes. The aim of this article is to report the experience of awakening students' interest in bioinformatics tools during a course focused on comparative modeling of proteins. The authors start by giving a full description of the course environmental context and students' backgrounds. Then they detail each class and present a general overview of the protein modeling protocol. The positive and negative aspects of the course are also reported, and some of the results generated in class and in projects outside the classroom are discussed. In the last section of the article, general perspectives about the course from the students' point of view are given. This work can serve as a guide for professors who teach subjects for which bioinformatics tools are useful and for universities that plan to incorporate bioinformatics into the curriculum. PMID:24167006

  9. Human, vector and parasite Hsp90 proteins: A comparative bioinformatics analysis

    PubMed Central

    Faya, Ngonidzashe; Penkler, David L.; Tastan Bishop, Özlem

    2015-01-01

    The treatment of protozoan parasitic diseases is challenging, and thus identification and analysis of new drug targets is important. Parasites survive within host organisms, and some need intermediate hosts to complete their life cycle. Changing host environment puts stress on parasites, and often adaptation is accompanied by the expression of large amounts of heat shock proteins (Hsps). Among Hsps, Hsp90 proteins play an important role in stress environments. Yet, there has been little computational research on Hsp90 proteins to analyze them comparatively as potential parasitic drug targets. Here, an attempt was made to gain detailed insights into the differences between host, vector and parasitic Hsp90 proteins by large-scale bioinformatics analysis. A total of 104 Hsp90 sequences were divided into three groups based on their cellular localizations; namely cytosolic, mitochondrial and endoplasmic reticulum (ER). Further, the parasitic proteins were divided according to the type of parasite (protozoa, helminth and ectoparasite). Primary sequence analysis, phylogenetic tree calculations, motif analysis and physicochemical properties of Hsp90 proteins suggested that despite the overall structural conservation of these proteins, parasitic Hsp90 proteins have unique features which differentiate them from human ones, thus encouraging the idea that protozoan Hsp90 proteins should be further analyzed as potential drug targets. PMID:26793431

  10. XCluSim: a visual analytics tool for interactively comparing multiple clustering results of bioinformatics data

    PubMed Central

    2015-01-01

    Background Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. To select the best clustering method and its parameters for a dataset, researchers have to run multiple clustering algorithms and compare them. However, such a comparison task with multiple clustering results is cognitively demanding and laborious. Results In this paper, we present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results based on the Visual Information Seeking Mantra. We build a taxonomy for categorizing existing techniques of clustering results visualization in terms of the Gestalt principles of grouping. Using the taxonomy, we choose the most appropriate interactive visualizations for presenting individual clustering results from different types of clustering algorithms. The efficacy of XCluSim is shown through case studies with a bioinformatician. Conclusions Compared to other relevant tools, XCluSim enables users to compare multiple clustering results in a more scalable manner. Moreover, XCluSim supports diverse clustering algorithms and dedicated visualizations and interactions for different types of clustering results, allowing more effective exploration of details on demand. Through case studies with a bioinformatics researcher, we received positive feedback on the functionalities of XCluSim, including its ability to help identify stably clustered items across multiple clustering results. PMID:26328893

  11. NETTAB 2014: From high-throughput structural bioinformatics to integrative systems biology.

    PubMed

    Romano, Paolo; Cordero, Francesca

    2016-03-02

    The fourteenth NETTAB workshop, NETTAB 2014, was devoted to a range of disciplines going from structural bioinformatics, to proteomics and to integrative systems biology. The topics of the workshop were centred around bioinformatics methods, tools, applications, and perspectives for models, standards and management of high-throughput biological data, structural bioinformatics, functional proteomics, mass spectrometry, drug discovery, and systems biology. 43 scientific contributions were presented at NETTAB 2014, including keynote, special guest and tutorial talks, oral communications, and posters. Full papers from some of the best contributions presented at the workshop were later submitted to a special Call for this Supplement. Here, we provide an overview of the workshop and introduce manuscripts that have been accepted for publication in this Supplement.

  12. Meet me halfway: when genomics meets structural bioinformatics.

    PubMed

    Gong, Sungsam; Worth, Catherine L; Cheng, Tammy M K; Blundell, Tom L

    2011-06-01

    The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platform has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases and even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype-phenotype relationships can be better understood. In this review, we bring this 'bottom-up' approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function. PMID:21350909

  14. AWSEM-MD: Protein Structure Prediction Using Coarse-grained Physical Potentials and Bioinformatically Based Local Structure Biasing

    PubMed Central

    Davtyan, Aram; Schafer, Nicholas P.; Zheng, Weihua; Clementi, Cecilia; Wolynes, Peter G.; Papoian, Garegin A.

    2012-01-01

    The Associative memory, Water mediated, Structure and Energy Model (AWSEM) is a coarse-grained protein force field. AWSEM contains physically motivated terms, such as hydrogen bonding, as well as a bioinformatically based local structure biasing term, which efficiently takes into account many-body effects that are modulated by the local sequence. When combined with appropriate local or global alignments to choose memories, AWSEM can be used to perform de novo protein structure prediction. Herein we present structure prediction results for a particular choice of local sequence alignment method based on short residue sequences called fragments. We demonstrate the model’s structure prediction capabilities for three levels of global homology between the target sequence and those proteins used for local structure biasing, all of which assume that the structure of the target sequence is not known. When there are no homologs in the database of structures used for local structure biasing, AWSEM calculations produce structural predictions that are somewhat improved compared with prior works using related approaches. The inclusion of a small number of structures from homologous sequences improves structure prediction only marginally but when the fragment search is restricted to only homologous sequences, AWSEM can perform high resolution structure prediction and can be used for kinetics and dynamics studies. PMID:22545654

  15. STRUCTURELAB: a heterogeneous bioinformatics system for RNA structure analysis.

    PubMed

    Shapiro, B A; Kasprzak, W

    1996-08-01

    STRUCTURELAB is a computational system that has been developed to permit the use of a broad array of approaches for the analysis of the structure of RNA. The goal of the development is to provide a large set of tools that can be well integrated with experimental biology to aid in the process of the determination of the underlying structure of RNA sequences. The approach taken views the structure determination problem as one of dealing with a database of many computationally generated structures and provides the capability to analyze this data set from different perspectives. Many algorithms are integrated into one system that also utilizes a heterogeneous computing approach permitting the use of several computer architectures to help solve the posed problems. These different computational platforms make it relatively easy to incorporate currently existing programs as well as newly developed algorithms and to best match these algorithms to the appropriate hardware. The system has been written in Common Lisp running on SUN or SGI Unix workstations, and it utilizes a network of participating machines defined in reconfigurable tables. A window-based interface makes this heterogeneous environment as transparent to the user as possible. PMID:9076633

  16. Comparative metagenomic analysis of human gut microbiome composition using two different bioinformatic pipelines.

    PubMed

    D'Argenio, Valeria; Casaburi, Giorgio; Precone, Vincenza; Salvatore, Francesco

    2014-01-01

    Technological advances in next-generation sequencing-based approaches have greatly impacted the analysis of microbial community composition. In particular, 16S rRNA-based methods have been widely used to analyze the whole set of bacteria present in a target environment. As a consequence, several specific bioinformatic pipelines have been developed to manage these data. MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) and Quantitative Insights Into Microbial Ecology (QIIME) are two freely available tools for metagenomic analyses that have been used in a wide range of studies. Here, we report the comparative analysis of the same dataset with both QIIME and MG-RAST in order to evaluate their accuracy in taxonomic assignment and in diversity analysis. We found that taxonomic assignment was more accurate with QIIME which, at family level, assigned a significantly higher number of reads. Thus, QIIME generated a more accurate BIOM file, which in turn improved the diversity analysis output. Finally, although informatics skills are needed to install QIIME, it offers a wide range of metrics that are useful for downstream applications and, not less important, it is not dependent on server times. PMID:24719854

  18. MoStBioDat--molecular and structural bioinformatics database.

    PubMed

    Bak, Andrzej; Polanski, Jaroslaw; Stockner, Thomas; Kurczyk, Agata

    2010-05-01

    Computer simulations play a crucial role in contemporary chemical investigations, generating enormous amounts of data. The constraint of sharing data and results is regarded as a major impediment in drug discovery. Among the steepest barriers to overcome in the high throughput screening studies is the limited number of suitable, freely accessible repositories for storing drug and drug target data. By offering a uniform data storage and retrieval mechanism, various data might be compared and exchanged easily. This paper presents the stages of the MoStBioDat software platform development, originally designed for the efficient storage, management and access of SDF and PDB data. The detailed architecture and software implementation of this project are described, indicating also the disadvantages of the solutions chosen. The current implementation of the first prototype is written in Python, an open-source, high-level, object-oriented scripting language. The modular architecture of the package enables future extension with the necessary functionalities. The main objective of the MoStBioDat is to serve as an alternative, extensible open-source database derived partly from SDF and PDB files.

  19. Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery

    PubMed Central

    Blundell, Tom L; Sibanda, Bancinyane L; Montalvão, Rinaldo Wander; Brewerton, Suzanne; Chelliah, Vijayalakshmi; Worth, Catherine L; Harmer, Nicholas J; Davies, Owen; Burke, David

    2006-01-01

    Impressive progress in genome sequencing, protein expression and high-throughput crystallography and NMR has radically transformed the opportunities to use protein three-dimensional structures to accelerate drug discovery, but the quantity and complexity of the data have ensured a central place for informatics. Structural biology and bioinformatics have assisted in lead optimization and target identification where they have well established roles; they can now contribute to lead discovery, exploiting high-throughput methods of structure determination that provide powerful approaches to screening of fragment binding. PMID:16524830

  20. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    ERIC Educational Resources Information Center

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  1. Bioinformatics: promises and progress.

    PubMed

    Gupta, Shipra; Misra, Gauri; Khurana, S M Paul

    2015-01-01

    Bioinformatics is a multidisciplinary science that analyzes and solves biological problems. With the explosion in biomedical data, the demand for bioinformatics has increased gradually. The present paper provides an overview of the various ways in which biologists and biological researchers in the domains of neurology, structural and functional biology, evolutionary biology, clinical science, etc., use bioinformatics applications for data analysis to summarise their research. A new perspective is used to classify the knowledge available in the field, which will help a general audience to understand the applications of bioinformatics.

  2. Bioinformatics approaches for structural and functional analysis of proteins in secondary metabolism in Withania somnifera.

    PubMed

    Sanchita; Singh, Swati; Sharma, Ashok

    2014-11-01

    Withania somnifera (Ashwagandha) is a rich storehouse of a large number of pharmacologically active secondary metabolites known as withanolides. These secondary metabolites are produced by the withanolide biosynthetic pathway. Very little information is available on the structural and functional aspects of the enzymes involved in the withanolide biosynthetic pathways of Withania somnifera. We therefore performed a bioinformatics analysis to look at the functional and structural properties of these important enzymes. The pathway enzymes taken for this study were 3-Hydroxy-3-methylglutaryl coenzyme A reductase, 1-Deoxy-D-xylulose-5-phosphate synthase, 1-Deoxy-D-xylulose-5-phosphate reductase, farnesyl pyrophosphate synthase, squalene synthase, squalene epoxidase, and cycloartenol synthase. The prediction of secondary structure was performed for basic structural information. Three-dimensional structures for these enzymes were predicted. The physico-chemical properties such as pI, AI, GRAVY and instability index were also studied. The current information will provide a platform to know the structural attributes responsible for the function of these proteins until experimental structures become available.
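
    Of the physico-chemical properties listed in this abstract, GRAVY (grand average of hydropathy) is the simplest to illustrate: it is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The sketch below assumes that standard definition; the abstract does not say which tool the authors used, and the function name and example peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    # A KeyError here signals a non-standard residue code (e.g. X or B).
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Made-up peptide, for illustration only.
example_score = gravy("MKVLA")
```

    A positive score indicates an overall hydrophobic sequence and a negative score an overall hydrophilic one.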

  3. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    PubMed Central

    Ragothaman, Anjani; Feinstein, Wei; Jha, Shantenu; Kim, Joohyun

    2014-01-01

    While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the predicted structural information could uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as those of prokaryotes is large, typically many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and is particularly amenable to small genomes such as those of prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure. PMID:24995285

  4. An Introductory Bioinformatics Exercise to Reinforce Gene Structure and Expression and Analyze the Relationship between Gene and Protein Sequences

    ERIC Educational Resources Information Center

    Almeida, Craig A.; Tardiff, Daniel F.; De Luca, Jane P.

    2004-01-01

    We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces the understanding of the structure of a gene and the principles and events involved in its expression. In addition, the activity illustrates the severe effect mutations in a gene sequence can have on the protein product.…

  5. Structural, bioinformatic, and in vivo analyses of two Treponema pallidum lipoproteins reveal a unique TRAP transporter

    PubMed Central

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-01-01

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP) and tp0958 (the symporter) are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of “tetratricopeptide repeat” (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPR-protein associated TRAP transporters (TPATs) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s). PMID:22306465

  6. Structural, Bioinformatic, and In Vivo Analyses of Two Treponema pallidum Lipoproteins Reveal a Unique TRAP Transporter

    SciTech Connect

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-05-25

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP), and tp0958 (the symporter), are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of 'tetratricopeptide repeat' (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPATs (TPR-protein-associated TRAP-Ts) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s).

  7. Structural and Phylogenetic Analysis of Laccases from Trichoderma: A Bioinformatic Approach

    PubMed Central

    Cázares-García, Saila Viridiana; Vázquez-Garcidueñas, Ma. Soledad; Vázquez-Marrufo, Gerardo

    2013-01-01

    The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species, the genomes of which are publicly available, were analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the sequence of amino acids of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential. PMID:23383142

  8. Edge Bioinformatics

    SciTech Connect

    Lo, Chien-Chi

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  9. Edge Bioinformatics

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  10. Structural Bioinformatics-Based Prediction of Exceptional Selectivity of p38 MAP Kinase Inhibitor PH-797804

    SciTech Connect

    Xing, Li; Shieh, Huey S.; Selness, Shaun R.; Devraj, Rajesh V.; Walker, John K.; Devadas, Balekudru; Hope, Heidi R.; Compton, Robert P.; Schindler, John F.; Hirsch, Jeffrey L.; Benson, Alan G.; Kurumbail, Ravi G.; Stegeman, Roderick A.; Williams, Jennifer M.; Broadus, Richard M.; Walden, Zara; Monahan, Joseph B.; Pfizer

    2009-07-24

    PH-797804 is a diarylpyridinone inhibitor of p38α mitogen-activated protein (MAP) kinase derived from a racemic mixture as the more potent atropisomer (aS), first proposed by molecular modeling and subsequently confirmed by experiments. On the basis of structural comparison with a different biaryl pyrazole template and supported by dozens of high-resolution crystal structures of p38α inhibitor complexes, PH-797804 is predicted to possess a high level of specificity across the broad human kinase genome. We used a structural bioinformatics approach to identify two selectivity elements encoded by the TXXXG sequence motif on the p38α kinase hinge: (i) Thr106 that serves as the gatekeeper to the buried hydrophobic pocket occupied by 2,4-difluorophenyl of PH-797804 and (ii) the bidentate hydrogen bonds formed by the pyridinone moiety with the kinase hinge requiring an induced 180° rotation of the Met109-Gly110 peptide bond. The peptide flip occurs in p38α kinase due to the critical glycine residue marked by its conformational flexibility. Kinome-wide sequence mining revealed rare presentation of the selectivity motif. Corroboratively, PH-797804 exhibited exceptionally high specificity against MAP kinases and the related kinases. No cross-reactivity was observed in large panels of kinase screens (selectivity ratio of >500-fold). In cellular assays, PH-797804 demonstrated superior potency and selectivity consistent with the biochemical measurements. PH-797804 has met safety criteria in human phase I studies and is under clinical development for several inflammatory conditions. Understanding the rationale for selectivity at the molecular level helps elucidate the biological function and design of specific p38α kinase inhibitors.
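
The sequence-mining step mentioned in this abstract can be illustrated with a minimal sketch. It assumes only that "TXXXG" means a threonine, any three residues, then a glycine; the helper name `find_txxxg` and the toy sequences are hypothetical, and the authors' actual kinome-wide procedure (hinge alignment, sequence sources) is not described in the record.

```python
import re

# Hypothetical helper, not the authors' pipeline: a plain regular-expression
# scan for T-x-x-x-G. The lookahead lets overlapping hits be reported.
TXXXG = re.compile(r"(?=(T[A-Z]{3}G))")

def find_txxxg(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched 5-mer) for each TXXXG hit."""
    return [(m.start() + 1, m.group(1)) for m in TXXXG.finditer(sequence.upper())]

# Toy sequences invented for illustration only.
toy_sequences = {
    "toy_with_motif": "AAATHLMGAAA",
    "toy_without_motif": "AAAMHLMDAAA",
}
hits = {name: find_txxxg(seq) for name, seq in toy_sequences.items()}
```

A scan like this over unaligned full-length sequences would also report hits outside the hinge, so a real analysis would presumably restrict the search to the aligned hinge positions.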

  11. A Bioinformatics Approach to the Structure, Function, and Evolution of the Nucleoprotein of the Order Mononegavirales

    PubMed Central

    Cleveland, Sean B.; Davies, John; McClure, Marcella A.

    2011-01-01

    The goal of this bioinformatic study is to investigate sequence conservation in relation to evolutionary function/structure of the nucleoprotein of the order Mononegavirales. In the combined analysis of 63 representative nucleoprotein (N) sequences from four viral families (Bornaviridae, Filoviridae, Rhabdoviridae, and Paramyxoviridae) we predict the regions of protein disorder, intra-residue contact and co-evolving residues. Correlations between location and conservation of predicted regions illustrate a strong division between families while highlighting conservation within individual families. These results suggest the conserved regions among the nucleoproteins, specifically within Rhabdoviridae and Paramyxoviridae, but also generally among all members of the order, reflect an evolutionary advantage in maintaining these sites for the viral nucleoprotein as part of the transcription/replication machinery. Results indicate conservation for disorder in the C-terminus region of the representative proteins that is important for interacting with the phosphoprotein and the large subunit polymerase during transcription and replication. Additionally, the C-terminus region of the protein preceding the disordered region is predicted to be important for interacting with the encapsidated genome. Portions of the N-terminus are responsible for N∶N stability and interactions identified by the presence or lack of co-evolving intra-protein contact predictions. The validation of these prediction results by current structural information illustrates the benefits of the Disorder, Intra-residue contact and Compensatory mutation Correlator (DisICC) pipeline as a method for quickly characterizing proteins and providing the most likely residues and regions necessary to target for disruption in viruses that have little structural information available. PMID:21559282

  12. Structural templates for comparative protein docking

    PubMed Central

    Anishchenko, Ivan; Kundrotas, Petras J.; Tuzikov, Alexander V.; Vakser, Ilya A.

    2014-01-01

    Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, non-redundant library of templates containing 4,950 full structures of binary complexes and 5,936 protein-protein interfaces extracted from the full structures at 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu. PMID:25488330
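
Extracting an interface at a distance cut-off, as described in this abstract, can be sketched as follows. This is an assumption-laden illustration, not the Dockground procedure: the record does not say which atoms are compared or how residues are selected, so the sketch simply keeps every residue with at least one atom within 12 Å of any atom of the partner chain. The function name and the toy coordinates are hypothetical.

```python
import math

CUTOFF = 12.0  # Å, the cut-off quoted in the abstract

def interface_residues(chain_a, chain_b, cutoff=CUTOFF):
    """Each chain is a list of (residue_id, (x, y, z)) atom records.

    Returns two sorted lists: residues of chain A and of chain B that have
    at least one atom within `cutoff` of any atom in the other chain.
    """
    iface_a, iface_b = set(), set()
    for res_a, xyz_a in chain_a:
        for res_b, xyz_b in chain_b:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                iface_a.add(res_a)
                iface_b.add(res_b)
    return sorted(iface_a), sorted(iface_b)

# Toy coordinates (one atom per residue), invented for illustration.
chain_a = [(1, (0.0, 0.0, 0.0)), (2, (30.0, 0.0, 0.0))]
chain_b = [(101, (5.0, 0.0, 0.0)), (102, (50.0, 0.0, 0.0))]
result = interface_residues(chain_a, chain_b)
```

The all-pairs loop is quadratic in the number of atoms; for real structures a spatial index such as a k-d tree would be the usual choice.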

  13. Structural templates for comparative protein docking.

    PubMed

    Anishchenko, Ivan; Kundrotas, Petras J; Tuzikov, Alexander V; Vakser, Ilya A

    2015-09-01

    Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, nonredundant library of templates containing 4950 full structures of binary complexes and 5936 protein-protein interfaces extracted from the full structures at 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu.

  14. Unix interfaces, Kleisli, bucandin structure, etc. -- the heroic beginning of bioinformatics in Singapore.

    PubMed

    Eisenhaber, Frank

    2014-06-01

    Remarkably, Singapore, as one of today's hotspots for bioinformatics and computational biology research, appeared de novo out of pioneering efforts of engaged local individuals in the early 1990s that, supported with increasing public funds from 1996 on, morphed into the present vibrant research community. This article brings to mind the pioneers, their first successes and early institutional developments.

  15. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    PubMed

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  16. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    PubMed

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment.

  17. Bioinformatics investigation of therapeutic mechanisms of Xuesaitong capsule treating ischemic cerebrovascular rat model with comparative transcriptome analysis

    PubMed Central

    Liao, Jiangquan; Wei, Benjun; Chen, Hengwen; Liu, Yongmei; Wang, Jie

    2016-01-01

    Background: Xuesaitong soft capsule (XST), which consists of Panax notoginseng saponin (PNS), has been used to treat ischemic cerebrovascular diseases in China. The therapeutic mechanism of XST has not yet been elucidated from the perspective of genomics and bioinformatics. Methods: A transcriptome analysis was performed to review series concerning the middle cerebral artery occlusion (MCAO) rat model and XST intervention after MCAO from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were compared between blank group and model group, and between model group and XST group. Functional enrichment and pathway analysis were performed. A protein-protein interaction (PPI) network was constructed. The overlapping genes from the two DEG sets were screened out and analyzed in depth. Results: Two series including 22 samples were obtained. 870 DEGs were identified between blank group and model group, and 1189 DEGs were identified between model group and XST group. GO terms and KEGG pathways of MCAO and XST intervention were significantly enriched. PPI networks were constructed to demonstrate the gene-gene interactions. The overlapping genes from the two DEG sets were highlighted. ANTXR2, FHL3, PRCP, TYROBP, TAF9B, FGFR2, BCL11B, RB1CC1 and MBNL2 were the pivotal genes and possible action sites of XST therapeutic mechanisms. Conclusion: MCAO is a pathological process with multiple. PMID:27347353

  18. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches.

    PubMed

    Ozyigit, Ibrahim I; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y; Koc, Ibrahim; Öztürk, Münir X; Anjum, Naser A

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants; whereas, its high level can cause damage to cell structures, having severe consequences. Thus, steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidases (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization were analyzed, (c) detailed information about their physicochemical properties were provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. Study outcomes enlightened GPX and APX as major H2O2-scavenging enzymes at their structural and functional levels, which could be used in future studies in the current direction. PMID:27047498

  19. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches

    PubMed Central

    Ozyigit, Ibrahim I.; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y.; Koc, Ibrahim; Öztürk, Münir X.; Anjum, Naser A.

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants; whereas, its high level can cause damage to cell structures, having severe consequences. Thus, steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidases (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization were analyzed, (c) detailed information about their physicochemical properties were provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. Study outcomes enlightened GPX and APX as major H2O2-scavenging enzymes at their structural and functional levels, which could be used in future studies in the current direction. PMID:27047498

  20. Bioinformatics based structural characterization of glucose dehydrogenase (gdh) gene and growth promoting activity of Leclercia sp. QAU-66

    PubMed Central

    Naveed, Muhammad; Ahmed, Iftikhar; Khalid, Nauman; Mumtaz, Abdul Samad

    2014-01-01

    Glucose dehydrogenase (GDH; EC 1.1.5.2) is a member of the quinoprotein group that uses the redox cofactor pyrroloquinoline quinone, calcium ions and glucose as substrate for its activity. In the present study, Leclercia sp. QAU-66, isolated from the rhizosphere of Vigna mungo, was characterized for phosphate solubilization and the role of GDH in plant growth promotion of Phaseolus vulgaris. The strain QAU-66 had the ability to solubilize phosphorus and significantly (p ≤ 0.05) promoted the shoot and root lengths of Phaseolus vulgaris. The structural determination of GDH protein was carried out using bioinformatics tools like Pfam, InterProScan, I-TASSER and COFACTOR. These tools predicted the structure-based functional homology of pyrroloquinoline quinone domains in GDH. GDH of Leclercia sp. QAU-66 is one of the main factors involved in plant growth promotion and provides a solid background for further research in plant growth promoting activities. PMID:25242947

  1. Bioinformatic and functional analysis of RNA secondary structure elements among different genera of human and animal caliciviruses

    PubMed Central

    Simmonds, Peter; Karakasiliotis, Ioannis; Bailey, Dalan; Chaudhry, Yasmin; Evans, David J.; Goodfellow, Ian G.

    2008-01-01

    The mechanism and role of RNA structure elements in the replication and translation of Caliciviridae remains poorly understood. Several algorithmically independent methods were used to predict secondary structures within the Norovirus, Sapovirus, Vesivirus and Lagovirus genera. All showed profound suppression of synonymous site variability (SSSV) at genomic 5′ ends and the start of the sub-genomic (sg) transcript, consistent with evolutionary constraints from underlying RNA structure. A newly developed thermodynamic scanning method predicted RNA folding mapping precisely to regions of SSSV and at the genomic 3′ end. These regions contained several evolutionarily conserved RNA secondary structures, of variable size and positions. However, all caliciviruses contained 3′ terminal hairpins, and stem–loops in the anti-genomic strand invariably six bases upstream of the sg transcript, indicating putative roles as sg promoters. Using the murine norovirus (MNV) reverse-genetics system, disruption of 5′ end stem–loops produced ∼15- to 20-fold infectivity reductions, while disruption of the RNA structure in the sg promoter region and at the 3′ end entirely destroyed replication ability. Restoration of infectivity by repair mutations in the sg promoter region confirmed a functional role for the RNA secondary structure, not the sequence. This study provides comprehensive bioinformatic resources for future functional studies of MNV and other caliciviruses. PMID:18319285

  2. Structural Bioinformatics Inspection of neXtProt PE5 Proteins in the Human Proteome.

    PubMed

    Dong, Qiwen; Menon, Rajasree; Omenn, Gilbert S; Zhang, Yang

    2015-09-01

    One goal of the Human Proteome Project is to identify at least one protein product for each of the ∼20,000 human protein-coding genes. As of October 2014, however, there are 3564 genes (18%) that have no or insufficient evidence of protein existence (PE), as curated by neXtProt; these comprise 2647 PE2-4 missing proteins and 616 PE5 dubious protein entries. We conducted a systematic examination of the 616 PE5 protein entries using cutting-edge protein structure and function modeling methods. Compared to a random sample of high-confidence PE1 proteins, the putative PE5 proteins were found to be over-represented in the membrane and cell surface proteins and peptides fold families. Detailed functional analyses show that most PE5 proteins, if expressed, would belong to transporters and receptors localized in the plasma membrane compartment. The results suggest that experimental difficulty in identifying membrane-bound proteins and peptides could have precluded their detection in mass spectrometry and that special enrichment techniques with improved sensitivity for membrane proteins could be important for the characterization of the PE5 "dark matter" of the human proteome. Finally, we identify 66 high scoring PE5 protein entries and find that six of them were reported in recent mass spectrometry databases; an illustrative annotation of these six is provided. This work illustrates a new approach to examine the potential folding and function of the dubious proteins comprising PE5, which we will next apply to the far larger group of missing proteins comprising PE2-4.

  3. Elongation Factor-Tu (EF-Tu) proteins structural stability and bioinformatics in ancestral gene reconstruction

    NASA Astrophysics Data System (ADS)

    Dehipawala, Sunil; Nguyen, A.; Tremberger, G.; Cheung, E.; Schneider, P.; Lieberman, D.; Holden, T.; Cheung, T.

    2013-09-01

    A paleo-experimental evolution report on elongation factor EF-Tu structural stability results has provided an opportunity to rewind the tape of life using the ancestral protein sequence reconstruction modeling approach; consistent with the book of life dogma in current biology and being an important component in the astrobiology community. Fractal dimension via the Higuchi fractal method and Shannon entropy of the DNA sequence classification could be used in a diagram that serves as a simple summary. Results from biomedical gene research provide examples on the diagram methodology. Comparisons between biomedical genes such as EEF2 (elongation factor 2 human, mouse, etc), WDR85 in epigenetics, HAR1 in human specificity, DLG1 in cognitive skill, and HLA-C in mosquito bite immunology with EF Tu DNA sequences have accounted for the reported circular dichroism thermo-stability data systematically; the results also infer a relatively less volatility geologic time period from 2 to 3 Gyr from adaptation viewpoint. Comparison to Thermotoga maritima MSB8 and Psychrobacter shows that Thermus thermophilus HB8 EF-Tu calibration sequence could be an outlier, consistent with free energy calculation by NUPACK. Diagram methodology allows computer simulation studies and HAR1 shows about 0.5% probability from chimp to human in terms of diagram location, and SNP simulation results such as amoebic meningoencephalitis NAF1 suggest correlation. Extensions to the studies of the translation and transcription elongation factor sequences in Megavirus Chiliensis, Megavirus Lba and Pandoravirus show that the studied Pandoravirus sequence could be an outlier with the highest fractal dimension and lowest entropy, as compared to chicken as a deviant in the DNMT3A DNA methylation gene sequences from zebrafish to human and to the less than one percent probability in computer simulation using the HAR1 0.5% probability as reference. The diagram methodology would be useful in ancestral gene
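
One of the two quantities named in this abstract, the Shannon entropy of a DNA sequence, can be sketched in a few lines. The sketch assumes the simplest definition, single-nucleotide frequencies with a base-2 logarithm; the record does not state whether the authors used this form, k-mers, or a sliding window, and the Higuchi fractal dimension is not covered here. The function name is hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(seq: str) -> float:
    """Shannon entropy in bits per symbol, from single-symbol frequencies.

    Assumed definition: H = -sum(p_i * log2(p_i)) over observed symbols.
    """
    if not seq:
        return 0.0
    counts = Counter(seq.upper())
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy inputs: a uniform mix of four bases reaches the 2-bit maximum for DNA;
# a homopolymer has zero entropy.
uniform = shannon_entropy("ACGTACGT")
homopolymer = shannon_entropy("AAAA")
```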

  4. Minimal Functional Sites in Metalloproteins and Their Usage in Structural Bioinformatics

    PubMed Central

    Rosato, Antonio; Valasatava, Yana; Andreini, Claudia

    2016-01-01

    Metal ions play a functional role in numerous biochemical processes and cellular pathways. Indeed, about 40% of all enzymes of known 3D structure require a metal ion to be able to perform catalysis. The interactions of the metals with the macromolecular framework determine their chemical properties and reactivity. The relevant interactions involve both the coordination sphere of the metal ion and the more distant interactions of the so-called second sphere, i.e., the non-bonded interactions between the macromolecule and the residues coordinating the metal (metal ligands). The metal ligands and the residues in their close spatial proximity define what we call a minimal functional site (MFS). MFSs can be automatically extracted from the 3D structures of metal-binding biological macromolecules deposited in the Protein Data Bank (PDB). They are 3D templates that describe the local environment around a metal ion or metal cofactor and do not depend on the overall macromolecular structure. MFSs provide a different view on metal-binding proteins and nucleic acids, completely focused on the metal. Here we present different protocols and tools based upon the concept of MFS to obtain deeper insight into the structural and functional properties of metal-binding macromolecules. We also show that structure conservation of MFSs in metalloproteins relates to local sequence similarity more strongly than to overall protein similarity. PMID:27153067

  5. DOE EPSCoR Initiative in Structural and computational Biology/Bioinformatics

    SciTech Connect

    Wallace, Susan S.

    2008-02-21

    The overall goal of the DOE EPSCoR Initiative in Structural and Computational Biology was to enhance the competitiveness of Vermont research in these scientific areas. To develop self-sustaining infrastructure, we increased the critical mass of faculty, developed shared resources that made junior researchers more competitive for federal research grants, implemented programs to train graduate and undergraduate students who participated in these research areas and provided seed money for research projects. During the time period funded by this DOE initiative: (1) four new faculty were recruited to the University of Vermont using DOE resources, three in Computational Biology and one in Structural Biology; (2) technical support was provided for the Computational and Structural Biology facilities; (3) twenty-two graduate students were directly funded by fellowships; (4) fifteen undergraduate students were supported during the summer; and (5) twenty-eight pilot projects were supported. Taken together these dollars resulted in a plethora of published papers, many in high profile journals in the fields and directly impacted competitive extramural funding based on structural or computational biology resulting in 49 million dollars awarded in grants (Appendix I), a 600% return on investment by DOE, the State and University.

  6. Crowdsourcing for bioinformatics

    PubMed Central

    Good, Benjamin M.; Su, Andrew I.

    2013-01-01

    Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614

  7. Structural and bioinformatic characterization of an Acinetobacter baumannii type II carrier protein

    SciTech Connect

    Allen, C. Leigh; Gulick, Andrew M.

    2014-06-01

    Microorganisms produce a variety of natural products via secondary metabolic biosynthetic pathways. Two of these types of synthetic systems, the nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), use large modular enzymes containing multiple catalytic domains in a single protein. These multidomain enzymes use an integrated carrier protein domain to transport the growing, covalently bound natural product to the neighboring catalytic domains for each step in the synthesis. Interestingly, some PKS and NRPS clusters contain free-standing domains that interact intermolecularly with other proteins. Because they are expressed outside the architecture of a multi-domain protein, the precise role of these so-called type II proteins is challenging to understand. Additional structures of individual and multi-domain components of the NRPS enzymes will therefore provide a better understanding of the features that govern the domain interactions in these interesting enzyme systems. The high-resolution crystal structure of a free-standing carrier protein from Acinetobacter baumannii that belongs to a larger NRPS-containing operon, encoded by the ABBFA-003406–ABBFA-003399 genes of A. baumannii strain AB307-0294, that has been implicated in A. baumannii motility, quorum sensing and biofilm formation, is presented here. Comparison with the closest structural homologs of other carrier proteins identifies the requirements for a conserved glycine residue and additional important sequence and structural requirements within the regions that interact with partner proteins.

  8. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15

    PubMed Central

    Wang, Jinlan; Chang, Fen

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 complete TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein–protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function, but these require further experimental validation. PMID:27257554

  9. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15.

    PubMed

    Wang, Jinlan; Zhang, Zheng; Chang, Fen; Yin, Deling

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 complete TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein-protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function, but these require further experimental validation. PMID:27257554

  10. The CopC Family: Structural and Bioinformatic Insights into a Diverse Group of Periplasmic Copper Binding Proteins.

    PubMed

    Lawton, Thomas J; Kenney, Grace E; Hurley, Joseph D; Rosenzweig, Amy C

    2016-04-19

    The CopC proteins are periplasmic copper binding proteins believed to play a role in bacterial copper homeostasis. Previous studies have focused on CopCs that are part of seven-protein Cop or Pco systems involved in copper resistance. These canonical CopCs contain distinct Cu(I) and Cu(II) binding sites. Mounting evidence suggests that CopCs are more widely distributed, often present only with the CopD inner membrane protein, frequently as a fusion protein, and that the CopC and CopD proteins together function in the uptake of copper to the cytoplasm. In the methanotroph Methylosinus trichosporium OB3b, genes encoding a CopCD pair are located adjacent to the particulate methane monooxygenase (pMMO) operon. The CopC from this organism (Mst-CopC) was expressed, purified, and structurally characterized. The 1.46 Å resolution crystal structure of Mst-CopC reveals a single Cu(II) binding site with coordination somewhat different from that in canonical CopCs, and the absence of a Cu(I) binding site. Extensive bioinformatic analyses indicate that the majority of CopCs in fact contain only a Cu(II) site, with just 10% of sequences corresponding to the canonical two-site CopC. Accordingly, a new classification scheme for CopCs was developed, and detailed analyses of the sequences and their genomic neighborhoods reveal new proteins potentially involved in copper homeostasis, providing a framework for expanded models of CopCD function.

  11. Bioinformatics of prokaryotic RNAs.

    PubMed

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes and the need to characterize their non-coding components as well depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes.

  12. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes and the need to characterize their non-coding components as well depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  15. [Bioinformatics: a key role in oncology].

    PubMed

    Olivier, Timothée; Chappuis, Pierre; Tsantoulis, Petros

    2016-05-18

    Bioinformatics is essential in clinical oncology and research. Combining biology, computer science and mathematics, bioinformatics aims to derive useful information from clinical and biological data, often poorly structured, at a large scale. Bioinformatics approaches have reclassified certain cancers based on their molecular and biological presentation, improving treatment selection. Many molecular signatures have been developed and, after validation, some are now usable in clinical practice. Other applications could facilitate daily practice, reduce the risk of error and increase the precision of medical decision-making. Bioinformatics must evolve in accordance with ethical considerations and requires multidisciplinary collaboration. Its application depends on a sound technical foundation that meets strict quality requirements. PMID:27424424

  16. On comparing two structured RNA multiple alignments.

    PubMed

    Patel, Vandanaben; Wang, Jason T L; Setia, Shefali; Verma, Anurag; Warden, Charles D; Zhang, Kaizhong

    2010-12-01

    We present a method, called BlockMatch, for aligning two blocks, where a block is an RNA multiple sequence alignment with the consensus secondary structure of the alignment in Stockholm format. The method employs a quadratic-time dynamic programming algorithm for aligning columns and column pairs of the multiple alignments in the blocks. Unlike many other tools that can perform pairwise alignment of either single sequences or structures only, BlockMatch takes into account the characteristics of all the sequences in the blocks along with their consensus structures during the alignment process, and is thus able to achieve a high-quality alignment result. We apply BlockMatch to phylogeny reconstruction on a set of 5S rRNA sequences taken from fifteen bacterial species. Experimental results showed that the phylogenetic tree generated by our method is more accurate than the tree constructed with the widely used ClustalW tool. The BlockMatch algorithm is implemented in a web server, accessible at http://bioinformatics.njit.edu/blockmatch. A jar file of the program is also available for download from the web server. PMID:21121021
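
    To make the idea of a quadratic-time dynamic program over alignment columns concrete, here is a minimal sketch. It is not the BlockMatch algorithm: it aligns columns only (no column pairs, no consensus structure), and the scoring function, the gap penalty and the names `column_profiles`, `column_score` and `align_blocks` are all assumptions introduced for illustration.

```python
from collections import Counter


def column_profiles(block):
    """Turn a block (list of equal-length aligned sequences) into
    one symbol-frequency dict per alignment column."""
    n = len(block)
    return [{ch: c / n for ch, c in Counter(col).items()} for col in zip(*block)]


def column_score(p, q):
    """Expected identity between two columns, rescaled to [-1, 1].
    Gap symbols ('-') never count as a match (an illustrative choice)."""
    same = sum(f * q.get(ch, 0.0) for ch, f in p.items() if ch != "-")
    return 2.0 * same - 1.0


def align_blocks(block_a, block_b, gap=-0.5):
    """Global (Needleman-Wunsch style) alignment of the columns of two blocks.
    Runs in O(n * m) time for blocks with n and m columns.
    Returns (score, list of aligned column index pairs)."""
    a, b = column_profiles(block_a), column_profiles(block_b)
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover matched columns.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + column_score(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy blocks (invented): the second block lacks the 'A' column of the first.
toy_score, toy_pairs = align_blocks(["GCAU", "GCAU"], ["GCU", "GCU"])
```

    Scoring whole columns rather than single residues is what lets every sequence in each block contribute to the alignment, which is the property the abstract contrasts with single-sequence pairwise tools.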

  18. Bioinformatics-Aided Venomics

    PubMed Central

    Kaas, Quentin; Craik, David J.

    2015-01-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have aided the discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotation of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future. PMID:26110505

  19. A bioinformatics approach for integrated transcriptomic and proteomic comparative analyses of model and non-sequenced anopheline vectors of human malaria parasites.

    PubMed

    Ubaida Mohien, Ceereena; Colquhoun, David R; Mathias, Derrick K; Gibbons, John G; Armistead, Jennifer S; Rodriguez, Maria C; Rodriguez, Mario Henry; Edwards, Nathan J; Hartler, Jürgen; Thallinger, Gerhard G; Graham, David R; Martinez-Barnetche, Jesus; Rokas, Antonis; Dinglasan, Rhoel R

    2013-01-01

    Malaria morbidity and mortality caused by both Plasmodium falciparum and Plasmodium vivax extend well beyond the African continent, and although P. vivax causes between 80 and 300 million severe cases each year, vivax transmission remains poorly understood. Plasmodium parasites are transmitted by Anopheles mosquitoes, and the critical site of interaction between parasite and host is at the mosquito's luminal midgut brush border. Although the genome of the "model" African P. falciparum vector, Anopheles gambiae, has been sequenced, evolutionary divergence limits its utility as a reference across anophelines, especially non-sequenced P. vivax vectors such as Anopheles albimanus. Clearly, technologies and platforms that bridge this substantial scientific gap are required in order to provide public health scientists with key transcriptomic and proteomic information that could spur the development of novel interventions to combat this disease. To our knowledge, no approaches have been published that address this issue. To bolster our understanding of P. vivax-An. albimanus midgut interactions, we developed an integrated bioinformatic-hybrid RNA-Seq-LC-MS/MS approach involving An. albimanus transcriptome (15,764 contigs) and luminal midgut subproteome (9,445 proteins) assembly, which, when used with our custom Diptera protein database (685,078 sequences), facilitated a comparative proteomic analysis of the midgut brush borders of two important malaria vectors, An. gambiae and An. albimanus.

  20. Bioinformatic and Comparative Localization of Rab Proteins Reveals Functional Insights into the Uncharacterized GTPases Ypt10p and Ypt11p†

    PubMed Central

    Buvelot Frei, Stéphanie; Rahl, Peter B.; Nussbaum, Maria; Briggs, Benjamin J.; Calero, Monica; Janeczko, Stephanie; Regan, Andrew D.; Chen, Catherine Z.; Barral, Yves; Whittaker, Gary R.; Collins, Ruth N.

    2006-01-01

    A striking characteristic of a Rab protein is its steady-state localization to the cytosolic surface of a particular subcellular membrane. In this study, we have undertaken a combined bioinformatic and experimental approach to examine the evolutionary conservation of Rab protein localization. A comprehensive primary sequence classification shows that 10 out of the 11 Rab proteins identified in the yeast (Saccharomyces cerevisiae) genome can be grouped within a major subclass, each comprising multiple Rab orthologs from diverse species. We compared the locations of individual yeast Rab proteins with their localizations following ectopic expression in mammalian cells. Our results suggest that green fluorescent protein-tagged Rab proteins maintain localizations across large evolutionary distances and that the major known player in the Rab localization pathway, mammalian Rab-GDI, is able to function in yeast. These findings enable us to provide insight into novel gene functions and classify the uncharacterized Rab proteins Ypt10p (YBR264C) as being involved in endocytic function and Ypt11p (YNL304W) as being localized to the endoplasmic reticulum, where we demonstrate it is required for organelle inheritance. PMID:16980630

  1. Channelrhodopsins: a bioinformatics perspective.

    PubMed

    Del Val, Coral; Royuela-Flor, José; Milenkovic, Stefan; Bondar, Ana-Nicoleta

    2014-05-01

    Channelrhodopsins are microbial-type rhodopsins that function as light-gated cation channels. Understanding how the detailed architecture of the protein governs its dynamics and specificity for ions is important, because it has the potential to assist in designing site-directed channelrhodopsin mutants for specific neurobiology applications. Here we use bioinformatics methods to derive accurate alignments of channelrhodopsin sequences, assess the sequence conservation patterns and find conserved motifs in channelrhodopsins, and use homology modeling to construct three-dimensional structural models of channelrhodopsins. The analyses reveal that helices C and D of channelrhodopsins contain Cys, Ser, and Thr groups that can engage in both intra- and inter-helical hydrogen bonds. We propose that these polar groups participate in inter-helical hydrogen-bonding clusters important for the protein conformational dynamics and for the local water interactions. This article is part of a Special Issue entitled: Retinal Proteins - You can teach an old dog new tricks. PMID:24252597

  2. Bioinformatics and Moonlighting Proteins.

    PubMed

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations - it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  3. Computational intelligence techniques in bioinformatics.

    PubMed

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state of the art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques, including neural networks, restricted Boltzmann machines, deep belief networks, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines, could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function and structure prediction. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included. PMID:23891719

  4. Biggest challenges in bioinformatics

    PubMed Central

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics’ in a ‘World Café’ style event. PMID:23492829

  5. Bioinformatics Visualisation Tools: An Unbalanced Picture.

    PubMed

    Broască, Laura; Ancuşa, Versavia; Ciocârlie, Horia

    2016-01-01

    Visualization tools represent a key element in triggering human creativity while being supported with the analysis power of the machine. This paper analyzes free network visualization tools for bioinformatics, frames them in domain specific requirements and compares them. PMID:27577488

  6. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    PubMed

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-01

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.

  7. Bioinformatics education in India.

    PubMed

    Kulkarni-Kale, Urmila; Sawant, Sangeeta; Chavan, Vishwas

    2010-11-01

    An account of bioinformatics education in India is presented along with future prospects. The establishment of the BTIS network by the Department of Biotechnology (DBT), Government of India, in the 1980s was a systematic effort in the development of bioinformatics infrastructure in India to provide services to the scientific community. Advances in the field of bioinformatics underpinned the need for well-trained professionals with skills in information technology and biotechnology. As a result, programmes for capacity building in terms of human resource development were initiated. Educational programmes gradually evolved from the organisation of short-term workshops to the institution of formal diploma/degree programmes. A case study of the Master's degree course offered at the Bioinformatics Centre, University of Pune is discussed. Currently, many universities and institutes offer bioinformatics courses at different levels, with variations in course content and degree of detail. The BioInformatics National Certification (BINC) examination, initiated in 2005 by DBT, provides a common yardstick to assess the knowledge and skill sets of students graduating from various institutions. The potential for broadening the scope of bioinformatics to transform it into a data-intensive discovery discipline is discussed. This necessitates amendments to the existing curricula to accommodate upcoming developments.

  8. Comparative Protein Structure Modeling Using MODELLER.

    PubMed

    Webb, Benjamin; Sali, Andrej

    2014-09-08

    Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
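
    As a small illustration of the target-template alignment step, the sketch below computes the percent sequence identity between an aligned target and template, a quantity often inspected when choosing templates. It does not use MODELLER; the function name `percent_identity`, the convention of ignoring gapped positions, and the toy sequences are assumptions made for this example.

```python
def percent_identity(target_aln, template_aln):
    """Percent identity over positions where both aligned sequences
    have a residue (positions with a gap '-' in either are ignored)."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(a, b) for a, b in zip(target_aln, template_aln)
               if a != "-" and b != "-"]
    if not aligned:
        return 0.0
    matches = sum(a == b for a, b in aligned)
    return 100.0 * matches / len(aligned)


# Toy aligned sequences (invented, not TvLDH or any real template).
toy_identity = percent_identity("MKV-LA", "MRVQLA")
```

    Other conventions divide by the full alignment length or by the shorter sequence length instead, so identities reported by different tools are not always directly comparable.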

  9. Bioinformatics meets clinical informatics.

    PubMed

    Smith, Jeremy; Protti, Denis

    2005-01-01

    The field of bioinformatics has exploded over the past decade. Hopes have run high for the impact on preventive, diagnostic, and therapeutic capabilities of genomics and proteomics. As time has progressed, so has our understanding of this field. Although the mapping of the human genome will certainly have an impact on health care, it is a complex web to unweave. Addressing simpler "Single Nucleotide Polymorphisms" (SNPs) is not new; however, the complexity and importance of polygenic disorders and the greater role of the far more complex field of proteomics have become clearer. Proteomics operates much closer to the actual cellular level of human structure, and proteins are very sensitive markers of health. However, because the proteome is so much more complex than the genome, and changes with time and environmental factors, mapping it and using the data in direct care delivery is even harder than for the genome. For these reasons of complexity, the expected utopia of a single gene chip or protein chip capable of analyzing an individual's genetic make-up and producing a cornucopia of useful diagnostic information appears still a distant hope. When, and if, this happens, perhaps a genetic profile of each individual will be stored with their medical record; however, in the meantime, this type of information is unlikely to prove highly useful on a broad scale. To address the more complex "polygenic" diseases and those related to protein variations, other tools will be developed in the shorter term. "Top-down" analysis of populations and diseases is likely to produce earlier wins in this area. Detailed computer-generated models will map a wide array of human and environmental factors that indicate the presence of a disease or the relative impact of a particular treatment. These models may point to an underlying genomic or proteomic cause, for which genomic or proteomic testing or therapies could then be applied for confirmation and/or treatment. These types of

  10. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    PubMed Central

    Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  11. A survey of scholarly literature describing the field of bioinformatics education and bioinformatics educational research.

    PubMed

    Magana, Alejandra J; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students' attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  12. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string- or pattern mining specific to biological strings. For a long time, however, this point of view has not been explicitly embraced in either the data mining or the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word “data-mining” is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
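
The two similarity problems named in this abstract can be illustrated with a minimal sketch that indexes fixed-length substrings (k-mers). The function names `repeated_kmers` and `common_kmers`, the parameter `k`, and the toy DNA strings are assumptions made for illustration; the chapter is not confirmed to use this approach, and practical tools usually rely on suffix trees or suffix arrays rather than a plain dictionary.

```python
from collections import defaultdict


def repeated_kmers(seq, k):
    """Intra-molecular case: map each k-mer that occurs more than once
    in `seq` to the list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def common_kmers(seq_a, seq_b, k):
    """Inter-molecular case: return the set of k-mers shared by two sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b


# Toy inputs (hypothetical): "ACGT" occurs twice in the first string.
repeats = repeated_kmers("ACGTACGTTT", 4)
shared = common_kmers("ACGTAC", "TTACGT", 4)
```

Fixing `k` keeps the sketch short; finding maximal repeats of arbitrary length is what motivates the index structures used in real sequence-analysis software.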

  14. Comparative Protein Structure Modeling Using MODELLER.

    PubMed

    Webb, Benjamin; Sali, Andrej

    2016-01-01

    Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc. PMID:27322406

  15. Comparative Protein Structure Modeling Using MODELLER.

    PubMed

    Webb, Benjamin; Sali, Andrej

    2016-06-20

    Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.

  16. GlycoMinestruct: a new bioinformatics tool for highly accurate mapping of the human N-linked and O-linked glycoproteomes by incorporating structural features

    PubMed Central

    Li, Fuyi; Li, Chen; Revote, Jerico; Zhang, Yang; Webb, Geoffrey I.; Li, Jian; Song, Jiangning; Lithgow, Trevor

    2016-01-01

    Glycosylation plays an important role in cell-cell adhesion, ligand-binding and subcellular recognition. Current approaches for predicting protein glycosylation are primarily based on sequence-derived features, while little work has been done to systematically assess the importance of structural features to glycosylation prediction. Here, we propose a novel bioinformatics method called GlycoMinestruct (http://glycomine.erc.monash.edu/Lab/GlycoMine_Struct/) for improved prediction of human N- and O-linked glycosylation sites by combining sequence and structural features in an integrated computational framework with a two-step feature-selection strategy. Experiments indicated that GlycoMinestruct outperformed NGlycPred, the only predictor that incorporated both sequence and structure features, achieving AUC values of 0.941 and 0.922 for N- and O-linked glycosylation, respectively, on an independent test dataset. We applied GlycoMinestruct to screen the human structural proteome and obtained high-confidence predictions for N- and O-linked glycosylation sites. GlycoMinestruct can be used as a powerful tool to expedite the discovery of glycosylation events and substrates to facilitate hypothesis-driven experimental studies. PMID:27708373
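
For readers unfamiliar with the AUC figures quoted in this abstract, the sketch below shows one common way to compute the area under the ROC curve: as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name `auc` and the toy scores are hypothetical; nothing here reproduces the GlycoMinestruct evaluation or its data.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison:
    P(score of a positive > score of a negative), ties counted as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predictor scores for true sites and non-sites.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3])  # 5 of 6 pairs ranked correctly
```

The quadratic loop is adequate for a small test set; a rank-based formulation gives the same value in O(n log n) when the dataset is large.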

  17. Systematic, map-scale, comparative structural geology

    SciTech Connect

    Groshong, R.H. Jr.

    1985-01-01

    Interpretation by analogy is the basis of comparative structural geology. A systematic approach to analog selection aids in efficiency and in understanding. The basic interpretive unit for analog selection is the structural family: a map-scale assemblage of genetically related structural forms produced by deformation with approximately constant boundary conditions. A family is specified by the dominant component of its displacement field and by structural levels involved. The differential vertical displacement category includes intrusive and impact structures. The three important basement types are isotropic crystalline, quasisedimentary and metamorphosing. A family is either thin skinned or involves cover plus one of the three basement types. These parameters are arranged into a matrix to produce 20 pigeon holes. Some structures do not fall exactly into one pigeon hole. Other structures link two families; for example, gravity glide links thin-skinned extension and contraction. This system is analogous to end-member rock classifications. Not every example is an end member, but the concept of end members greatly speeds up comparative analysis and clarifies the choice of analogies. Future research will lead to better definition of the key characteristics of certain families, the relationships between families, and the possible existence of additional families.

  18. Bioinformatic evidence for a stem-loop structure 5'-adjacent to the IGR-IRES and for an overlapping gene in the bee paralysis dicistroviruses

    PubMed Central

    Firth, Andrew E; Wang, Qing S; Jan, Eric; Atkins, John F

    2009-01-01

    The family Dicistroviridae (order Picornavirales) includes species that infect insects and other arthropods. These viruses have a linear positive-sense ssRNA genome of ~8-10 kb, which contains two long ORFs. The 5' ORF encodes the nonstructural polyprotein while the 3' ORF encodes the structural polyprotein. The dicistroviruses are noteworthy for the intergenic Internal Ribosome Entry Site (IGR-IRES) that mediates efficient translation initiation on the 3' ORF without the requirement for initiator Met-tRNA. Acute bee paralysis virus, Israel acute paralysis virus of bees and Kashmir bee virus form a distinct subgroup within the Dicistroviridae family. In this brief report, we describe the bioinformatic discovery of a new, apparently coding, ORF in these viruses. The ORF overlaps the 5' end of the structural polyprotein coding sequence in the +1 reading frame. We also identify a potential 14-18 bp RNA stem-loop structure 5'-adjacent to the IGR-IRES. We discuss potential translation initiation mechanisms for the novel ORF in the context of the IGR-IRES and 5'-adjacent stem-loop. PMID:19895695

  19. BioWarehouse: a bioinformatics database warehouse toolkit

    PubMed Central

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

    2006-01-01

    Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the database integration problem for

  20. An Online Bioinformatics Curriculum

    PubMed Central

    Searls, David B.

    2012-01-01

    Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field. PMID:23028269

  1. Bioinformatics software resources.

    PubMed

    Gilbert, Don

    2004-09-01

    This review looks at internet archives, repositories and lists for obtaining popular and useful biology and bioinformatics software. Resources include collections of free software, services for the collaborative development of new programs, software news media and catalogues of links to bioinformatics software and web tools. Problems with such resources arise from needs for continued curator effort to collect and update these, combined with less than optimal community support, funding and collaboration. Despite some problems, the available software repositories provide needed public access to many tools that are a foundation for analyses in bioscience research efforts.

  2. Comparing anisotropic displacement parameters in protein structures.

    PubMed

    Merritt, E A

    1999-12-01

    The increasingly widespread use of synchrotron-radiation sources and cryo-preparation of samples in macromolecular crystallography has led to a dramatic increase in the number of macromolecular structures determined at atomic or near-atomic resolution. This permits expansion of the structural model to include anisotropic displacement parameters U(ij) for individual atoms. In order to explore the physical significance of these parameters in protein structures, it is useful to be able to compare quantitatively the electron-density distribution described by the refined U(ij) values associated with corresponding crystallographically independent atoms. This paper presents the derivation of an easily calculated correlation coefficient in real space between two atoms modeled with anisotropic displacement parameters. This measure is used to investigate the degree of similarity between chemically equivalent but crystallographically independent atoms in the set of protein structural models currently available from the Protein Data Bank.
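
One way to arrive at a real-space correlation coefficient of the kind this abstract describes is to treat each atom as a trivariate Gaussian with covariance matrix U and normalize the overlap integral of the two densities. Under that assumption, standard Gaussian integrals give cc(U, V) = (det U⁻¹ · det V⁻¹)^(1/4) / (det(U⁻¹ + V⁻¹) / 8)^(1/2). The sketch below implements that closed form; it is derived from the stated Gaussian assumption rather than transcribed from the paper, and the function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate divided by the determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]


def adp_correlation(u, v):
    """Normalized overlap of two zero-centered Gaussians with covariances u and v.
    Equals 1 when u == v and decreases as the ellipsoids differ."""
    u_inv, v_inv = inv3(u), inv3(v)
    s = [[u_inv[r][c] + v_inv[r][c] for c in range(3)] for r in range(3)]
    return (det3(u_inv) * det3(v_inv)) ** 0.25 / (det3(s) / 8.0) ** 0.5


# Hypothetical isotropic atoms with mean-square displacements 0.02 and 0.04.
u_iso = [[0.02, 0.0, 0.0], [0.0, 0.02, 0.0], [0.0, 0.0, 0.02]]
v_iso = [[0.04, 0.0, 0.0], [0.0, 0.04, 0.0], [0.0, 0.0, 0.04]]
cc = adp_correlation(u_iso, v_iso)
```

The measure is symmetric in its two arguments, so it can be applied to any pair of chemically equivalent atoms without choosing a reference.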

  3. Bioinformatics and School Biology

    ERIC Educational Resources Information Center

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  4. Towards a career in bioinformatics

    PubMed Central

    2009-01-01

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, dating from 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the areas of life sciences and systems biology, as well as clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in the undergraduate bioinformatics curriculum provides information on how to impart targeted education, to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508

  5. An Inquiry into Protein Structure and Genetic Disease: Introducing Undergraduates to Bioinformatics in a Large Introductory Course

    ERIC Educational Resources Information Center

    Bednarski, April E.; Elgin, Sarah C. R.; Pakrasi, Himadri B.

    2005-01-01

    This inquiry-based lab is designed around genetic diseases with a focus on protein structure and function. To allow students to work on their own investigatory projects, 10 projects on 10 different proteins were developed. Students are grouped in sections of 20 and work in pairs on each of the projects. To begin their investigation, students are…

  6. Structural elements of the mitochondrial preprotein-conducting channel Tom40 dissolved by bioinformatics and mass spectrometry.

    PubMed

    Gessmann, Dennis; Flinner, Nadine; Pfannstiel, Jens; Schlösinger, Andrea; Schleiff, Enrico; Nussberger, Stephan; Mirus, Oliver

    2011-12-01

    Most mitochondrial proteins are imported into mitochondria from the cytosolic compartment. Proteins destined for the outer or inner membrane, the inter-membrane space, or the matrix are recognized and translocated by the TOM machinery containing the specialized protein import channel Tom40. The latter is a protein with β-barrel shape, which is suggested to have evolved from a porin-type protein. To obtain structural insights in the absence of a crystal structure the membrane topology of Tom40 from Neurospora crassa was determined by limited proteolysis combined with mass spectrometry. The results were interpreted on the basis of a structural model that has been generated for NcTom40 by using the structure of mouse VDAC-1 as a template and amino acid sequence information of approximately 270 different Tom40 and approximately 480 VDAC amino acid sequences for refinement. The model largely explains the observed accessible cleavage sites and serves as a structural basis for the investigation of physicochemical properties of the ensemble of our Tom40 sequence data set. By this means we discovered two conserved polar slides in the pore interior. One is possibly involved in the positioning of a pore-inserted helix; the other one might be important for mitochondrial pre-sequence peptide binding as it is only present in Tom40 but not in VDAC proteins. The outer surface of the Tom40 barrel reveals two conserved amino acid clusters. They may be involved in binding other components of the TOM complex or bridging components of the TIM machinery of the mitochondrial inner membrane.

  7. Bioinformatics Analysis Reveals Abundant Short Alpha-Helices as a Common Structural Feature of Oomycete RxLR Effector Proteins

    PubMed Central

    Ye, Wenwu; Wang, Yang; Wang, Yuanchao

    2015-01-01

    RxLR effectors represent one of the largest and most diverse effector families in oomycete plant pathogens. These effectors have attracted enormous attention since they can be delivered inside the plant cell and manipulate host immunity. With the exceptions of a signal peptide and the following RxLR-dEER and C-terminal W/Y/L motifs identified from the sequences themselves, nearly no functional domains have been found. Recently, protein structures of several RxLRs were revealed to comprise alpha-helical bundle repeats. However, approximately half of all RxLRs lack obvious W/Y/L motifs, which are associated with helical structures. In this study, secondary structure prediction of the putative RxLR proteins was performed. We found that the C-terminus of the majority of these RxLR proteins, irrespective of the presence of W/Y/L motifs, contains abundant short alpha-helices. Since a large-scale experimental determination of protein structures has been difficult to date, results of the current study extend our understanding of the oomycete RxLR effectors in protein secondary structures from individual members to the entire family. Moreover, we identified less alpha-helix-rich proteins from secretomes of several oomycete and fungal organisms in which RxLRs have not been identified, providing additional evidence that these organisms are unlikely to harbor RxLR-like proteins. Therefore, these results provide additional information that will aid further studies on the evolution and functional mechanisms of RxLR effectors. PMID:26252511

  8. Highlighting computations in bioscience and bioinformatics: review of the Symposium of Computations in Bioinformatics and Bioscience (SCBB07)

    PubMed Central

    Lu, Guoqing; Ni, Jun

    2008-01-01

    The Second Symposium on Computations in Bioinformatics and Bioscience (SCBB07) was held in Iowa City, Iowa, USA, on August 13–15, 2007. This annual event attracted dozens of bioinformatics professionals and students, who are interested in solving emerging computational problems in bioscience, from China, Japan, Taiwan and the United States. The Scientific Committee of the symposium selected 18 peer-reviewed papers for publication in this supplemental issue of BMC Bioinformatics. These papers cover a broad spectrum of topics in computational biology and bioinformatics, including DNA, protein and genome sequence analysis, gene expression and microarray analysis, computational proteomics and protein structure classification, systems biology and machine learning. PMID:18541044

  9. Bioinformatics pipeline for functional identification and characterization of proteins

    NASA Astrophysics Data System (ADS)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Krzywkowski, Tomasz; Świerkula, Katarzyna; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    The new sequencing methods, called Next Generation Sequencing, give an opportunity to obtain a vast amount of data in a short time. These data require structural and functional annotation. Functional identification and characterization of predicted proteins can be done by in silico approaches, thanks to the numerous computational tools available nowadays. However, there is a need to confirm the results of protein function prediction by using different programs and comparing the results, or to confirm them experimentally. Here we present a bioinformatics pipeline for structural and functional annotation of proteins.

  10. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants.

    PubMed

    Huang, Ming-Der; Huang, Anthony H C

    2015-09-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins. PMID:26232488

  12. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants

    PubMed Central

    Huang, Ming-Der; Huang, Anthony H.C.

    2015-01-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins. PMID:26232488

  13. Multithreaded comparative RNA secondary structure prediction using stochastic context-free grammars

    PubMed Central

    2011-01-01

    Background The prediction of the structure of large RNAs remains a particular challenge in bioinformatics, due to the computational complexity and low levels of accuracy of state-of-the-art algorithms. The pfold model couples a stochastic context-free grammar to phylogenetic analysis for a high accuracy in predictions, but the time complexity of the algorithm and underflow errors have prevented its use for long alignments. Here we present PPfold, a multithreaded version of pfold, which is capable of predicting the structure of large RNA alignments accurately on practical timescales. Results We have distributed both the phylogenetic calculations and the inside-outside algorithm in PPfold, resulting in a significant reduction of runtime on multicore machines. We have addressed the floating-point underflow problems of pfold by implementing an extended-exponent datatype, enabling PPfold to be used for large-scale RNA structure predictions. We have also improved the user interface and portability: alongside standalone executable and Java source code of the program, PPfold is also available as a free plugin to the CLC Workbenches. We have evaluated the accuracy of PPfold using BRaliBase I tests, and demonstrated its practical use by predicting the secondary structure of an alignment of 24 complete HIV-1 genomes in 65 minutes on an 8-core machine and identifying several known structural elements in the prediction. Conclusions PPfold is the first parallelized comparative RNA structure prediction algorithm to date. Based on the pfold model, PPfold is capable of fast, high-quality predictions of large RNA secondary structures, such as the genomes of RNA viruses or long genomic transcripts. The techniques used in the parallelization of this algorithm may be of general applicability to other bioinformatics algorithms. PMID:21501497
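
    The underflow problem mentioned in the abstract above can be illustrated with a minimal sketch. It is not PPfold's actual implementation (which the abstract describes only as an "extended-exponent datatype" in Java); the class name `ExtFloat` and its methods are hypothetical. The idea shown is to store a non-negative number as a normalised mantissa plus an unbounded integer exponent, so that products of many small probabilities do not collapse to zero.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtFloat:
    """Hypothetical extended-exponent number: value = mantissa * 2**exponent.

    Illustration only; restricted to non-negative values (probabilities).
    """
    mantissa: float  # 0.0, or in [0.5, 1.0) after normalisation
    exponent: int    # Python ints are unbounded, so this cannot underflow

    @staticmethod
    def from_float(x: float) -> "ExtFloat":
        if x < 0.0:
            raise ValueError("only non-negative values are supported")
        m, e = math.frexp(x)  # x == m * 2**e, with m in [0.5, 1) or m == 0
        return ExtFloat(m, e)

    def __mul__(self, other: "ExtFloat") -> "ExtFloat":
        # The mantissa product lies in [0.25, 1), so it never underflows;
        # frexp renormalises it and the exponents are added as integers.
        m, e = math.frexp(self.mantissa * other.mantissa)
        return ExtFloat(m, e + self.exponent + other.exponent)

    def log2(self) -> float:
        """Base-2 logarithm of the stored value (undefined for zero)."""
        return math.log2(self.mantissa) + self.exponent


if __name__ == "__main__":
    tiny = 1e-200
    print(tiny * tiny)  # plain doubles underflow to 0.0
    product = ExtFloat.from_float(tiny) * ExtFloat.from_float(tiny)
    print(product.log2())  # finite: the magnitude is preserved
```

    Only multiplication is sketched here; an inside-outside computation would also need addition, which requires aligning the two exponents before summing the mantissas.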

  14. Feature selection in bioinformatics

    NASA Astrophysics Data System (ADS)

    Wang, Lipo

    2012-06-01

    In bioinformatics, there are often a large number of input features. For example, there are millions of single nucleotide polymorphisms (SNPs) that are genetic variations which determine the difference between any two unrelated individuals. In microarrays, thousands of genes can be profiled in each test. It is important to find out which input features (e.g., SNPs or genes) are useful in classification of a certain group of people or diagnosis of a given disease. In this paper, we investigate some powerful feature selection techniques and apply them to problems in bioinformatics. We are able to identify a very small number of input features sufficient for tasks at hand and we demonstrate this with some real-world data.

  15. Forensic DNA and bioinformatics.

    PubMed

    Bianchi, Lucia; Liò, Pietro

    2007-03-01

    The field of forensic science is increasingly based on biomolecular data and many European countries are establishing forensic databases to store DNA profiles of crime scenes of known offenders and apply DNA testing. The field is boosted by statistical and technological advances such as DNA microarray sequencing, TFT biosensors, machine learning algorithms, in particular Bayesian networks, which provide an effective way of evidence organization and inference. The aim of this article is to discuss the state-of-the-art potential of bioinformatics in forensic DNA science. We also discuss how bioinformatics will address issues related to privacy rights such as those raised from large scale integration of crime, public health and population genetic susceptibility-to-diseases databases.

  16. A Guide to Bioinformatics for Immunologists

    PubMed Central

    Whelan, Fiona J.; Yap, Nicholas V. L.; Surette, Michael G.; Golding, G. Brian; Bowdish, Dawn M. E.

    2013-01-01

    Bioinformatics includes a suite of methods, which are cheap, approachable, and many of which are easily accessible without any sort of specialized bioinformatic training. Yet, despite this, bioinformatic tools are under-utilized by immunologists. Herein, we review a representative set of publicly available, easy-to-use bioinformatic tools using our own research on an under-annotated human gene, SCARA3, as an example. SCARA3 shares an evolutionary relationship with the class A scavenger receptors, but preliminary research showed that it was divergent enough that its function remained unclear. In our quest for more information about this gene – did it share gene sequence similarities to other scavenger receptors? Did it contain conserved protein domains? Where was it expressed in the human body? – we discovered the power and informative potential of publicly available bioinformatic tools designed for the novice in mind, which allowed us to hypothesize on the regulation, structure, and function of this protein. We argue that these tools are largely applicable to many facets of immunology research. PMID:24363654

  17. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
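
    The "huge number of possible trees" noted in the abstract above can be made concrete with a short sketch. It uses the standard counting result for strictly bifurcating trees with n labelled leaves: (2n-5)!! unrooted topologies and (2n-3)!! rooted ones. The function names are illustrative and do not come from the reviewed paper.

```python
def odd_double_factorial(k: int) -> int:
    """Return k!! = k * (k-2) * (k-4) * ... * 1 for odd k (1 if k < 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_trees(n_otus: int) -> int:
    """Number of unrooted bifurcating topologies for n labelled OTUs."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs for an unrooted tree")
    return odd_double_factorial(2 * n_otus - 5)


def num_rooted_trees(n_otus: int) -> int:
    """Number of rooted bifurcating topologies for n labelled OTUs."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs for a rooted tree")
    return odd_double_factorial(2 * n_otus - 3)


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_trees(n), num_rooted_trees(n))
    # For n = 10 this prints 2027025 unrooted and 34459425 rooted topologies.
```

    Because the count grows super-exponentially with the number of OTUs, exhaustive enumeration is feasible only for very small samples, which is why tree search relies on heuristics.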

  18. Adapting bioinformatics curricula for big data.

    PubMed

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs.

  19. Neuroinformatics: from bioinformatics to databasing the brain.

    PubMed

    Morse, Thomas M

    2008-01-01

    Neuroinformatics seeks to create and maintain web-accessible databases of experimental and computational data, together with innovative software tools, essential for understanding the nervous system in its normal function and in neurological disorders. Neuroinformatics includes traditional bioinformatics of gene and protein sequences in the brain; atlases of brain anatomy and localization of genes and proteins; imaging of brain cells; brain imaging by positron emission tomography (PET), functional magnetic resonance imaging (fMRI), electroencephalography (EEG), magnetoencephalography (MEG) and other methods; many electrophysiological recording methods; and clinical neurological data, among others. Building neuroinformatics databases and tools presents difficult challenges because they span a wide range of spatial scales and types of data stored and analyzed. Traditional bioinformatics, by comparison, focuses primarily on genomic and proteomic data (which of course also presents difficult challenges). Much of bioinformatics analysis focuses on sequences (DNA, RNA, and protein molecules), as the type of data that are stored, compared, and sometimes modeled. Bioinformatics is undergoing explosive growth with the addition, for example, of databases that catalog interactions between proteins, of databases that track the evolution of genes, and of systems biology databases which contain models of all aspects of organisms. This commentary briefly reviews neuroinformatics with clarification of its relationship to traditional and modern bioinformatics.

  20. A structural bioinformatics approach for identifying proteins predisposed to bind linear epitopes on pre-selected target proteins.

    PubMed

    Choi, Eun Jung; Jacak, Ron; Kuhlman, Brian

    2013-04-01

    We have developed a protocol for identifying proteins that are predisposed to bind linear epitopes on target proteins of interest. The protocol searches through the protein database for proteins (scaffolds) that are bound to peptides with sequences similar to accessible, linear epitopes on the target protein. The sequence match is considered more significant if residues calculated to be important in the scaffold-peptide interaction are present in the target epitope. The crystal structure of the scaffold-peptide complex is then used as a template for creating a model of the scaffold bound to the target epitope. This model can then be used in conjunction with sequence optimization algorithms or directed evolution methods to search for scaffold mutations that further increase affinity for the target protein. To test the applicability of this approach we targeted three disease-causing proteins: a tuberculosis virulence factor (TVF), the apical membrane antigen (AMA) from malaria, and hemagglutinin from influenza. In each case the best scoring scaffold was tested, and binders with Kds equal to 37 μM and 50 nM for TVF and AMA, respectively, were identified. A web server (http://rosettadesign.med.unc.edu/scaffold/) has been created for performing the scaffold search process with user-defined target sequences.

  1. A novel method to compare protein structures using local descriptors

    PubMed Central

    2011-01-01

    Background Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure - DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL. PMID:21849047

  2. Fold assessment for comparative protein structure modeling.

    PubMed

    Melo, Francisco; Sali, Andrej

    2007-11-01

    Accurate and automated assessment of both geometrical errors and incompleteness of comparative protein structure models is necessary for an adequate use of the models. Here, we describe a composite score for discriminating between models with the correct and incorrect fold. To find an accurate composite score, we designed and applied a genetic algorithm method that searched for a most informative subset of 21 input model features as well as their optimized nonlinear transformation into the composite score. The 21 input features included various statistical potential scores, stereochemistry quality descriptors, sequence alignment scores, geometrical descriptors, and measures of protein packing. The optimized composite score was found to depend on (1) a statistical potential z-score for residue accessibilities and distances, (2) model compactness, and (3) percentage sequence identity of the alignment used to build the model. The accuracy of the composite score was compared with the accuracy of assessment by single and combined features as well as by other commonly used assessment methods. The testing set was representative of models produced by automated comparative modeling on a genomic scale. The composite score performed better than any other tested score in terms of the maximum correct classification rate (i.e., 3.3% false positives and 2.5% false negatives) as well as the sensitivity and specificity across the whole range of thresholds. The composite score was implemented in our program MODELLER-8 and was used to assess models in the MODBASE database that contains comparative models for domains in approximately 1.3 million protein sequences.

  3. Improvement of Student Understanding of How Kinetic Data Facilitates the Determination of Amino Acid Catalytic Function through an Alkaline Phosphatase Structure/Mechanism Bioinformatics Exercise

    ERIC Educational Resources Information Center

    Grunwald, Sandra K.; Krueger, Katherine J.

    2008-01-01

    Laboratory exercises, which utilize alkaline phosphatase as a model enzyme, have been developed and used extensively in undergraduate biochemistry courses to illustrate enzyme steady-state kinetics. A bioinformatics laboratory exercise for the biochemistry laboratory, which complements the traditional alkaline phosphatase kinetics exercise, was…

  4. Company strategies for using bioinformatics.

    PubMed

    Bains, W

    1996-08-01

    Bioinformatics enables biotechnology companies to access and analyse their growing databases of experimental results, and to exploit public data from genome programmes and other sources. Traditionally occupying the domain of a 'guru' supplying answers to infrequent research questions, corporate bioinformatics is breaking down under the flood of data. New, more robust, professional and expandable systems will give scientists effective access to new tools. This review outlines how companies have evolved beyond the 'guru', and have organized their bioinformatics by acquiring or developing bioinformatics resources. It also describes why the biologist must be central to this process, and why this is a problem for computer professionals to solve, not for 'gurus'.

  5. Pattern recognition in bioinformatics.

    PubMed

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained.

  6. LXtoo: an integrated live Linux distribution for the bioinformatics community

    PubMed Central

    2012-01-01

    Background Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356

  7. Virtual Bioinformatics Distance Learning Suite

    ERIC Educational Resources Information Center

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  8. Bioinformatics meets parasitology.

    PubMed

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention.

  9. Comparing Factor Structures of Adolescent Psychopathology

    ERIC Educational Resources Information Center

    Verona, Edelyn; Javdani, Shabnam; Sprague, Jenessa

    2011-01-01

    Research on the structure of adolescent psychopathology can provide information on broad factors that underlie different forms of maladjustment in youths. Multiple studies from the literature on adult populations suggest that 2 factors, Internalizing and Externalizing, meaningfully comprise the factor structure of adult psychopathology (e.g.,…

  10. Virtual bioinformatics distance learning suite.

    PubMed

    Tolvanen, Martti; Vihinen, Mauno

    2004-05-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material over the Internet. Currently, we provide two fully computer-based courses, "Introduction to Bioinformatics" and "Bioinformatics in Functional Genomics." Here we will discuss the application of distance learning in bioinformatics training and our experiences gained during the 3 years that we have run the courses, with about 400 students from a number of universities. The courses are available at bioinf.uta.fi.

  11. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    ERIC Educational Resources Information Center

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  12. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  13. Chapter 16: Text Mining for Translational Bioinformatics

    PubMed Central

    Cohen, K. Bretonnel; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research—translating basic science results into new interventions—and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944

  14. Uncertainty of Comparative Judgments and Multidimensional Structure

    ERIC Educational Resources Information Center

    Sjoberg, Lennart

    1975-01-01

    An analysis of preferences with respect to silhouette drawings of nude females is presented. Systematic intransitivities were discovered. The dispersions of differences (comparatal dispersions) were shown to reflect the multidimensional structure of the stimuli, a finding expected on the basis of prior work. (Author)

  15. Reactance, Restoration, and Cognitive Structure: Comparative Statics

    ERIC Educational Resources Information Center

    Bessarabova, Elena; Fink, Edward L.; Turner, Monique

    2013-01-01

    This study (N = 143) examined the effects of freedom threat on cognitive structures, using recycling as its topic. The results of a 2(Freedom Threat: low vs. high) x 2(Postscript: restoration vs. filler) plus 1(Control) experiment indicated that, relative to the control condition, high freedom threat created a boomerang effect for the targeted…

  16. A comparative study of the reported performance of ab initio protein structure prediction algorithms.

    PubMed

    Helles, Glennie

    2008-04-01

    Protein structure prediction is one of the major challenges in bioinformatics today. Throughout the past five decades, many different algorithmic approaches have been attempted, and although progress has been made the problem remains unsolvable even for many small proteins. While the general objective is to predict the three-dimensional structure from primary sequence, our current knowledge and computational power are simply insufficient to solve a problem of such high complexity. Some prediction algorithms do, however, appear to perform better than others, although it is not always obvious which ones they are and it is perhaps even less obvious why that is. In this review, the reported performance results from 18 different recently published prediction algorithms are compared. Furthermore, the general algorithmic settings most likely responsible for the difference in the reported performance are identified, and the specific settings of each of the 18 prediction algorithms are also compared. The average normalized r.m.s.d. scores reported range from 11.17 to 3.48. With a performance measure including both r.m.s.d. scores and CPU time, the currently best-performing prediction algorithm is identified to be the I-TASSER algorithm. Two of the algorithmic settings--protein representation and fragment assembly--were found to have definite positive influence on the running time and the predicted structures, respectively. There thus appears to be a clear benefit from incorporating this knowledge in the design of new prediction algorithms.

  17. Compare, Contrast, Comprehend: Using Compare-Contrast Text Structures with ELLs in K-3 Classrooms

    ERIC Educational Resources Information Center

    Dreher, Mariam Jean; Gray, Jennifer Letcher

    2009-01-01

    In this article, we describe how to help primary-grade English language learners use compare-contrast text structures. Specifically, we explain (a) how to teach students to identify the compare-contrast text structure, and to use this structure to support their comprehension, (b) how to use compare-contrast texts to activate and extend students'…

  18. Bioinformatic pipelines in Python with Leaf

    PubMed Central

    2013-01-01

    Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315

  19. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    PubMed

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  20. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    PubMed Central

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  1. Microbial bioinformatics 2020.

    PubMed

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  2. Integration of bioinformatics to biodegradation

    PubMed Central

    2014-01-01

    Bioinformatics and biodegradation are two primary scientific fields in applied microbiology and biotechnology. The present review describes the development of various bioinformatics tools that may be applied in the field of biodegradation. Several databases, including the University of Minnesota Biocatalysis/Biodegradation database (UM-BBD), a database of biodegradative oxygenases (OxDBase), the Biodegradation Network-Molecular Biology Database (Bionemo), MetaCyc, and BioCyc, have been developed to enable access to information related to the biochemistry and genetics of microbial degradation. In addition, several bioinformatics tools for predicting the toxicity and biodegradation of chemicals have been developed. Furthermore, the whole genomes of several potential degrading bacteria have been sequenced and annotated using bioinformatics tools. PMID:24808763

  3. Bioinformatics strategies for the analysis of lipids.

    PubMed

    Wheelock, Craig E; Goto, Susumu; Yetukuri, Laxman; D'Alexandri, Fabio Luiz; Klukas, Christian; Schreiber, Falk; Oresic, Matej

    2009-01-01

    Owing to their importance in cellular physiology and pathology as well as to recent technological advances, the study of lipids has reemerged as a major research target. However, the structural diversity of lipids presents a number of analytical and informatics challenges. The field of lipidomics is a new postgenome discipline that aims to develop comprehensive methods for lipid analysis, necessitating concomitant developments in bioinformatics. The evolving research paradigm requires that new bioinformatics approaches accommodate genomic as well as high-level perspectives, integrating genome, protein, chemical and network information. The incorporation of lipidomics information into these data structures will provide mechanistic understanding of lipid functions and interactions in the context of cellular and organismal physiology. Accordingly, it is vital that specific bioinformatics methods be developed to analyze the wealth of lipid data being acquired. Herein, we present an overview of the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and application of its tools to the analysis of lipid data. We also describe a series of software tools and databases (KGML-ED, VANTED, MZmine, and LipidDB) that can be used for the processing of lipidomics data and biochemical pathway reconstruction, an important next step in the development of the lipidomics field.

  4. Bioconductor: open software development for computational biology and bioinformatics

    PubMed Central

    Gentleman, Robert C; Carey, Vincent J; Bates, Douglas M; Bolstad, Ben; Dettling, Marcel; Dudoit, Sandrine; Ellis, Byron; Gautier, Laurent; Ge, Yongchao; Gentry, Jeff; Hornik, Kurt; Hothorn, Torsten; Huber, Wolfgang; Iacus, Stefano; Irizarry, Rafael; Leisch, Friedrich; Li, Cheng; Maechler, Martin; Rossini, Anthony J; Sawitzki, Gunther; Smith, Colin; Smyth, Gordon; Tierney, Luke; Yang, Jean YH; Zhang, Jianhua

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples. PMID:15461798

  5. Comparative BioInformatics and Computational Toxicology

    EPA Science Inventory

    Reflecting the numerous changes in the field since the publication of the previous edition, this third edition of Developmental Toxicology focuses on the mechanisms of developmental toxicity and incorporates current technologies for testing in the risk assessment process.

  6. Bioinformatics of cardiovascular miRNA biology.

    PubMed

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'.

  7. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are routinely involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Second, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  8. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are routinely involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Second, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  9. Adapting bioinformatics curricula for big data

    PubMed Central

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  10. The use of antioptimization to compare alternative structural models

    NASA Technical Reports Server (NTRS)

    Gangadharan, S. N.; Nikolaidis, E.; Lee, K.; Haftka, R. T.

    1993-01-01

    Structural models are usually tested by comparing their response with that of a reference structure (an actual structure or a more refined model) to a limited number of arbitrary loads. This test is not always reliable because the loads are arbitrary. An antioptimization-based method is proposed to test structural models. This method compares a structural model with a reference model or an actual structure under the worst loading case that maximizes the error in the model. Specifically, the method identifies the loading case that maximizes the difference between the responses of two models of the same structure using optimization. This method can be used to design experiments in order to validate a structural model. It can also be applied to identify damage in a structure by determining the load that maximizes the difference in the behavior of the damaged and the intact structure. The proposed method is illustrated by applying it to a plate and an automotive structure.
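
    The worst-case load search described in this abstract can be sketched for the simplest setting. The sketch below assumes, beyond what the abstract states, that both models are linear (response = flexibility matrix times load), that admissible loads have unit Euclidean norm, and that the error is the Euclidean norm of the response difference. Under those assumptions the worst load is the dominant eigenvector of the Gram matrix of the difference, found here by power iteration. The function name `worst_case_load` and its arguments are hypothetical, and this is not presented as the authors' algorithm.

```python
import math


def worst_case_load(flex_a, flex_b, iterations=200):
    """Return (load, error) for the unit-norm load maximizing ||(A - B) f||.

    Assumptions (not from the abstract): linear models given as flexibility
    matrices (lists of rows), unit Euclidean bound on the load vector.
    """
    n_resp = len(flex_a)
    n_load = len(flex_a[0])
    # Difference in response per unit load between the two models.
    diff = [[flex_a[i][j] - flex_b[i][j] for j in range(n_load)]
            for i in range(n_resp)]
    # Gram matrix G = diff^T diff; its dominant eigenvector is the worst load.
    gram = [[sum(diff[k][i] * diff[k][j] for k in range(n_resp))
             for j in range(n_load)] for i in range(n_load)]
    # Power iteration from a uniform unit vector.
    load = [1.0 / math.sqrt(n_load)] * n_load
    for _ in range(iterations):
        nxt = [sum(gram[i][j] * load[j] for j in range(n_load))
               for i in range(n_load)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # current load produces no disagreement; stop iterating
        load = [x / norm for x in nxt]
    # Model disagreement produced by the load that was found.
    gap = [sum(diff[i][j] * load[j] for j in range(n_load))
           for i in range(n_resp)]
    error = math.sqrt(sum(x * x for x in gap))
    return load, error
```

    Calling `worst_case_load(model_a, model_b)` with two flexibility matrices of equal shape returns the load direction along which the models disagree most, which is the case an arbitrary test load could miss. Power iteration can stall if the starting vector happens to be orthogonal to the dominant direction, so a production version would more likely use a singular value decomposition, or a general constrained optimizer for nonlinear models.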

  11. Bioinformatics Approach in Plant Genomic Research.

    PubMed

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    Advances in genomics technology have dramatically changed plant biology research. Plant biologists now have easy access to enormous amounts of genomic data with which to study high-density genetic variation in plants at the molecular level. Therefore, a thorough understanding and skilled use of bioinformatics tools to manage and analyze these data are essential in current plant genome research. Many plant genome databases have been established and continue to expand. Meanwhile, analytical methods based on bioinformatics are also well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant upgrading of computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the critical future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  12. Taking Bioinformatics to Systems Medicine.

    PubMed

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.
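
    The network step described in this abstract (building a gene network from omics data and decomposing it into modules) can be illustrated with a deliberately simple sketch. It assumes, beyond what the abstract states, that the network is a co-expression graph obtained by thresholding absolute Pearson correlation, and that modules are the connected components of that graph. The names `pearson` and `coexpression_modules` and the default threshold are hypothetical; real studies typically use more refined module-detection methods.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def coexpression_modules(expression, threshold=0.9):
    """Group genes into modules (connected components of the network).

    expression: dict mapping gene name -> list of expression values,
    one value per sample (hypothetical input layout).
    """
    genes = sorted(expression)
    neighbours = {g: set() for g in genes}
    # Network construction: connect strongly (anti-)correlated gene pairs.
    for a, b in combinations(genes, 2):
        if abs(pearson(expression[a], expression[b])) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Decomposition: depth-first search collects each connected component.
    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        stack, module = [gene], set()
        while stack:
            current = stack.pop()
            if current in module:
                continue
            module.add(current)
            stack.extend(neighbours[current] - module)
        seen |= module
        modules.append(sorted(module))
    return modules
```

    Each returned module is a sorted list of gene names. Lowering `threshold` merges modules and raising it splits them, so the choice of cut-off is itself an assumption that would need to be justified against the data.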

  13. Generations of interdisciplinarity in bioinformatics

    PubMed Central

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life sciences, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  14. Evolution and physics in comparative protein structure modeling.

    PubMed

    Fiser, András; Feig, Michael; Brooks, Charles L; Sali, Andrej

    2002-06-01

    From a physical perspective, the native structure of a protein is a consequence of physical forces acting on the protein and solvent atoms during the folding process. From a biological perspective, the native structure of proteins is a result of evolution over millions of years. Correspondingly, there are two types of protein structure prediction methods, de novo prediction and comparative modeling. We review comparative protein structure modeling and discuss the incorporation of physical considerations into the modeling process. A good starting point for achieving this aim is provided by comparative modeling by satisfaction of spatial restraints. Incorporation of physical considerations is illustrated by an inclusion of solvation effects into the modeling of loops.

  15. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  16. Reproducible Bioinformatics Research for Biologists

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  17. Bioinformatics and the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  18. No-boundary thinking in bioinformatics research.

    PubMed

    Huang, Xiuzhen; Bruce, Barry; Buchan, Alison; Congdon, Clare Bates; Cramer, Carole L; Jennings, Steven F; Jiang, Hongmei; Li, Zenglu; McClure, Gail; McMullen, Rick; Moore, Jason H; Nanduri, Bindu; Peckham, Joan; Perkins, Andy; Polson, Shawn W; Rekepalli, Bhanu; Salem, Saeed; Specker, Jennifer; Wunsch, Donald; Xiong, Donghai; Zhang, Shuzhong; Zhao, Zhongming

    2013-11-06

    Currently there are definitions from many agencies and research societies defining "bioinformatics" as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT).

  19. Teaching Bioinformatics in Concert

    PubMed Central

    Goodman, Anya L.; Dekhtyar, Alex

    2014-01-01

    Can biology students without programming skills solve problems that require computational solutions? They can if they learn to cooperate effectively with computer science students. The goal of the in-concert teaching approach is to introduce biology students to computational thinking by engaging them in collaborative projects structured around the software development process. Our approach emphasizes development of interdisciplinary communication and collaboration skills for both life science and computer science students. PMID:25411792

  20. Teaching bioinformatics in concert.

    PubMed

    Goodman, Anya L; Dekhtyar, Alex

    2014-11-01

    Can biology students without programming skills solve problems that require computational solutions? They can if they learn to cooperate effectively with computer science students. The goal of the in-concert teaching approach is to introduce biology students to computational thinking by engaging them in collaborative projects structured around the software development process. Our approach emphasizes development of interdisciplinary communication and collaboration skills for both life science and computer science students. PMID:25411792

  2. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    PubMed Central

    Obom, Kristina M.; Cummings, Patrick J.

    2007-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation. PMID:23653816

  3. Bioinformatics in the information age

    SciTech Connect

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  4. Receptor-binding sites: bioinformatic approaches.

    PubMed

    Flower, Darren R

    2006-01-01

    It is increasingly clear that both transient and long-lasting interactions between biomacromolecules and their molecular partners are the most fundamental of all biological mechanisms and lie at the conceptual heart of protein function. In particular, the protein-binding site is the most fascinating and important mechanistic arbiter of protein function. In this review, I examine the nature of protein-binding sites found in both ligand-binding receptors and substrate-binding enzymes. I highlight two important concepts underlying the identification and analysis of binding sites. The first is based on knowledge: when one knows the location of a binding site in one protein, one can "inherit" the site from one protein to another. The second approach involves the a priori prediction of a binding site from a sequence or a structure. The full and complete analysis of binding sites will necessarily involve the full range of informatic techniques ranging from sequence-based bioinformatic analysis through structural bioinformatics to computational chemistry and molecular physics. Integration of both diverse experimental and diverse theoretical approaches is thus a mandatory requirement in the evaluation of binding sites and the binding events that occur within them. PMID:16671408
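
    The knowledge-based approach mentioned in this abstract, "inheriting" a binding site from one protein to another, can be sketched as a coordinate transfer through a pairwise alignment. The sketch assumes, beyond what the abstract states, that a gapped pairwise alignment of the two sequences is already available and that site residues are given as 1-based positions in the ungapped known protein. The function name `inherit_site` and the input layout are hypothetical.

```python
def inherit_site(aligned_known, aligned_target, site_positions):
    """Map binding-site residues of a known protein onto a target protein.

    aligned_known, aligned_target: equal-length gapped strings ('-' = gap).
    site_positions: 1-based residue numbers in the ungapped known sequence.
    Returns {known_position: (target_position, target_residue)}; the value
    is None when the site residue is aligned to a gap in the target.
    """
    if len(aligned_known) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(site_positions)
    known_idx = 0   # residues of the known protein consumed so far
    target_idx = 0  # residues of the target protein consumed so far
    mapping = {}
    for res_known, res_target in zip(aligned_known, aligned_target):
        if res_known != "-":
            known_idx += 1
        if res_target != "-":
            target_idx += 1
        if res_known != "-" and known_idx in wanted:
            if res_target == "-":
                mapping[known_idx] = None  # no equivalent residue
            else:
                mapping[known_idx] = (target_idx, res_target)
    return mapping
```

    For example, `inherit_site("MKT-LHGA", "M-TSLHCA", [2, 3, 5])` reports that residue 2 has no counterpart in the target while residues 3 and 5 map to target positions 2 and 5. Whether an inherited site is functional still depends on residue conservation and structural context, which this positional transfer does not assess.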

  5. Comparative modeling of InP solar cell structures

    NASA Technical Reports Server (NTRS)

    Jain, R. K.; Weinberg, I.; Flood, D. J.

    1991-01-01

    The comparative modeling of p(+)n and n(+)p indium phosphide solar cell structures is studied using the numerical program PC-1D. The optimal design study has predicted that the p(+)n structure offers improved cell efficiencies compared to the n(+)p structure, due to a higher open-circuit voltage. The various cell material and process parameters needed to achieve the maximum cell efficiencies are reported. The effect of some of the cell parameters on InP cell I-V characteristics was studied. The available radiation resistance data on n(+)p and p(+)n InP solar cells are also critically discussed.

  6. A Bioinformatics Facility for NASA

    NASA Technical Reports Server (NTRS)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  7. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  8. Protein bioinformatics applied to virology.

    PubMed

    Mohabatkar, Hassan; Keyhanfar, Mehrnaz; Behbahani, Mandana

    2012-09-01

    Scientists have united in a common search to sequence, store and analyze genes and proteins. In this regard, rapidly evolving bioinformatics methods are providing valuable information on these newly-discovered molecules. Understanding what has been done and what we can do in silico is essential in designing new experiments. The imbalance between sequence-known proteins and attribute-known proteins has called for the development of computational methods or high-throughput automated tools for rapidly and reliably predicting or identifying various characteristics of uncharacterized proteins. Taking into consideration the role of viruses in causing diseases and their use in biotechnology, the present review describes the application of protein bioinformatics in virology. Therefore, a number of important features of viral proteins, such as epitope prediction, protein docking, subcellular localization, viral protease cleavage sites and computer-based comparison of their aspects, are discussed. This paper also describes several tools developed principally for viral bioinformatics. Predicting viral protein features and keeping up with advances in this field can aid basic understanding of the relationship between a virus and its host.

  9. A services oriented system for bioinformatics applications on the grid.

    PubMed

    Aloisio, Giovanni; Cafaro, Massimo; Epicoco, Italo; Fiore, Sandro; Mirto, Maria

    2007-01-01

    This paper describes the evolution of the main services of the ProGenGrid (Proteomics & Genomics Grid) system, a distributed and ubiquitous grid environment ("virtual laboratory"), based on Workflow and supporting the design, execution and monitoring of "in silico" experiments in bioinformatics. ProGenGrid is a Grid-based Problem Solving Environment that allows the composition of data sources and bioinformatics programs wrapped as Web Services (WS). The use of WS provides ease of use and fosters re-use. The resulting workflow of WS is then scheduled on the Grid, leveraging Grid-middleware services. In particular, ProGenGrid offers a modular bag of services and currently is focused on the biological simulation of two important bioinformatics problems: prediction of the secondary structure of proteins, and sequence alignment of proteins. Both services are based on an enhanced data access service.

  10. Robust enzyme design: bioinformatic tools for improved protein stability.

    PubMed

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.

  11. Comparative testing of nondestructive examination techniques for concrete structures

    NASA Astrophysics Data System (ADS)

    Clayton, Dwight A.; Smith, Cyrus M.

    2014-03-01

    A multitude of concrete-based structures are typically part of a light water reactor (LWR) plant to provide foundation, support, shielding, and containment functions. Concrete has been used in the construction of nuclear power plants (NPPs) because of three primary properties: its inexpensiveness, its structural strength, and its ability to shield radiation. Examples of concrete structures important to the safety of LWR plants include the containment building, the spent fuel pool, and cooling towers. Comparative testing of the various NDE concrete measurement techniques requires concrete samples with known material properties, voids, internal microstructure flaws, and reinforcement locations. These samples can be artificially created under laboratory conditions where the various properties can be controlled. Other than NPPs, there are not many applications where critical concrete structures are as thick and reinforced. Therefore, there are not many industries other than the nuclear power plant or power plant industry that are interested in performing NDE on thick and reinforced concrete structures. This leads to a lack of readily available samples of thick and heavily reinforced concrete for performing NDE evaluations, research, and training. The industry that typically performs the most NDE on concrete structures is the bridge and roadway industry. While bridge and roadway structures are thinner and less reinforced, they have a good base of NDE research to support their field NDE programs to detect, identify, and repair concrete failures. This paper will summarize the initial comparative testing of two concrete samples with an emphasis on how these techniques could perform on NPP concrete structures.

  12. Structure and comparative morphology of camptotrichia of lungfish fins.

    PubMed

    Geraudie, J; Meunier, F J

    1984-01-01

    The present work is devoted to the organization and ultrastructure of the fin rays or camptotrichia of two living Dipnoi (lungfishes) Protopterus and Neoceratodus. In both species, these rods have a dual structure: only the superficial region facing the stratified epidermis is mineralized while the deep one is made of a dense unmineralized network of collagen fibrils forming a permanent pre-osseous tissue. Only the camptotrichia of Neoceratodus is made of cellular bone. This study confirms the structural peculiarities of these camptotrichia when compared to the dermal skeleton of the Actinopterygii constituted by the bony lepidotrichia and the actinotrichia. These results are discussed and compared to fossil dipnoan fin rays. PMID:6740649

  13. Comparative genomics for understanding the structure, function and sub-cellular localization of hypothetical proteins in Thermanerovibrio acidaminovorans DSM 6589 (tai).

    PubMed

    Thakare, Hitesh S; Meshram, Dilip B; Jangam, Chandrakant M; Labhasetwar, Pawan; Roychoudhary, Kunal; Ingle, Arun B

    2016-04-01

    Thermanerovibrio acidaminovorans DSM 6589 (tai) is a unique bacterium isolated from an anaerobic sludge bed reactor at a sugar refinery in the Netherlands. Comparative genomic studies for understanding the hypothetical proteins in T. acidaminovorans DSM 6589 (tai) were carried out using different bioinformatic tools and web servers. In all, 320 hypothetical proteins were screened from the total available genome. In silico function prediction for the 320 hypothetical proteins was achieved using different online servers such as CDD-Blast, Interproscan and pfam, whereas structure prediction for 202 hypothetical proteins was carried out using a protein structure prediction server (PS2 server). The sub-cellular localization of the 320 identified proteins was predicted using cello v2.5. The study has helped us to understand the structures and functions of unknown proteins in T. acidaminovorans DSM 6589 (tai) through a comparative genomic approach. PMID:26930563

  14. A comparative study of ship hull structures fatigue assessment methods

    NASA Astrophysics Data System (ADS)

    Petinov, Sergei V.; Polezhayeva, Helena A.; Yermolayeva, Natalya S.

    1992-07-01

    Several methods of fatigue assessment in ship hull structures are compared. The analysis is focused on fatigue problems of hull structures concerning: evaluation, the design state of fatigue damage of a structure formulation, and the adequacy of methods and data bases for the purpose of the analyses. To illustrate the discussion, examples of allowable nominal stress at a given fatigue life calculation are presented for bottom frame web slot and for a bottom longitudinal transverse bulkhead bracket connection in the case of a container ship. The low cycle (local strain) method is regarded as the most advantageous at present in almost all practical problems connected to fatigue.

  15. Bioinformatics analysis of the epitope regions for norovirus capsid protein

    PubMed Central

    2013-01-01

    Background Norovirus is the major cause of nonbacterial epidemic gastroenteritis, being highly prevalent in both developing and developed countries. Despite the available monoclonal antibodies (MAbs) for different sub-genogroups, a comprehensive epitope analysis based on various bioinformatics technologies is highly desired for future potential antibody development in clinical diagnosis and treatment. Methods A total of 18 full-length human norovirus capsid protein sequences were downloaded from GenBank. Protein modeling was performed with the program Modeller 9.9. The modeled 3D structures of the capsid protein of norovirus were submitted to the protein antigen spatial epitope prediction webserver (SEPPA) for predicting the possible spatial epitopes with the default threshold. The results were processed using the Biosoftware. Results Compared with GI, we found that the GII genogroup had four deletions and two special insertions in the VP1 region. The predicted conformational epitope regions were mainly concentrated in the N-terminal (1~96), Middle Part (298~305, 355~375) and C-terminal (560~570). We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. Conclusions The predicted conformational epitope regions of norovirus VP1 were mainly concentrated in the N-terminal, Middle Part and C-terminal. We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. The overlap with experimental epitopes indicates the important role of the latest computational technologies. With the fast development of computational immunology tools, the bioinformatics pipeline will be more and more critical to vaccine design. PMID:23514273

  16. DSSTOX STRUCTURE-SEARCHABLE PUBLIC TOXICITY DATABASE NETWORK: CURRENT PROGRESS AND NEW INITIATIVES TO IMPROVE CHEMO-BIOINFORMATICS CAPABILITIES

    EPA Science Inventory

    The EPA DSSTox website (http://www.epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...

  17. Structures of School Systems Worldwide: A Comparative Study

    ERIC Educational Resources Information Center

    Popov, Nikolay

    2012-01-01

    In the past 20 years I have been examining the structures of school systems worldwide. This ongoing research has been enriched by the findings obtained from the lecture course on Comparative Education I have been delivering to students in the Bachelor and Master's Education Programs at Sofia University, Bulgaria. This paper presents some results…

  18. Bioinformatics in Africa: The Rise of Ghana?

    PubMed

    Karikari, Thomas K

    2015-09-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics.

  19. Bioinformatics in Africa: The Rise of Ghana?

    PubMed Central

    Karikari, Thomas K.

    2015-01-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  20. Comparing High-latitude Ionospheric and Thermospheric Lagrangian Coherent Structures

    NASA Astrophysics Data System (ADS)

    Wang, N.; Ramirez, U.; Flores, F.; Okic, D.; Datta-Barua, S.

    2015-12-01

    Lagrangian Coherent Structures (LCSs) are invisible boundaries in time-varying flow fields that may be subject to mixing and turbulence. The LCS is defined by the local maxima of the finite time Lyapunov exponent (FTLE), a scalar field quantifying the degree of stretching of fluid elements over the flow domain. Although the thermosphere is dominated by neutral wind processes and the ionosphere is governed by plasma electrodynamics, we can compare the LCS in the two modeled flow fields to yield insight into transport and interaction processes in the high-latitude IT system. For obtaining thermospheric LCS, we use the Horizontal Wind Model 2014 (HWM14) [1] at a single altitude to generate the two-dimensional velocity field. The FTLE computation is applied to study the flow field of the neutral wind, and to visualize the forward-time Lagrangian Coherent Structures in the flow domain. The time-varying structures indicate a possible thermospheric LCS ridge in the auroral oval area. The results of a two-day run during a geomagnetically quiet period show that the structures are diurnally quasi-periodic, suggesting that solar radiation influences the neutral wind flow field. To find the LCS in the high-latitude ionospheric drifts, the Weimer 2001 [2] polar electric potential model and the International Geomagnetic Reference Field 11 [3] are used to compute the ExB drift flow field in the ionosphere. As with the neutral winds, the Lagrangian Coherent Structures are obtained by applying the FTLE computation. The relationship between the thermospheric and ionospheric LCS is analyzed by comparing overlapping FTLE maps. Both a publicly available FTLE solver [4] and a custom-built FTLE computation are used and compared for validation [5]. Comparing the modeled IT LCSs on a quiet day with the modeled IT LCSs on a storm day indicates important factors in the structure and time evolution of the LCS.
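
    As an illustration of the FTLE computation described in this record, the following is a minimal Python sketch, not the authors' solver. It assumes a simple analytic saddle flow u = (x, -y) in place of the HWM14 or ExB drift fields, and the function names (velocity, advect, ftle) are hypothetical. It advects four neighbouring particles, differentiates the flow map by central differences, and takes the largest eigenvalue of the Cauchy-Green tensor.

```python
import math

def velocity(x, y, t):
    """Illustrative steady saddle flow u = (x, -y); an assumption, not a wind or drift model."""
    return x, -y

def advect(x, y, t0, T, steps=200):
    """Integrate one particle from t0 to t0 + T with the classical RK4 scheme."""
    h = T / steps
    t = t0
    for _ in range(steps):
        k1x, k1y = velocity(x, y, t)
        k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y, t + 0.5 * h)
        k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y, t + 0.5 * h)
        k4x, k4y = velocity(x + h * k3x, y + h * k3y, t + h)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        t += h
    return x, y

def ftle(x, y, t0=0.0, T=2.0, d=1e-4):
    """Forward-time FTLE at (x, y) from a central-difference flow-map gradient."""
    xr, yr = advect(x + d, y, t0, T)
    xl, yl = advect(x - d, y, t0, T)
    xu, yu = advect(x, y + d, t0, T)
    xd, yd = advect(x, y - d, t0, T)
    # Flow-map gradient F = [[a, b], [c, e]]
    a = (xr - xl) / (2 * d)
    b = (xu - xd) / (2 * d)
    c = (yr - yl) / (2 * d)
    e = (yu - yd) / (2 * d)
    # Cauchy-Green tensor C = F^T F (symmetric 2x2)
    c11 = a * a + c * c
    c12 = a * b + c * e
    c22 = b * b + e * e
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form
    lam_max = 0.5 * (c11 + c22) + math.sqrt((0.5 * (c11 - c22)) ** 2 + c12 ** 2)
    return math.log(math.sqrt(lam_max)) / abs(T)

value = ftle(0.3, 0.2)
```

    For this linear flow the stretching rate is the same everywhere, so the FTLE field is flat and has no ridges. With a modeled wind or drift field substituted for velocity, the same function evaluated over a grid would give the FTLE map whose local maxima mark candidate LCSs.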

  1. Bioinformatics by Example: From Sequence to Target

    NASA Astrophysics Data System (ADS)

    Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

    2002-12-01

    With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience is aimed to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example the reader will experience this fascinating new discipline.

  2. Rapid Bioinformatic Identification of Thermostabilizing Mutations

    PubMed Central

    Sauer, David B.; Karpowich, Nathan K.; Song, Jin Mei; Wang, Da-Neng

    2015-01-01

    Ex vivo stability is a valuable protein characteristic but is laborious to improve experimentally. In addition to biopharmaceutical and industrial applications, stable protein is important for biochemical and structural studies. Taking advantage of the large number of available genomic sequences and growth temperature data, we present two bioinformatic methods to identify a limited set of amino acids or positions that likely underlie thermostability. Because these methods allow thousands of homologs to be examined in silico, they have the advantage of providing both speed and statistical power. Using these methods, we introduced, via mutation, amino acids from thermoadapted homologs into an exemplar mesophilic membrane protein, and demonstrated significantly increased thermostability while preserving protein activity. PMID:26445442

  3. The European Bioinformatics Institute's data resources.

    PubMed

    Brooksbank, Catherine; Camon, Evelyn; Harris, Midori A; Magrane, Michele; Martin, Maria Jesus; Mulder, Nicola; O'Donovan, Claire; Parkinson, Helen; Tuli, Mary Ann; Apweiler, Rolf; Birney, Ewan; Brazma, Alvis; Henrick, Kim; Lopez, Rodrigo; Stoesser, Guenter; Stoehr, Peter; Cameron, Graham

    2003-01-01

    As the amount of biological data grows, so does the need for biologists to store and access this information in central repositories in a free and unambiguous manner. The European Bioinformatics Institute (EBI) hosts six core databases, which store information on DNA sequences (EMBL-Bank), protein sequences (SWISS-PROT and TrEMBL), protein structure (MSD), whole genomes (Ensembl) and gene expression (ArrayExpress). But just as a cell would be useless if it couldn't transcribe DNA or translate RNA, our resources would be compromised if each existed in isolation. We have therefore developed a range of tools that not only facilitate the deposition and retrieval of biological information, but also allow users to carry out searches that reflect the interconnectedness of biological information. The EBI's databases and tools are all available on our website at www.ebi.ac.uk. PMID:12519944

  4. Bioinformatics for personal genome interpretation.

    PubMed

    Capriotti, Emidio; Nehrt, Nathan L; Kann, Maricel G; Bromberg, Yana

    2012-07-01

    An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of this data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field--the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.

  5. Accuracy of functional surfaces on comparatively modeled protein structures

    PubMed Central

    Zhao, Jieling; Dundas, Joe; Kachalo, Sema; Ouyang, Zheng; Liang, Jie

    2012-01-01

    Identification and characterization of protein functional surfaces are important for predicting protein function, understanding enzyme mechanism, and docking small compounds to proteins. As the rapid speed of accumulation of protein sequence information far exceeds that of structures, constructing accurate models of protein functional surfaces and identifying their key elements become increasingly important. A promising approach is to build comparative models from sequences using known structural templates such as those obtained from structural genome projects. Here we assess how well this approach works in modeling binding surfaces. By systematically building three-dimensional comparative models of proteins using Modeller, we determine how well functional surfaces can be accurately reproduced. We use an alpha shape based pocket algorithm to compute all pockets on the modeled structures, and conduct a large-scale computation of similarity measurements (pocket RMSD and fraction of functional atoms captured) for 26,590 modeled enzyme protein structures. Overall, we find that when the sequence fragment of the binding surfaces has more than 45% identity to that of the template protein, the modeled surfaces have on average an RMSD of 0.5 Å, and contain 48% or more of the binding surface atoms, with nearly all of the important atoms in the signatures of binding pockets captured. PMID:21541664
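
    The two similarity measurements named in this record can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: it assumes the modeled and reference pockets are already superimposed and their atoms paired one-to-one, and the function names and atom labels are hypothetical.

```python
import math

def pocket_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D atom coordinates.

    Assumes both lists are already superimposed and in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))

def fraction_captured(reference_atoms, model_atoms):
    """Fraction of reference binding-surface atoms also present in the modeled pocket."""
    reference = set(reference_atoms)
    if not reference:
        raise ValueError("reference pocket must contain at least one atom")
    return len(reference & set(model_atoms)) / len(reference)

# Toy data: two atoms displaced by 0.5 Å each along z
rmsd_value = pocket_rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0.5), (1, 0, -0.5)])
captured = fraction_captured(
    ["HIS57:NE2", "SER195:OG", "ASP102:OD1", "GLY193:N"],
    ["HIS57:NE2", "SER195:OG", "LEU99:CD1"],
)
```

    A full pipeline would first need an optimal superposition step (for example a Kabsch fit) before the RMSD is meaningful; that step is omitted here to keep the sketch short.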

  6. Evolving Strategies for the Incorporation of Bioinformatics Within the Undergraduate Cell Biology Curriculum

    PubMed Central

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489

  7. Structural Bioinformatics and Protein Docking Analysis of the Molecular Chaperone-Kinase Interactions: Towards Allosteric Inhibition of Protein Kinases by Targeting the Hsp90-Cdc37 Chaperone Machinery

    PubMed Central

    Lawless, Nathan; Blacklock, Kristin; Berrigan, Elizabeth; Verkhivker, Gennady

    2013-01-01

    A fundamental role of the Hsp90-Cdc37 chaperone system in mediating maturation of protein kinase clients and supporting kinase functional activity is essential for the integrity and viability of signaling pathways involved in cell cycle control and organism development. Despite significant advances in understanding structure and function of molecular chaperones, the molecular mechanisms and guiding principles of kinase recruitment to the chaperone system are lacking quantitative characterization. Structural and thermodynamic characterization of Hsp90-Cdc37 binding with protein kinase clients by modern experimental techniques is highly challenging, owing to a transient nature of chaperone-mediated interactions. In this work, we used experimentally-guided protein docking to probe the allosteric nature of the Hsp90-Cdc37 binding with the cyclin-dependent kinase 4 (Cdk4) kinase clients. The results of docking simulations suggest that the kinase recognition and recruitment to the chaperone system may be primarily determined by Cdc37 targeting of the N-terminal kinase lobe. The interactions of Hsp90 with the C-terminal kinase lobe may provide additional “molecular brakes” that can lock (or unlock) kinase from the system during client loading (release) stages. The results of this study support a central role of the Cdc37 chaperone in recognition and recruitment of the kinase clients. Structural analysis may have useful implications in developing strategies for allosteric inhibition of protein kinases by targeting the Hsp90-Cdc37 chaperone machinery. PMID:24287464

  8. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  9. Structural and bioinformatic analysis of the kiwifruit allergen Act d 11, a member of the family of ripening-related proteins.

    PubMed

    Chruszcz, Maksymilian; Ciardiello, Maria Antonietta; Osinski, Tomasz; Majorek, Karolina A; Giangrieco, Ivana; Font, Jose; Breiteneder, Heimo; Thalassinos, Konstantinos; Minor, Wladek

    2013-12-01

    The allergen Act d 11, also known as kirola, is a 17 kDa protein expressed in large amounts in ripe green and yellow-fleshed kiwifruit. Ten percent of all kiwifruit-allergic individuals produce IgE specific for the protein. Using X-ray crystallography, we determined the first three-dimensional structures of Act d 11, produced from both recombinant expression in Escherichia coli and from the natural source (kiwifruit). While Act d 11 is immunologically correlated with the birch pollen allergen Bet v 1 and other members of the pathogenesis-related protein family 10 (PR-10), it has low sequence similarity to PR-10 proteins. By sequence Act d 11 appears instead to belong to the major latex/ripening-related (MLP/RRP) family, but analysis of the crystal structures shows that Act d 11 has a fold very similar to that of Bet v 1 and other PR-10 related allergens regardless of the low sequence identity. The structures of both the natural and recombinant protein include an unidentified ligand, which is relatively small (about 250 Da by mass spectrometry experiments) and most likely contains an aromatic ring. The ligand-binding cavity in Act d 11 is also significantly smaller than those in PR-10 proteins. The binding of the ligand, which we were not able to unambiguously identify, results in conformational changes in the protein that may have physiological and immunological implications. Interestingly, the residue corresponding to Glu45 in Bet v 1 (Glu46), which is important for IgE binding to the birch pollen allergen, is conserved in Act d 11, even though it is not in other allergens with significantly higher sequence identity to Bet v 1. We suggest that the so-called Gly-rich loop (or P-loop), which is conserved in all PR-10 allergens, may be responsible for IgE cross-reactivity between Bet v 1 and Act d 11.

  10. Structural bioinformatics analysis of enzymes involved in the biosynthesis pathway of the hypermodified nucleoside ms(2)io(6)A37 in tRNA.

    PubMed

    Kaminska, Katarzyna H; Baraniak, Urszula; Boniecki, Michal; Nowaczyk, Katarzyna; Czerwoniec, Anna; Bujnicki, Janusz M

    2008-01-01

    tRNAs from all organisms contain posttranscriptionally modified nucleosides, which are derived from the four canonical nucleosides. In most tRNAs that read codons beginning with U, adenosine in the position 37 adjacent to the 3' position of the anticodon is modified to N(6)-(Delta(2)-isopentenyl) adenosine (i(6)A). In many bacteria, such as Escherichia coli, this residue is typically hypermodified to N(6)-isopentenyl-2-thiomethyladenosine (ms(2)i(6)A). In a few bacteria, such as Salmonella typhimurium, ms(2)i(6)A can be further hydroxylated to N(6)-(cis-4-hydroxyisopentenyl)-2-thiomethyladenosine (ms(2)io(6)A). Although the enzymes that introduce the respective modifications (prenyltransferase MiaA, methylthiotransferase MiaB, and hydroxylase MiaE) have been identified, their structures remain unknown and sequence-function relationships remain obscure. We carried out sequence analysis and structure prediction of MiaA, MiaB, and MiaE, using the protein fold-recognition approach. Three-dimensional models of all three proteins were then built using a new modeling protocol designed to overcome uncertainties in the alignments and divergence between the templates. For MiaA and MiaB, the catalytic core was built based on the templates from the P-loop NTPase and Radical-SAM superfamilies, respectively. For MiaB, we have also modeled the C-terminal TRAM domain and the newly predicted N-terminal flavodoxin-fold domain. For MiaE, we confidently predict that it shares the three-dimensional fold with the ferritin-like four-helix bundle proteins and that it has a similar active site and mechanism of action to diiron carboxylate enzymes, in particular, methane monooxygenase (E.C.1.14.13.25) that catalyses the biological hydroxylation of alkanes. Our models provide the first structural platform for enzymes involved in the biosynthesis of i(6)A, ms(2)i(6)A, and ms(2)io(6)A, explain the data available from the literature and will help to design further experiments and interpret

  11. Technical phosphoproteomic and bioinformatic tools useful in cancer research

    PubMed Central

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  12. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    PubMed

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes - the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software - the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.
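
    To make the tree enumeration topic in this record concrete, here is a small Python sketch. It is not taken from the article or the BioQUEST tools. It uses the standard counting results for fully bifurcating labeled trees: (2n-5)!! unrooted and (2n-3)!! rooted topologies for n taxa. The function names are hypothetical.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1 or 2; returns 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def unrooted_trees(n_taxa):
    """Number of distinct unrooted bifurcating tree topologies for n labeled taxa."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted bifurcating tree")
    return double_factorial(2 * n_taxa - 5)

def rooted_trees(n_taxa):
    """Number of distinct rooted bifurcating tree topologies for n labeled taxa."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted bifurcating tree")
    return double_factorial(2 * n_taxa - 3)

# Tabulate how quickly the search space grows with the number of taxa
counts = {n: (unrooted_trees(n), rooted_trees(n)) for n in (4, 5, 10)}
```

    Ten taxa already allow 2,027,025 unrooted topologies, which is why exhaustive tree search becomes impractical for all but small data sets and heuristic construction methods are used instead.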

  13. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    PubMed

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  14. Mathematics and evolutionary biology make bioinformatics education comprehensible

    PubMed Central

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621
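
    As a small illustration of the tree enumeration topic named in the abstract above, the sketch below counts labeled, strictly bifurcating tree topologies with the standard double-factorial formulas: (2n-5)!! unrooted trees and (2n-3)!! rooted trees for n taxa. The function names are illustrative assumptions; they are not taken from the article or from the BioQUEST tools.

```python
def double_factorial(k: int) -> int:
    """Return k * (k-2) * (k-4) * ... down to 1 or 2; returns 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted binary tree topologies on n labeled taxa: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa: int) -> int:
    """Number of rooted binary tree topologies on n labeled taxa: (2n-3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted binary tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    The counts grow so quickly with the number of taxa that exhaustive evaluation of every topology is only practical for small data sets, which is one reason heuristic tree construction methods are taught alongside enumeration.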

  15. Rapid Development of Bioinformatics Education in China

    ERIC Educational Resources Information Center

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  16. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  17. Biology in 'silico': The Bioinformatics Revolution.

    ERIC Educational Resources Information Center

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  18. A Mathematical Optimization Problem in Bioinformatics

    ERIC Educational Resources Information Center

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
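
    The dynamic programming formulation described in the abstract above can be sketched as a Needleman-Wunsch style global alignment. This is a minimal sketch under assumed scoring values (match +1, mismatch -1, linear gap penalty -2); the function name and the scores are illustrative and are not taken from the article.

```python
def global_alignment(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (optimal score, aligned a, aligned b) for a global alignment."""
    n, m = len(a), len(b)

    def substitution(i: int, j: int) -> int:
        # Score for aligning a[i-1] with b[j-1] (1-based table indices).
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # align two residues
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(global_alignment("ACGT", "ACT"))
```

    The table is filled in O(nm) time and space. When several alignments share the optimal score, the traceback returns only one of them, preferring a residue pairing over a gap.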

  19. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  20. The 2016 Bioinformatics Open Source Conference (BOSC)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science. PMID:27781083

  1. Design and bioinformatics analysis of genome-wide CLIP experiments

    PubMed Central

    Wang, Tao; Xiao, Guanghua; Chu, Yongjun; Zhang, Michael Q.; Corey, David R.; Xie, Yang

    2015-01-01

    The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP–RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses. PMID:25958398

  2. Comparative population genetic structures and local adaptation of two mutualists.

    PubMed

    Anderson, Bruce; Olivieri, Isabelle; Lourmas, Mathieu; Stewart, Barbara A

    2004-08-01

    Similar patterns of dispersal and gene flow between closely associated organisms may promote local adaptation and coevolutionary processes. We compare the genetic structures of the two species of a plant genus (Roridula gorgonias and R. dentata) and their respective obligately associated hemipteran mutualists (Pameridea roridulae and P. marlothi) using allozymes. In addition, we determine whether genetic structure is related to differences in host choice by Pameridea. Allozyme variation was found to be very structured among plant populations but less so among hemipteran populations. Strong genetic structuring among hemipteran populations was only evident when large distances isolated the plant populations on which they live. Although genetic distances among plant populations were correlated with genetic distances among hemipteran populations, genetic distances of both plants and hemipterans were better correlated with geographic distance. Because Roridula and Pameridea have different scales of gene flow, adaptation at the local population level is unlikely. However, the restricted gene flow of both plants and hemipterans could enable adaptation to occur at a regional level. In choice experiments, the hemipteran (Pameridea) has a strong preference for its carnivorous host plant (Roridula) above unrelated host plants. Pameridea also prefers its host species to its closely related sister species. Specialization at the specific level is likely to reinforce cospeciation processes in this mutualism. However, Pameridea does not exhibit intraspecific preferences toward plants from their natal populations above plants from isolated, non-natal populations. PMID:15446426

  3. A comparative structural study of wet and dried ettringite

    SciTech Connect

    Renaudin, G.; Filinchuk, Y.; Neubauer, J.; Goetz-Neunhoeffer, F.

    2010-03-15

    Two different techniques were used to compare structural characteristics of 'wet' ettringite (stored in the synthesis mother liquid) and 'dried' ettringite (dried to 35% relative humidity over saturated CaCl2 solution). Lattice parameters and the water content in the channel region of the structure (site occupancy factor of the water molecule not bonded to cations) as well as microstructure parameters (size and strain) were determined from a Rietveld refinement on synchrotron powder diffraction data. Local environment of sulphate anions and of the hydrogen bonding network was characterized by Raman spectroscopy. Both techniques led to the same conclusion: the 'wet' ettringite sample immersed in the mother solution from the synthesis presents similar structural features as ettringite dried to 35% relative humidity. An increase of the a lattice parameter combined with a decrease of the c lattice parameter occurs on drying. The amount of structural water, the point symmetry of sulphate and the hydrogen bond network are unchanged when passing from the wet to the dried ettringite powder. Ettringite does not form a high-hydrate polymorph in equilibrium with alkaline solution, in contrast to the AFm phases that lose water molecules on drying. According to these results we conclude that ettringite precipitated in aqueous solution at the early hydration stages is of the same chemical composition as ettringite present in the hardening concrete.

  4. Comparing molecules and solids across structural and alchemical space.

    PubMed

    De, Sandip; Bartók, Albert P; Csányi, Gábor; Ceriotti, Michele

    2016-05-18

    Evaluating the (dis)similarity of crystalline, disordered and molecular compounds is a critical step in the development of algorithms to navigate automatically the configuration space of complex materials. For instance, a structural similarity metric is crucial for classifying structures, searching chemical space for better compounds and materials, and driving the next generation of machine-learning techniques for predicting the stability and properties of molecules and materials. In the last few years several strategies have been designed to compare atomic coordination environments. In particular, the smooth overlap of atomic positions (SOAPs) has emerged as an elegant framework to obtain translation, rotation and permutation-invariant descriptors of groups of atoms, underlying the development of various classes of machine-learned inter-atomic potentials. Here we discuss how one can combine such local descriptors using a regularized entropy match (REMatch) approach to describe the similarity of both whole molecular and bulk periodic structures, introducing powerful metrics that enable the navigation of alchemical and structural complexities within a unified framework. Furthermore, using this kernel and a ridge regression method we can predict atomization energies for a database of small organic molecules with a mean absolute error below 1 kcal/mol, reaching an important milestone in the application of machine-learning techniques for the evaluation of molecular properties. PMID:27101873

  5. Comparative population genetic structures and local adaptation of two mutualists.

    PubMed

    Anderson, Bruce; Olivieri, Isabelle; Lourmas, Mathieu; Stewart, Barbara A

    2004-08-01

    Similar patterns of dispersal and gene flow between closely associated organisms may promote local adaptation and coevolutionary processes. We compare the genetic structures of the two species of a plant genus (Roridula gorgonias and R. dentata) and their respective obligately associated hemipteran mutualists (Pameridea roridulae and P. marlothi) using allozymes. In addition, we determine whether genetic structure is related to differences in host choice by Pameridea. Allozyme variation was found to be very structured among plant populations but less so among hemipteran populations. Strong genetic structuring among hemipteran populations was only evident when large distances isolated the plant populations on which they live. Although genetic distances among plant populations were correlated with genetic distances among hemipteran populations, genetic distances of both plants and hemipterans were better correlated with geographic distance. Because Roridula and Pameridea have different scales of gene flow, adaptation at the local population level is unlikely. However, the restricted gene flow of both plants and hemipterans could enable adaptation to occur at a regional level. In choice experiments, the hemipteran (Pameridea) has a strong preference for its carnivorous host plant (Roridula) above unrelated host plants. Pameridea also prefers its host species to its closely related sister species. Specialization at the specific level is likely to reinforce cospeciation processes in this mutualism. However, Pameridea does not exhibit intraspecific preferences toward plants from their natal populations above plants from isolated, non-natal populations.

  6. Comparing molecules and solids across structural and alchemical space.

    PubMed

    De, Sandip; Bartók, Albert P; Csányi, Gábor; Ceriotti, Michele

    2016-05-18

    Evaluating the (dis)similarity of crystalline, disordered and molecular compounds is a critical step in the development of algorithms to navigate automatically the configuration space of complex materials. For instance, a structural similarity metric is crucial for classifying structures, searching chemical space for better compounds and materials, and driving the next generation of machine-learning techniques for predicting the stability and properties of molecules and materials. In the last few years several strategies have been designed to compare atomic coordination environments. In particular, the smooth overlap of atomic positions (SOAPs) has emerged as an elegant framework to obtain translation, rotation and permutation-invariant descriptors of groups of atoms, underlying the development of various classes of machine-learned inter-atomic potentials. Here we discuss how one can combine such local descriptors using a regularized entropy match (REMatch) approach to describe the similarity of both whole molecular and bulk periodic structures, introducing powerful metrics that enable the navigation of alchemical and structural complexities within a unified framework. Furthermore, using this kernel and a ridge regression method we can predict atomization energies for a database of small organic molecules with a mean absolute error below 1 kcal/mol, reaching an important milestone in the application of machine-learning techniques for the evaluation of molecular properties.

  7. Comparative population structure of cavity-nesting sea ducks

    USGS Publications Warehouse

    Pearce, John M.; Eadie, John M.; Savard, Jean-Pierre L.; Christensen, Thomas K.; Berdeen, James; Taylor, Eric J.; Boyd, Sean; Einarsson, Árni

    2014-01-01

    A growing collection of mtDNA genetic information from waterfowl species across North America suggests that larger-bodied cavity-nesting species exhibit greater levels of population differentiation than smaller-bodied congeners. Although little is known about nest-cavity availability for these species, one hypothesis to explain differences in population structure is reduced dispersal tendency of larger-bodied cavity-nesting species due to limited abundance of large cavities. To investigate this hypothesis, we examined population structure of three cavity-nesting waterfowl species distributed across much of North America: Barrow's Goldeneye (Bucephala islandica), Common Goldeneye (B. clangula), and Bufflehead (B. albeola). We compared patterns of population structure using both variation in mtDNA control-region sequences and band-recovery data for the same species and geographic regions. Results were highly congruent between data types, showing structured population patterns for Barrow's and Common Goldeneye but not for Bufflehead. Consistent with our prediction, the smallest cavity-nesting species, the Bufflehead, exhibited the lowest level of population differentiation due to increased dispersal and gene flow. Results provide evidence for discrete Old and New World populations of Common Goldeneye and for differentiation of regional groups of both goldeneye species in Alaska, the Pacific Northwest, and the eastern coast of North America. Results presented here will aid management objectives that require an understanding of population delineation and migratory connectivity between breeding and wintering areas. Comparative studies such as this one highlight factors that may drive patterns of genetic diversity and population trends.

  8. Computational biology and bioinformatics in Nigeria.

    PubMed

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  9. Computational Biology and Bioinformatics in Nigeria

    PubMed Central

    Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-01-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310

  10. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  11. When cloud computing meets bioinformatics: a review.

    PubMed

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
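
    To make the MapReduce concept mentioned in the abstract above concrete, the following single-process sketch counts k-mers across sequencing reads in three phases: map, shuffle and reduce. It only mimics the programming model in plain Python; it is not tied to any cloud framework, and all function names are illustrative assumptions rather than anything described in the review.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read: str, k: int):
    """Map phase: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values):
    """Reduce phase: collapse the values of one key into a single count."""
    return key, sum(values)


def count_kmers(reads, k: int):
    """Run map, shuffle and reduce in sequence and return {k-mer: count}."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], 2))
```

    In a real deployment the map and reduce functions would run on different machines and the framework would perform the shuffle; because each read is mapped independently and each key is reduced independently, the work parallelizes without shared state.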

  12. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  13. Bioinformatic challenges in targeted proteomics.

    PubMed

    Reker, Daniel; Malmström, Lars

    2012-09-01

    Selected reaction monitoring mass spectrometry is an emerging targeted proteomics technology that allows for the investigation of complex protein samples with high sensitivity and efficiency. It requires extensive knowledge about the sample for the many parameters needed to carry out the experiment to be set appropriately. Most studies today rely on parameter estimation from prior studies, public databases, or from measuring synthetic peptides. This is efficient and sound, but in absence of prior data, de novo parameter estimation is necessary. Computational methods can be used to create an automated framework to address this problem. However, the number of available applications is still small. This review aims at giving an orientation on the various bioinformatical challenges. To this end, we state the problems in classical machine learning and data mining terms, give examples of implemented solutions and provide some room for alternatives. This will hopefully lead to an increased momentum for the development of algorithms and serve the needs of the community for computational methods. We note that the combination of such methods in an assisted workflow will ease both the usage of targeted proteomics in experimental studies as well as the further development of computational approaches. PMID:22866949

  14. Identifying human MHC supertypes using bioinformatic methods.

    PubMed

    Doytchinova, Irini A; Guan, Pingping; Flower, Darren R

    2004-04-01

    Classification of MHC molecules into supertypes in terms of peptide-binding specificities is an important issue, with direct implications for the development of epitope-based vaccines with wide population coverage. In view of extremely high MHC polymorphism (948 class I and 633 class II HLA alleles) the experimental solution of this task is presently impossible. In this study, we describe a bioinformatics strategy for classifying MHC molecules into supertypes using information drawn solely from three-dimensional protein structure. Two chemometric techniques (hierarchical clustering and principal component analysis) were used independently on a set of 783 HLA class I molecules to identify supertypes based on structural similarities and molecular interaction fields calculated for the peptide binding site. Eight supertypes were defined: A2, A3, A24, B7, B27, B44, C1, and C4. The two techniques gave 77% consensus, i.e., 605 HLA class I alleles were classified in the same supertype by both methods. The proposed strategy allowed "supertype fingerprints" to be identified. Thus, the supertype fingerprints are, for A2: Tyr(9)/Phe(9), Arg(97), and His(114) or Tyr(116); for A3: Tyr(9)/Phe(9)/Ser(9), Ile(97)/Met(97) and Glu(114) or Asp(116); for A24: Ser(9) and Met(97); for B7: Asn(63) and Leu(81); for B27: Glu(63) and Leu(81); for B44: Ala(81); for C1: Ser(77); and for C4: Asn(77). PMID:15034046
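
    The consensus figure quoted in the abstract above is the share of alleles that both methods place in the same supertype. The sketch below shows that calculation for two label lists and checks the reported arithmetic (605 of 783 alleles). It assumes the two methods' clusters have already been mapped to shared supertype names; the function name and the toy labels are illustrative and are not data from the study.

```python
def consensus_fraction(labels_a, labels_b) -> float:
    """Fraction of items assigned the same label by two classifications."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both classifications must cover the same items")
    if not labels_a:
        raise ValueError("need at least one item")
    agreements = sum(1 for x, y in zip(labels_a, labels_b) if x == y)
    return agreements / len(labels_a)


if __name__ == "__main__":
    # Toy example with hypothetical assignments for four alleles.
    clustering = ["A2", "A3", "B7", "B44"]
    pca = ["A2", "A3", "B27", "B44"]
    print(consensus_fraction(clustering, pca))

    # Arithmetic behind the reported consensus: 605 of 783 alleles.
    print(round(100 * 605 / 783))
```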

  15. Evolution in bioinformatic resources: 2009 update on the Bioinformatics Links Directory.

    PubMed

    Brazas, Michelle D; Yamada, Joseph Tadashi; Ouellette, B F Francis

    2009-07-01

    All of the life science research web servers published in this and previous issues of Nucleic Acids Research, together with other useful tools, databases and resources for bioinformatics and molecular biology research are freely accessible online through the Bioinformatics Links Directory, http://bioinformatics.ca/links_directory/. Entirely dependent on user feedback and community input, the Bioinformatics Links Directory exemplifies an open access research tool and resource. With 112 websites featured in the July 2009 Web Server Issue of Nucleic Acids Research, the 2009 update brings the total number of servers listed in the Bioinformatics Links Directory close to an impressive 1400 links. A complete list of all links listed in this Nucleic Acids Research 2009 Web Server Issue can be accessed online at http://bioinfomatics.ca/links_directory/narweb2009/. The 2009 update of the Bioinformatics Links Directory, which includes the Web Server list and summaries, is also available online at the Nucleic Acids Research website, http://nar.oxfordjournals.org/.

  16. Structure, function and evolution of the gas exchangers: comparative perspectives.

    PubMed

    Maina, J N

    2002-10-01

    Over the evolutionary continuum, animals have faced similar fundamental challenges of acquiring molecular oxygen for aerobic metabolism. Under limitations and constraints imposed by factors such as phylogeny, behaviour, body size and environment, they have responded differently in founding optimal respiratory structures. A quintessence of the aphorism that 'necessity is the mother of invention', gas exchangers have been inaugurated through stiff cost-benefit analyses that have evoked transaction of trade-offs and compromises. Cogent structural-functional correlations occur in constructions of gas exchangers: within and between taxa, morphological complexity and respiratory efficiency increase with metabolic capacities and oxygen needs. Highly active, small endotherms have relatively better-refined gas exchangers compared with large, inactive ectotherms. Respiratory structures have developed from the plain cell membrane of the primeval prokaryotic unicells to complex multifunctional ones of the modern Metazoa. Regarding the respiratory medium from which oxygen is extracted, animal life has had only two choices, water or air, the only naturally occurring respirable fluids within the biological range of temperature and pressure. In rarer cases, certain animals have adapted to using both media. Gills (evaginated gas exchangers) are the primordial respiratory organs: they are the archetypal water breathing organs. Lungs (invaginated gas exchangers) are the model air breathing organs. Bimodal (transitional) breathers occupy the water-air interface. Presentation and exposure of external (water/air) and internal (haemolymph/blood) respiratory media, features determined by geometric arrangement of the conduits, are important features for gas exchange efficiency: counter-current, cross-current, uniform pool and infinite pool designs have variably developed. PMID:12430953

  17. Bioinformatics and its applications in plant biology.

    PubMed

    Rhee, Seung Yon; Dickerson, Julie; Xu, Dong

    2006-01-01

    Bioinformatics plays an essential role in today's plant science. As the amount of data grows exponentially, there is a parallel growth in the demand for tools and methods in data management, visualization, integration, analysis, modeling, and prediction. At the same time, many researchers in biology are unfamiliar with available bioinformatics methods, tools, and databases, which could lead to missed opportunities or misinterpretation of the information. In this review, we describe some of the key concepts, methods, software packages, and databases used in bioinformatics, with an emphasis on those relevant to plant science. We also cover some fundamental issues related to biological sequence analyses, transcriptome analyses, computational proteomics, computational metabolomics, bio-ontologies, and biological databases. Finally, we explore a few emerging research topics in bioinformatics.

  18. Bioinformatics in Italy: BITS2011, the Eighth Annual Meeting of the Italian Society of Bioinformatics

    PubMed Central

    2012-01-01

    The BITS2011 meeting, held in Pisa on June 20-22, 2011, brought together more than 120 Italian researchers working in the field of Bioinformatics, as well as students in Bioinformatics, Computational Biology, Biology, Computer Sciences, and Engineering, representing a landscape of Italian bioinformatics research. This preface provides a brief overview of the meeting and introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement. PMID:22536954

  19. Protein structure prediction provides comparable performance to crystallographic structures in docking-based virtual screening.

    PubMed

    Du, Hongying; Brender, Jeffrey R; Zhang, Jian; Zhang, Yang

    2015-01-01

    Structure-based virtual screening has largely been limited to protein targets for which either an experimental structure is available or a strongly homologous template exists so that a high-resolution model can be constructed. The performance of state-of-the-art protein structure predictions in virtual screening in systems where only weakly homologous templates are available is largely untested. Using the challenging DUD database of structural decoys, we show here that even using templates with only weak sequence homology (<30% sequence identity), structural models can be constructed by I-TASSER which achieve comparable enrichment rates to using the experimental bound crystal structure in the majority of the cases studied. For 65% of the targets, the I-TASSER models, which are constructed essentially in the apo conformations, reached 70% of the virtual screening performance of using the holo-crystal structures. A correlation was observed between the success of I-TASSER in modeling the global fold and local structures in the binding pockets of the proteins versus the relative success in virtual screening. The virtual screening performance can be further improved by the recognition of chemical features of the ligand compounds. These results suggest that the combination of structure-based docking and advanced protein structure modeling methods should be a valuable approach to the large-scale drug screening and discovery studies, especially for the proteins lacking crystallographic structures.
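
    The abstract compares "enrichment rates" without defining them. One common way to quantify enrichment in virtual screening is the enrichment factor: the fraction of known actives among the top-ranked compounds, divided by the fraction of actives in the whole library. The sketch below illustrates that general metric only; it is not taken from the paper. The function name, the `top_fraction` parameter, and the convention that a lower docking score is better are all assumptions made for illustration.

```python
def enrichment_factor(scores, is_active, top_fraction=0.01):
    """Enrichment factor of a ranked screening library (illustrative sketch).

    EF = (actives in top slice / size of top slice)
         / (actives in library / size of library)

    Assumption: a lower score means better predicted binding.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")

    # Size of the top-ranked slice; always inspect at least one compound.
    n_top = max(1, round(n_total * top_fraction))

    # Rank compounds from best (lowest) to worst (highest) score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)

    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy library of 10 compounds, 2 of them active and both ranked first.
toy_scores = [-9.0, -8.5, -7.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0, -0.5]
toy_labels = [True, True, False, False, False, False, False, False, False, False]
ef_top20 = enrichment_factor(toy_scores, toy_labels, top_fraction=0.2)
```

    With such a function, a relative performance figure like the "70%" quoted above could be read as the ratio of the enrichment obtained with a predicted model to that obtained with the crystal structure, although the exact metric used by the authors is not stated in this record.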

  20. Non-structural carbohydrates in woody plants compared among laboratories.

    PubMed

    Quentin, Audrey G; Pinkard, Elizabeth A; Ryan, Michael G; Tissue, David T; Baggett, L Scott; Adams, Henry D; Maillard, Pascale; Marchand, Jacqueline; Landhäusser, Simon M; Lacointe, André; Gibon, Yves; Anderegg, William R L; Asao, Shinichi; Atkin, Owen K; Bonhomme, Marc; Claye, Caroline; Chow, Pak S; Clément-Vidal, Anne; Davies, Noel W; Dickman, L Turin; Dumbur, Rita; Ellsworth, David S; Falk, Kristen; Galiano, Lucía; Grünzweig, José M; Hartmann, Henrik; Hoch, Günter; Hood, Sharon; Jones, Joanna E; Koike, Takayoshi; Kuhlmann, Iris; Lloret, Francisco; Maestro, Melchor; Mansfield, Shawn D; Martínez-Vilalta, Jordi; Maucourt, Mickael; McDowell, Nathan G; Moing, Annick; Muller, Bertrand; Nebauer, Sergio G; Niinemets, Ülo; Palacio, Sara; Piper, Frida; Raveh, Eran; Richter, Andreas; Rolland, Gaëlle; Rosas, Teresa; Saint Joanis, Brigitte; Sala, Anna; Smith, Renee A; Sterck, Frank; Stinziano, Joseph R; Tobias, Mari; Unda, Faride; Watanabe, Makoto; Way, Danielle A; Weerasinghe, Lasantha K; Wild, Birgit; Wiley, Erin; Woodruff, David R

    2015-11-01

    Non-structural carbohydrates (NSC) in plant tissue are frequently quantified to make inferences about plant responses to environmental conditions. Laboratories publishing estimates of NSC of woody plants use many different methods to evaluate NSC. We asked whether NSC estimates in the recent literature could be quantitatively compared among studies. We also asked whether any differences among laboratories were related to the extraction and quantification methods used to determine starch and sugar concentrations. These questions were addressed by sending sub-samples collected from five woody plant tissues, which varied in NSC content and chemical composition, to 29 laboratories. Each laboratory analyzed the samples with their laboratory-specific protocols, based on recent publications, to determine concentrations of soluble sugars, starch and their sum, total NSC. Laboratory estimates differed substantially for all samples. For example, estimates for Eucalyptus globulus leaves (EGL) varied from 23 to 116 (mean = 56) mg g(-1) for soluble sugars, 6-533 (mean = 94) mg g(-1) for starch and 53-649 (mean = 153) mg g(-1) for total NSC. Mixed model analysis of variance showed that much of the variability among laboratories was unrelated to the categories we used for extraction and quantification methods (method category R(2) = 0.05-0.12 for soluble sugars, 0.10-0.33 for starch and 0.01-0.09 for total NSC). For EGL, the difference between the highest and lowest least squares means for categories in the mixed model analysis was 33 mg g(-1) for total NSC, compared with the range of laboratory estimates of 596 mg g(-1). Laboratories were reasonably consistent in their ranks of estimates among tissues for starch (r = 0.41-0.91), but less so for total NSC (r = 0.45-0.84) and soluble sugars (r = 0.11-0.83). Our results show that NSC estimates for woody plant tissues cannot be compared among laboratories. 
The relative changes in NSC between treatments measured within a laboratory

  2. Bioinformatics for analysis of poxvirus genomes.

    PubMed

    Da Silva, Melissa; Upton, Chris

    2012-01-01

    In recent years, there have been numerous unprecedented technological advances in the field of molecular biology; these include DNA sequencing, mass spectrometry of proteins, and microarray analysis of mRNA transcripts. Perhaps, however, it is the area of genomics, which has now generated the complete genome sequences of more than 100 poxviruses, that has had the greatest impact on the average virology researcher because the DNA sequence data is in constant use in many different ways by almost all molecular virologists. As this data resource grows, so does the importance of the availability of databases and software tools to enable the bench virologist to work with and make use of this (valuable/expensive) DNA sequence information. Thus, providing researchers with intuitive software to first select and reformat genomics data from large databases, second, to compare/analyze genomics data, and third, to view and interpret large and complex sets of results has become pivotal in enabling progress to be made in modern virology. This chapter is directed at the bench virologist and describes the software required for a number of common bioinformatics techniques that are useful for comparing and analyzing poxvirus genomes. In a number of examples, we also highlight the Viral Orthologous Clusters database system and integrated tools that we developed for the management and analysis of complete viral genomes.

  3. No-boundary thinking in bioinformatics research

    PubMed Central

    2013-01-01

    Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT). PMID:24192339

  4. Data Mining for Grammatical Inference with Bioinformatics Criteria

    NASA Astrophysics Data System (ADS)

    López, Vivian F.; Aguilar, Ramiro; Alonso, Luis; Moreno, María N.; Corchado, Juan M.

    In this paper we describe both theoretical and practical results of a novel data mining process that combines hybrid techniques of association analysis and classical sequentiation algorithms of genomics to generate grammatical structures of a specific language. We used an application of a compilers generator system that allows the development of a practical application within the area of grammarware, where the concepts of the language analysis are applied to other disciplines, such as Bioinformatic. The tool allows the complexity of the obtained grammar to be measured automatically from textual data. A technique of incremental discovery of sequential patterns is presented to obtain simplified production rules, and compacted with bioinformatics criteria to make up a grammar.

  5. Structure, function and evolution of the gas exchangers: comparative perspectives

    PubMed Central

    Maina, JN

    2002-01-01

    Over the evolutionary continuum, animals have faced similar fundamental challenges of acquiring molecular oxygen for aerobic metabolism. Under limitations and constraints imposed by factors such as phylogeny, behaviour, body size and environment, they have responded differently in founding optimal respiratory structures. A quintessence of the aphorism that ‘necessity is the mother of invention’, gas exchangers have been inaugurated through stiff cost–benefit analyses that have evoked transaction of trade-offs and compromises. Cogent structural–functional correlations occur in constructions of gas exchangers: within and between taxa, morphological complexity and respiratory efficiency increase with metabolic capacities and oxygen needs. Highly active, small endotherms have relatively better-refined gas exchangers compared with large, inactive ectotherms. Respiratory structures have developed from the plain cell membrane of the primeval prokaryotic unicells to complex multifunctional ones of the modern Metazoa. Regarding the respiratory medium used to extract oxygen from, animal life has had only two choices – water or air – within the biological range of temperature and pressure the only naturally occurring respirable fluids. In rarer cases, certain animals have adapted to using both media. Gills (evaginated gas exchangers) are the primordial respiratory organs: they are the archetypal water breathing organs. Lungs (invaginated gas exchangers) are the model air breathing organs. Bimodal (transitional) breathers occupy the water–air interface. Presentation and exposure of external (water/air) and internal (haemolymph/blood) respiratory media, features determined by geometric arrangement of the conduits, are important features for gas exchange efficiency: counter-current, cross-current, uniform pool and infinite pool designs have variably developed. PMID:12430953

  6. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    NASA Astrophysics Data System (ADS)

    Widodo

    2015-02-01

    Indonesia has enormous biodiversity, which may contain many biomaterials for pharmaceutical application. The potential of these resources should be explored to discover new drugs for human wealth. However, bioactivity screening using conventional methods is very expensive and time-consuming. Therefore, we developed a bioinformatics-based methodology for screening the potential of natural resources. The method is based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with well-known taxa containing the active compound, using published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell are examined to determine the mechanism of action of the active compound at the cellular level, as well as to predict its side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anticancer agent from a marine invertebrate by employing the method. The anti-cancer activity was explored based on compounds isolated from the marine invertebrate, as reported in published articles and databases; we then identified the protein target, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate, and then examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. 
Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity as a source of drugs and another

  7. Study on the Response Coefficient of Setback Structures Compared to Regular Moment Frame Structures

    SciTech Connect

    Mirghaderi, S. Rasoul; Khafaf, Bardia; Epackachi, Siamak

    2008-07-08

    In the design practice of many countries, seismic analysis and proportioning of structures are usually based upon linear elastic analysis, with seismic forces reduced by a response coefficient, R. Setback structures are one of the most popular shapes of constructed buildings. In setback structures, the shape and proportions of the building have a major effect on the distribution of earthquake forces as they work their way through the building. On the other hand, geometric configuration has a profound effect on the structural-dynamic response of a building. Therefore, when a building has irregular features, such as asymmetry in height or vertical discontinuity, the traditional assumptions used in the development of seismic criteria for regular buildings may not be applicable. The inelastic seismic behavior of these types of structures seems to be quite different from that of regular steel moment resisting structures, in which the overall ductility is localized at beam-ends. In order to investigate the seismic behavior and estimate the Response Coefficient of those structures, nonlinear static (pushover) analyses are used for three categories of setback structures, namely low rise, medium rise and high rise buildings with different setbacks in their height. The Response Coefficients are calculated and compared with those taken from regular moment frame structures.

  8. Regulatory bioinformatics for food and drug safety.

    PubMed

    Healy, Marion J; Tong, Weida; Ostroff, Stephen; Eichler, Hans-Georg; Patak, Alex; Neuspiel, Margaret; Deluyker, Hubert; Slikker, William

    2016-10-01

    "Regulatory Bioinformatics" strives to develop and implement a standardized and transparent bioinformatic framework to support the implementation of existing and emerging technologies in regulatory decision-making. It has great potential to improve public health through the development and use of clinically important medical products and tools to manage the safety of the food supply. However, the application of regulatory bioinformatics also poses new challenges and requires new knowledge and skill sets. In the latest Global Coalition on Regulatory Science Research (GCRSR) governed conference, Global Summit on Regulatory Science (GSRS2015), regulatory bioinformatics principles were presented with respect to global trends, initiatives and case studies. The discussion revealed that datasets, analytical tools, skills and expertise are rapidly developing, in many cases via large international collaborative consortia. It also revealed that significant research is still required to realize the potential applications of regulatory bioinformatics. While there is significant excitement in the possibilities offered by precision medicine to enhance treatments of serious and/or complex diseases, there is a clear need for further development of mechanisms to securely store, curate and share data, integrate databases, and standardized quality control and data analysis procedures. A greater understanding of the biological significance of the data is also required to fully exploit vast datasets that are becoming available. The application of bioinformatics in the microbiological risk analysis paradigm is delivering clear benefits both for the investigation of food borne pathogens and for decision making on clinically important treatments. It is recognized that regulatory bioinformatics will have many beneficial applications by ensuring high quality data, validated tools and standardized processes, which will help inform the regulatory science community of the requirements

  10. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    PubMed

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, the PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progress made by Chinese bioinformaticians in the field of PTM Bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. Through comparing similar studies in China and other countries, we demonstrate both advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China.

  11. Providing web servers and training in Bioinformatics: 2010 update on the Bioinformatics Links Directory.

    PubMed

    Brazas, Michelle D; Yamada, Joseph T; Ouellette, B F Francis

    2010-07-01

    The Links Directory at Bioinformatics.ca continues its collaboration with Nucleic Acids Research to jointly publish and compile a freely accessible, online collection of tools, databases and resource materials for bioinformatics and molecular biology research. The July 2010 Web Server issue of Nucleic Acids Research adds an additional 115 web server tools and 7 updates to the directory at http://bioinformatics.ca/links_directory/, bringing the total number of servers listed close to an impressive 1500 links. The Bioinformatics Links Directory represents an excellent community resource for locating bioinformatic tools and databases to aid one's research, and in this context bioinformatic education needs and initiatives are discussed. A complete list of all links featured in this Nucleic Acids Research 2010 Web Server issue can be accessed online at http://bioinformatics.ca/links_directory/narweb2010/. The 2010 update of the Bioinformatics Links Directory, which includes the Web Server list and summaries, is also available online at the Nucleic Acids Research website, http://nar.oxfordjournals.org/.

  12. Bioinformatics process management: information flow via a computational journal

    PubMed Central

    Feagan, Lance; Rohrer, Justin; Garrett, Alexander; Amthauer, Heather; Komp, Ed; Johnson, David; Hock, Adam; Clark, Terry; Lushington, Gerald; Minden, Gary; Frost, Victor

    2007-01-01

    This paper presents the Bioinformatics Computational Journal (BCJ), a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread; ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples. PMID:18053179

  13. BioZone Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    SciTech Connect

    Liu, L; Buttler, D; Paques, H; Pu, C; Critchlow

    2002-01-28

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.
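
    The query routing idea described above (keep a capability profile per source, then prune the candidate sources level by level for each query) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' system: the `SourceProfile` and `Query` classes, their fields, the three pruning levels and the source names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical capability profile, maintained independently per source."""
    name: str
    data_types: frozenset   # kinds of data held, e.g. {"protein_sequence"}
    operations: frozenset   # supported query types, e.g. {"homology_search"}
    max_results: int        # largest result set the source will return


@dataclass(frozen=True)
class Query:
    """Hypothetical user query with coarse and fine-grained requirements."""
    data_type: str
    operation: str
    min_results: int = 1


def route(query, profiles):
    """Return names of sources that survive every pruning level.

    Cheap, coarse checks run first so that later, finer checks
    only see the sources that are still candidates.
    """
    # Level 1: drop sources that do not hold the requested kind of data.
    candidates = [p for p in profiles if query.data_type in p.data_types]
    # Level 2: drop sources that cannot perform the requested operation.
    candidates = [p for p in candidates if query.operation in p.operations]
    # Level 3: apply a finer-grained condition taken from the query.
    candidates = [p for p in candidates if p.max_results >= query.min_results]
    return [p.name for p in candidates]


# Made-up sources, used only to exercise the sketch.
example_profiles = [
    SourceProfile("seq_db", frozenset({"protein_sequence"}),
                  frozenset({"homology_search", "keyword_search"}), 500),
    SourceProfile("small_seq_db", frozenset({"protein_sequence"}),
                  frozenset({"homology_search"}), 10),
    SourceProfile("struct_db", frozenset({"protein_structure"}),
                  frozenset({"keyword_search"}), 1000),
]
example_query = Query("protein_sequence", "homology_search", min_results=100)
routed = route(example_query, example_profiles)
```

    Matching on structured profile fields rather than on keywords is what allows conditions such as a minimum result-set size to be expressed at all, which is the kind of fine-granularity matching the abstract contrasts with keyword-based indexing.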

  14. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    PubMed

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  17. The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics

    PubMed Central

    Greenhill, Simon J.; Blust, Robert; Gray, Russell D.

    2008-01-01

    Phylogenetic methods have revolutionised evolutionary biology and have recently been applied to studies of linguistic and cultural evolution. However, the basic comparative data on the languages of the world required for these analyses is often widely dispersed in hard to obtain sources. Here we outline how our Austronesian Basic Vocabulary Database (ABVD) helps remedy this situation by collating wordlists from over 500 languages into one web-accessible database. We describe the technology underlying the ABVD and discuss the benefits that an evolutionary bioinformatic approach can provide. These include facilitating computational comparative linguistic research, answering questions about human prehistory, enabling syntheses with genetic data, and safe-guarding fragile linguistic information. PMID:19204825

  18. The Austronesian Basic Vocabulary Database: from bioinformatics to lexomics.

    PubMed

    Greenhill, Simon J; Blust, Robert; Gray, Russell D

    2008-01-01

    Phylogenetic methods have revolutionised evolutionary biology and have recently been applied to studies of linguistic and cultural evolution. However, the basic comparative data on the languages of the world required for these analyses is often widely dispersed in hard to obtain sources. Here we outline how our Austronesian Basic Vocabulary Database (ABVD) helps remedy this situation by collating wordlists from over 500 languages into one web-accessible database. We describe the technology underlying the ABVD and discuss the benefits that an evolutionary bioinformatic approach can provide. These include facilitating computational comparative linguistic research, answering questions about human prehistory, enabling syntheses with genetic data, and safe-guarding fragile linguistic information.

  19. The GMOD Drupal Bioinformatic Server Framework

    PubMed Central

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  20. Bioinformatics and cancer: an essential alliance.

    PubMed

    Dopazo, Joaquín

    2006-06-01

    Modern cancer research has been revolutionized by the introduction of new high-throughput methodologies such as DNA microarrays. Keeping pace with these technologies, bioinformatics offers new solutions for data analysis and, more importantly, makes it possible to formulate a new class of hypotheses inspired by systems biology and oriented toward blocks of functionally related genes. Although software implementations of these new methodologies are recent, some options are already available. Bioinformatic solutions for other high-throughput techniques, such as array-CGH or large-scale genotyping, are also reviewed.

  2. Comparing two tetraalkylammonium ionic liquids. I. Liquid phase structure.

    PubMed

    Lima, Thamires A; Paschoal, Vitor H; Faria, Luiz F O; Ribeiro, Mauro C C; Giles, Carlos

    2016-06-14

    X-ray scattering experiments at room temperature were performed for the ionic liquids n-butyl-trimethylammonium bis(trifluoromethanesulfonyl)imide, [N1114][NTf2], and methyl-tributylammonium bis(trifluoromethanesulfonyl)imide, [N1444][NTf2]. The peak in the diffraction data characteristic of charge ordering in [N1444][NTf2] is shifted to longer distances in comparison to [N1114][NTf2], but the peak characteristic of short-range correlations is shifted in [N1444][NTf2] to shorter distances. Molecular dynamics (MD) simulations were performed for these ionic liquids using force fields available from the literature, although with new sets of partial charges for [N1114](+) and [N1444](+) proposed in this work. The shifting of charge and adjacency peaks to opposite directions in these ionic liquids was found in the static structure factor, S(k), calculated by MD simulations. Despite differences in cation sizes, the MD simulations unravel that anions are allowed as close to [N1444](+) as to [N1114](+) because anions are located in between the angle formed by the butyl chains. The more asymmetric molecular structure of the [N1114](+) cation implies differences in partial structure factors calculated for atoms belonging to polar or non-polar parts of [N1114][NTf2], whereas polar and non-polar structure factors are essentially the same in [N1444][NTf2]. Results of this work shed light on controversies in the literature on the liquid structure of tetraalkylammonium based ionic liquids. PMID:27306015

  3. How do disordered regions achieve comparable functions to structured domains?

    PubMed Central

    Latysheva, Natasha S; Flock, Tilman; Weatheritt, Robert J; Chavali, Sreenivas; Babu, M Madan

    2015-01-01

    The traditional structure to function paradigm conceives of a protein's function as emerging from its structure. In recent years, it has been established that unstructured, intrinsically disordered regions (IDRs) in proteins are equally crucial elements for protein function, regulation and homeostasis. In this review, we provide a brief overview of how IDRs can perform similar functions to structured proteins, focusing especially on the formation of protein complexes and assemblies and the mediation of regulated conformational changes. In addition to highlighting instances of such functional equivalence, we explain how differences in the biological and physicochemical properties of IDRs allow them to expand the functional and regulatory repertoire of proteins. We also discuss studies that provide insights into how mutations within functional regions of IDRs can lead to human diseases. PMID:25752799

  4. Applications of Support Vector Machines In Chemo And Bioinformatics

    NASA Astrophysics Data System (ADS)

    Jayaraman, V. K.; Sundararajan, V.

    2010-10-01

    Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.
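    The record above describes SVM classification only in general terms. As a rough illustration, the sketch below trains a linear soft-margin SVM by subgradient descent on the regularised hinge loss. It is not taken from the cited work: the function names, the toy data and the hyperparameters (lam, lr, epochs) are all assumptions chosen for illustration, and a real application would normally use an established library and a kernel suited to the data.

```python
def decision_value(w, b, x):
    """Signed score w.x + b of a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear soft-margin SVM by subgradient descent on the
    regularised hinge loss  lam/2 * ||w||^2 + max(0, 1 - y * (w.x + b)).
    Labels in y must be +1 or -1. All defaults are illustrative."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * decision_value(w, b, xi) < 1:
                # Inside the margin or misclassified: the hinge term is active.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Outside the margin: only the regulariser contributes.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return the predicted class label, +1 or -1."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical, linearly separable toy data (two descriptors per sample).
X = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

    The regularisation weight lam trades margin width against training errors, which is the structural risk minimization idea mentioned in the abstract.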

  5. The Structure of Women's Employment in Comparative Perspective

    ERIC Educational Resources Information Center

    Pettit, Becky; Hook, Jennifer Lynn

    2005-01-01

    In this paper we analyze social survey data from 19 countries using multi-level modeling methods in an effort to synthesize structural and institutional accounts for variation in women's employment. Observed demographic characteristics show much consistency in their relationship to women's employment across countries, yet there is significant…

  6. Bioinformatics: A History of Evolution "In Silico"

    ERIC Educational Resources Information Center

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  7. Bioinformatics in Undergraduate Education: Practical Examples

    ERIC Educational Resources Information Center

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  9. Bioboxes: standardised containers for interchangeable bioinformatics software.

    PubMed

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable. PMID:26473029

  10. Implementing bioinformatic workflows within the bioextract server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  11. "Extreme Programming" in a Bioinformatics Class

    ERIC Educational Resources Information Center

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  12. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  13. 2010 Translational bioinformatics year in review

    PubMed Central

    Miller, Katharine S

    2011-01-01

    A review of 2010 research in translational bioinformatics provides much to marvel at. We have seen notable advances in personal genomics, pharmacogenetics, and sequencing. At the same time, the infrastructure for the field has burgeoned. While acknowledging that, according to researchers, the members of this field tend to be overly optimistic, the authors predict a bright future. PMID:21672905

  14. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    EPA Science Inventory

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  15. The crystal structure of triosephosphate isomerase (TIM) from Thermotoga maritima: a comparative thermostability structural analysis of ten different TIM structures.

    PubMed

    Maes, D; Zeelen, J P; Thanki, N; Beaucamp, N; Alvarez, M; Thi, M H; Backmann, J; Martial, J A; Wyns, L; Jaenicke, R; Wierenga, R K

    1999-11-15

    The molecular mechanisms that evolution has been employing to adapt to environmental temperatures are poorly understood. To gain some further insight into this subject we solved the crystal structure of triosephosphate isomerase (TIM) from the hyperthermophilic bacterium Thermotoga maritima (TmTIM). The enzyme is a tetramer, assembled as a dimer of dimers, suggesting that the tetrameric wild-type phosphoglycerate kinase PGK-TIM fusion protein consists of a core of two TIM dimers covalently linked to 4 PGK units. The crystal structure of TmTIM represents the most thermostable TIM presently known in its 3D-structure. It adds to a series of nine known TIM structures from a wide variety of organisms, spanning the range from psychrophiles to hyperthermophiles. Several properties believed to be involved in the adaptation to different temperatures were calculated and compared for all ten structures. No sequence preferences, correlated with thermal stability, were apparent from the amino acid composition or from the analysis of the loops and secondary structure elements of the ten TIMs. A common feature for both psychrophilic and T. maritima TIM is the large number of salt bridges compared with the number found in mesophilic TIMs. In the two thermophilic TIMs, the highest amount of accessible hydrophobic surface is buried during the folding and assembly process.

  16. Navigating the changing learning landscape: perspective from bioinformatics.ca

    PubMed Central

    Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  17. Agile parallel bioinformatics workflow management using Pwrake

    PubMed Central

    2011-01-01

    Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. 
Furthermore, readability

  18. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Xiang, Fang; Ningqiu, Li; Xiaozhe, Fu; Kaibin, Li; Qiang, Lin; Lihui, Liu; Cunbin, Shi; Shuqin, Wu

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the need for high-performance computers rather than common personal computers to construct a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches and GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes in system temperature, total energy, root mean square deviation and loop conformation during equilibration were observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects. PMID:26351170

  20. Component-Based Approach for Educating Students in Bioinformatics

    ERIC Educational Resources Information Center

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  1. A linked series of laboratory exercises in molecular biology utilizing bioinformatics and GFP.

    PubMed

    Medin, Carey L; Nolin, Katie L

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce the importance of bioinformatics in molecular biology. Students employed multiprimer, site-directed mutagenesis to create variant colors from a plasmid expressing green fluorescent protein (GFP). Isolated mutant plasmid from Escherichia coli showing changes in fluorescence were sequenced. Students used sequence alignment tools, protein translator tools, protein modeling, and visualization to analyze the potential effect of their mutations within the protein structure. This laboratory linked molecular techniques and bioinformatics to promote and expand the understanding of experimental results in an upper-level undergraduate laboratory course. PMID:22081550

  2. Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.

    1996-01-01

    Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Centre, a project was initiated to assess the performance of eight different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with Sequential Unconstrained Minimizations Technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.

  3. Comparing connected structures in ensemble of random fields

    NASA Astrophysics Data System (ADS)

    Rongier, Guillaume; Collon, Pauline; Renard, Philippe; Straubhaar, Julien; Sausse, Judith

    2016-10-01

    Very different connectivity patterns, and therefore different flow properties, may arise from using different simulation methods or sets of parameters. This paper proposes a systematic method to compare ensembles of categorical simulations from a static connectivity point of view. Differences in static connectivity cannot always be distinguished using two-point statistics. In addition, multiple-point histograms only provide a statistical comparison of patterns regardless of the connectivity. Thus, we propose to characterize the static connectivity from a set of 12 indicators based on the connected components of the realizations. Some indicators describe the spatial distribution of the connected components, others their global shape or their topology through the component skeletons. We also gather all the indicators into dissimilarity values to easily compare hundreds of realizations. Heat maps and multidimensional scaling then facilitate the dissimilarity analysis. The application to a synthetic case highlights the impact of the grid size on the connectivity and the indicators. This impact disappears when comparing samples of the realizations with the same sizes. The method is then able to rank realizations relative to a reference model based on their static connectivity. This application also yields more practical advice. Multidimensional scaling appears to be a powerful visualization tool, but it also induces dissimilarity misrepresentations: it should always be interpreted cautiously, with a look at the point position confidence. The heat map displays the real dissimilarities and is more appropriate for a detailed analysis. The comparison with a multiple-point histogram method shows the benefit of the connected components: the large-scale connectivity seems better characterized by our indicators, especially the skeleton indicators.
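    To make the notion of connected components in a categorical realization concrete, here is a minimal sketch that labels the 4-connected components of one facies in a 2D grid and derives two simple summaries. The two summaries (component count and fraction of cells in the largest component) are illustrative assumptions and are not claimed to be among the 12 indicators of the cited paper; the grid and all function names are likewise hypothetical.

```python
from collections import deque


def connected_components(grid, facies=1):
    """Label 4-connected components of cells equal to `facies`.

    Returns (labels, sizes): `labels` has the grid's shape, with 0 for
    cells outside the facies; `sizes[k - 1]` is the size of component k.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != facies or labels[r][c]:
                continue
            label = len(sizes) + 1
            labels[r][c] = label
            queue = deque([(r, c)])
            size = 0
            while queue:  # breadth-first flood fill from the seed cell
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == facies and not labels[ny][nx]:
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            sizes.append(size)
    return labels, sizes


def connectivity_summary(grid, facies=1):
    """Two illustrative summaries of static connectivity for one facies."""
    _, sizes = connected_components(grid, facies)
    total = sum(sizes)
    return {
        "n_components": len(sizes),
        "largest_fraction": max(sizes) / total if total else 0.0,
    }


# Hypothetical 4 x 4 realization with two facies (0 and 1).
realization = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
]
labels, sizes = connected_components(realization)
summary = connectivity_summary(realization)
```

    Computing such a summary vector for each realization and taking distances between the vectors is one simple way to obtain the kind of dissimilarity values the abstract describes.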

  4. The 2011 Bioinformatics Links Directory update: more resources, tools and databases and features to empower the bioinformatics community.

    PubMed

    Brazas, Michelle D; Yim, David S; Yamada, Joseph T; Ouellette, B F Francis

    2011-07-01

    The Bioinformatics Links Directory continues its collaboration with Nucleic Acids Research to publish and compile a freely accessible, online collection of tools, databases and resource materials for bioinformatics and molecular biology research. The July 2011 Web Server issue of Nucleic Acids Research adds 78 web server tools and 14 updates to the directory at http://bioinformatics.ca/links_directory/.

  5. The MPI Bioinformatics Toolkit for protein sequence analysis

    PubMed Central

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N.

    2006-01-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at . PMID:16845021

  6. Can bioinformatics help in the identification of moonlighting proteins?

    PubMed

    Hernández, Sergio; Calvo, Alejandra; Ferragut, Gabriela; Franco, Luís; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2014-12-01

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large number of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and they are a good complement to the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place. PMID:25399591
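
    Mutation correlation (co-evolution) analysis of the kind mentioned above is often illustrated with the mutual information between two columns of a multiple sequence alignment. The abstract does not say which statistic the authors used, so the following Python sketch is only an assumption-laden example; the four-sequence alignment is invented toy data.

```python
from collections import Counter
from math import log2


def column(alignment, index):
    """Extract one column (list of residues) from a list of aligned sequences."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_x, col_y):
    """Mutual information in bits between two alignment columns of equal length."""
    n = len(col_x)
    freq_x, freq_y = Counter(col_x), Counter(col_y)
    freq_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), count in freq_xy.items():
        p_joint = count / n
        p_independent = (freq_x[x] / n) * (freq_y[y] / n)
        mi += p_joint * log2(p_joint / p_independent)
    return mi


# Toy alignment (hypothetical): columns 0 and 1 always change together,
# column 2 is fully conserved, column 3 varies independently of column 0.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]

mi_covarying = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
mi_independent = mutual_information(column(toy_alignment, 0), column(toy_alignment, 3))
mi_conserved = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
print(mi_covarying, mi_independent, mi_conserved)
```

    High values flag pairs of positions that mutate together; a conserved column carries no co-variation signal, which is one reason such analysis only works when the alignment is large and diverse enough.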

  7. The MPI Bioinformatics Toolkit for protein sequence analysis.

    PubMed

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N

    2006-07-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at http://toolkit.tuebingen.mpg.de.

  8. Is racism dead? Comparing (expressive) means and (structural equation) models.

    PubMed

    Leach, C W; Peng, T R; Volckens, J

    2000-09-01

    Much scholarship suggests that racism--belief in out-group inferiority--is unrelated to contemporary attitudes. Purportedly, a new form of racism, one which relies upon a belief in cultural difference, has become a more acceptable basis for such attitudes. The authors argue that an appropriate empirical assessment of racism (both 'old' and 'new') depends upon (1) clear conceptualization and operationalization, and (2) attention to both mean-level expression and explanatory value in structural equation models. This study assessed the endorsement of racism and belief in cultural difference as well as their association with a measure of general attitude in a secondary analysis of parallel representative surveys of attitudes toward different ethnic out-groups in France, The Netherlands, Western Germany and Britain (N = 3242; see Reif & Melich, 1991). For six of the seven out-group targets, racism was strongly related to ethnic majority attitudes, despite low mean-level endorsement. In a pattern consistent with a 'new', indirect racism, the relationship between British racism and attitudes toward Afro-Caribbeans was mediated by belief in cultural difference.

  9. Hospital profitability and capital structure: a comparative analysis.

    PubMed Central

    Valvona, J; Sloan, F A

    1988-01-01

    This article compares the financial performance of hospitals by ownership type and of five publicly traded hospital companies with other industries, using such indicators as profit margins, return on equity (ROE) and total capitalization, and debt-to-equity ratios. We also examine stock returns to investors for the five hospital companies versus other industries, as well as the relative roles of debt and equity in new financing. Investor-owned hospitals had substantially greater margins and ROE than did other hospital types. In 1982, investor-owned chain hospitals had a ROE of 26 percent, 18 points above the average for all hospitals. Stock returns on the five selected hospital companies were more than twice as large as returns on other industries between 1972 and 1983. However, after 1983, returns for these companies fell dramatically in absolute terms and relative to other industries. We also found investor-owned hospitals to be much more highly levered than their government and voluntary counterparts, and more highly levered than other industries as well. PMID:3403274

  10. BioGPS: The Music for the Chemo- and Bioinformatics Walzer.

    PubMed

    Siragusa, Lydia; Spyrakis, Francesca; Goracci, Laura; Cross, Simon; Cruciani, Gabriele

    2014-06-01

    Identifying cross-relationships among protein binding sites is becoming increasingly important in the chemo- and bioinformatics field; indeed, protein structural similarity might provide the right answer to a number of questions, including: Is a drug repurposable for another target? What is the molecular mechanism of a drug side-effect? How can we improve the ligand selectivity? The comparison of protein binding sites in terms of the molecular interaction fields of their three-dimensional structures can be a useful technique to approach all of these problems. Here, we report a semi-automated method for comparing and clustering protein pockets, called BioGPS, that combines the GRID Molecular Interaction Fields (MIFs) with FLAP pharmacophoric fingerprints. BioGPS identifies and compares protein binding sites by aligning them to each other and directly comparing their MIFs. The strengths of this approach are that it is MIF-based, and therefore describes molecular interactions from a ligand perspective, and that it is independent of protein superposition or sequence alignment. This approach enables protein-protein virtual screening (drug repurposing, polypharmacology, off-target effects), and also clustering to relate sequence-based similarities to structure-based differences among protein binding sites. PMID:27485981

  11. Bioinformatic characterization of plant networks

    SciTech Connect

    McDermott, Jason E.; Samudrala, Ram

    2008-06-30

    Cells and organisms are governed by networks of interactions, genetic, physical and metabolic. Large-scale experimental studies of interactions between components of biological systems have been performed for a variety of eukaryotic organisms. However, there is a dearth of such data for plants. Computational methods for prediction of relationships between proteins, primarily based on comparative genomics, provide a useful systems-level view of cellular functioning and can be used to extend information about other eukaryotes to plants. We have predicted networks for Arabidopsis thaliana, Oryza sativa indica and japonica and several plant pathogens using the Bioverse (http://bioverse.compbio.washington.edu) and show that they are similar to experimentally-derived interaction networks. Predicted interaction networks for plants can be used to provide novel functional annotations and predictions about plant phenotypes and aid in rational engineering of biosynthesis pathways.

  12. Bioinformatics approaches to cancer gene discovery.

    PubMed

    Narayanan, Ramaswamy

    2007-01-01

    The Cancer Gene Anatomy Project (CGAP) database of the National Cancer Institute has thousands of known and novel expressed sequence tags (ESTs). These ESTs, derived from diverse normal and tumor cDNA libraries, offer an attractive starting point for cancer gene discovery. Data-mining the CGAP database led to the identification of ESTs that were predicted to be specific to select solid tumors. Two genes from these efforts were taken to proof of concept for diagnostic and therapeutic indications of cancer. Microarray technology was used in conjunction with bioinformatics to understand the mechanism of one of the targets discovered. These efforts provide an example of gene discovery by using bioinformatics approaches. The strengths and weaknesses of this approach are discussed in this review.

  13. Machine learning: an indispensable tool in bioinformatics.

    PubMed

    Inza, Iñaki; Calvo, Borja; Armañanzas, Rubén; Bengoetxea, Endika; Larrañaga, Pedro; Lozano, José A

    2010-01-01

    The increase in the number and complexity of biological databases has raised the need for modern and powerful data analysis tools and techniques. In order to fulfill these requirements, the machine learning discipline has become an everyday tool in bio-laboratories. The use of machine learning techniques has been extended to a wide spectrum of bioinformatics applications. It is broadly used to investigate the underlying mechanisms and interactions between biological molecules in many diseases, and it is an essential tool in any biomarker discovery process. In this chapter, we provide a basic taxonomy of machine learning algorithms, and the characteristics of main data preprocessing, supervised classification, and clustering techniques are shown. Feature selection, classifier evaluation, and two supervised classification topics that have a deep impact on current bioinformatics are presented. We make the interested reader aware of a set of popular web resources, open source software tools, and benchmarking data repositories that are frequently used by the machine learning community. PMID:19957143

  14. Bioinformatics Pipeline for Transcriptome Sequencing Analysis.

    PubMed

    Djebali, Sarah; Wucher, Valentin; Foissac, Sylvain; Hitte, Christophe; Corre, Evan; Derrien, Thomas

    2017-01-01

    The development of High Throughput Sequencing (HTS) for RNA profiling (RNA-seq) has shed light on the diversity of transcriptomes. While RNA-seq is becoming a de facto standard for monitoring the population of expressed transcripts in a given condition at a specific time, processing the huge amount of data it generates requires dedicated bioinformatics programs. Here, we describe a standard bioinformatics protocol using state-of-the-art tools, the STAR mapper to align reads onto a reference genome, Cufflinks to reconstruct the transcriptome, and RSEM to quantify expression levels of genes and transcripts. We present the workflow using human transcriptome sequencing data from two biological replicates of the K562 cell line produced as part of the ENCODE3 project. PMID:27662878
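
    The last step of the protocol, quantifying expression levels, is commonly reported in transcripts per million (TPM), one of the units RSEM outputs. The Python sketch below shows only how TPM is derived from per-gene counts and lengths; it does not reproduce RSEM's statistical model, and the gene names, counts and lengths are invented for illustration.

```python
def transcripts_per_million(counts, lengths):
    """Convert read counts to TPM.

    counts  -- dict mapping a gene or transcript id to its estimated read count
    lengths -- dict mapping the same ids to (effective) lengths in bases
    """
    # Length-normalized rate: reads per base for each feature.
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        return {gene: 0.0 for gene in counts}
    # Rescale so that all values sum to one million.
    return {gene: rate / total_rate * 1e6 for gene, rate in rates.items()}


# Hypothetical toy data: three genes with different counts and lengths.
toy_counts = {"geneA": 100, "geneB": 300, "geneC": 60}
toy_lengths = {"geneA": 1000, "geneB": 1000, "geneC": 100}

toy_tpm = transcripts_per_million(toy_counts, toy_lengths)
for gene, value in toy_tpm.items():
    print(gene, round(value, 1))
```

    Because the values always sum to one million within a sample, TPM makes the relative abundance of a transcript comparable between the two biological replicates, which is the kind of comparison the workflow above sets up.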

  15. A toolbox for developing bioinformatics software

    PubMed Central

    Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.

    2012-01-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  16. Discovery and Classification of Bioinformatics Web Services

    SciTech Connect

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  17. [Applied problems of mathematical biology and bioinformatics].

    PubMed

    Lakhno, V D

    2011-01-01

    Mathematical biology and bioinformatics represent a new and rapidly progressing line of investigation which emerged in the course of work on the Human Genome Project. The main applied problems of these sciences are drug design, patient-specific medicine and nanobioelectronics. It is shown that progress in the technology of mass sequencing of the human genome has set the stage for starting the national program on patient-specific medicine.

  18. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  19. An active registry for bioinformatics web services

    PubMed Central

    Pettifer, S.; Thorne, D.; McDermott, P.; Attwood, T.; Baran, J.; Bryne, J. C.; Hupponen, T.; Mowbray, D.; Vriend, G.

    2009-01-01

    Summary: The EMBRACE Registry is a web portal that collects and monitors web services according to test scripts provided by their administrators. Users are able to search for, rank and annotate services, enabling them to select the most appropriate working service for inclusion in their bioinformatics analysis tasks. Availability and implementation: Web site implemented with PHP, Python, MySQL and Apache, with all major browsers supported. (www.embraceregistry.net) Contact: steve.pettifer@manchester.ac.uk PMID:19460889

  20. Broader incorporation of bioinformatics in education: opportunities and challenges.

    PubMed

    Cummings, Michael P; Temple, Glena G

    2010-11-01

    The major opportunities for broader incorporation of bioinformatics in education can be placed into three general categories: general applicability of bioinformatics in life science and related curricula; inherent fit of bioinformatics for promoting student learning in most biology programs; and the general experience and associated comfort students have with computers and technology. Conversely, the major challenges for broader incorporation of bioinformatics in education can be placed into three general categories: required infrastructure and logistics; instructor knowledge of bioinformatics and continuing education; and the breadth of bioinformatics, and the diversity of students and educational objectives. Broader incorporation of bioinformatics at all education levels requires overcoming the challenges to using transformative computer-requiring learning activities, assisting faculty in collecting assessment data on mastery of student learning outcomes, as well as creating more faculty development opportunities that span diverse skill levels, with an emphasis placed on providing resource materials that are kept up-to-date as the field and tools change.

  1. Bioinformatics for transporter pharmacogenomics and systems biology: data integration and modeling with UML.

    PubMed

    Yan, Qing

    2010-01-01

    Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter . The methodology developed here can also be used for other biomedical fields.

  2. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology.

    PubMed

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-12-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  3. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    PubMed

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes, including behavior-based profiles, will contribute to the transition from disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors is needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talk among pathways in various brain regions involved in disorders such as Alzheimer's disease.

  4. A library-based bioinformatics services program

    PubMed Central

    Yarfitz, Stuart; Ketchell, Debra S.

    2000-01-01

    Support for molecular biology researchers has been limited to traditional library resources and services in most academic health sciences libraries. The University of Washington Health Sciences Libraries have been providing specialized services to this user community since 1995. The library recruited a Ph.D. biologist to assess the molecular biological information needs of researchers and design strategies to enhance library resources and services. A survey of laboratory research groups identified areas of greatest need and led to the development of a three-pronged program: consultation, education, and resource development. Outcomes of this program include bioinformatics consultation services, library-based and graduate level courses, networking of sequence analysis tools, and a biological research Web site. Bioinformatics clients are drawn from diverse departments and include clinical researchers in need of tools that are not readily available outside of basic sciences laboratories. Evaluation and usage statistics indicate that researchers, regardless of departmental affiliation or position, require support to access molecular biology and genetics resources. Centralizing such services in the library is a natural synergy of interests and enhances the provision of traditional library resources. Successful implementation of a library-based bioinformatics program requires both subject-specific and library and information technology expertise. PMID:10658962

  5. Bioinformatics on the Cloud Computing Platform Azure

    PubMed Central

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. In Appendix S1, we provide a walk-through that demonstrates explicitly how Azure can be used to perform these analyses, and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers, who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  6. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

  7. Bringing Web 2.0 to bioinformatics.

    PubMed

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  8. Application of Bioinformatics in Chronobiology Research

    PubMed Central

    Lopes, Robson da Silva; Resende, Nathalia Maria; Honorio-França, Adenilda Cristina; França, Eduardo Luzía

    2013-01-01

    Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research. PMID:24187519

  9. Entropyology: the application of bioinformatics and data modeling to digital virus and malware recognition

    NASA Astrophysics Data System (ADS)

    Jaenisch, Holger M.; Handley, James W.

    2010-04-01

    Malware are analogs of viruses. Viruses are composed of large numbers of polypeptide proteins. The shape and function of the protein strands determines the functionality of the segment, similar to a subroutine in malware. The full combination of subroutines is the malware organism, analogous to the way a collection of polypeptides forms protein structures that are information bearing. We propose to apply the methods of Bioinformatics to analyze malware to provide a rich feature set for creating a unique and novel detection and classification scheme that is originally applied to Bioinformatics amino acid sequencing. Our proposed methods enable real time in situ (in contrast to in vivo) detection applications.

  10. Advancing standards for bioinformatics activities: persistence, reproducibility, disambiguation and Minimum Information About a Bioinformatics investigation (MIABi).

    PubMed

    Tan, Tin Wee; Tong, Joo Chuan; Khan, Asif M; de Silva, Mark; Lim, Kuan Siong; Ranganathan, Shoba

    2010-12-02

    The 2010 International Conference on Bioinformatics, InCoB2010, which is the annual conference of the Asia-Pacific Bioinformatics Network (APBioNet), has agreed to publish conference papers in compliance with the Minimum Information about a Bioinformatics investigation (MIABi) guidelines proposed in June 2009. Authors of the conference supplements in BMC Bioinformatics, BMC Genomics and Immunome Research have consented to cooperate in this process, which will include the procedures described herein, where appropriate, to ensure data and software persistence and perpetuity, database and resource re-instantiability and reproducibility of results, author and contributor identity disambiguation and MIABi-compliance. Wherever possible, datasets and databases will be submitted to depositories with standardized terminologies. As standards are evolving, this process is intended as a prelude to the 100 BioDatabases (BioDB100) initiative whereby APBioNet collaborators will contribute exemplar databases to demonstrate the feasibility of standards-compliance and participate in refining the process for peer-review of such publications and validation of scientific claims and standards compliance. This testbed represents another step in advancing standards-based processes in the bioinformatics community, which is essential to the growing interoperability of biological data, information, knowledge and computational resources.

  11. Quantum Bio-Informatics IV

    NASA Astrophysics Data System (ADS)

    Accardi, Luigi; Freudenberg, Wolfgang; Ohya, Masanori

    2011-01-01

    Use of cryptographic ideas to interpret biological phenomena (and vice versa) / M. Regoli -- Discrete approximation to operators in white noise analysis / Si Si -- Bogoliubov type equations via infinite-dimensional equations for measures / V. V. Kozlov and O. G. Smolyanov -- Analysis of several categorical data using measure of proportional reduction in variation / K. Yamamoto ... [et al.] -- The electron reservoir hypothesis for two-dimensional electron systems / K. Yamada ... [et al.] -- On the correspondence between Newtonian and functional mechanics / E. V. Piskovskiy and I. V. Volovich -- Quantile-quantile plots: An approach for the inter-species comparison of promoter architecture in eukaryotes / K. Feldmeier ... [et al.] -- Entropy type complexities in quantum dynamical processes / N. Watanabe -- A fair sampling test for Ekert protocol / G. Adenier, A. Yu. Khrennikov and N. Watanabe -- Brownian dynamics simulation of macromolecule diffusion in a protocell / T. Ando and J. Skolnick -- Signaling network of environmental sensing and adaptation in plants: Key roles of calcium ion / K. Kuchitsu and T. Kurusu -- NetzCope: A tool for displaying and analyzing complex networks / M. J. Barber, L. Streit and O. Strogan -- Study of HIV-1 evolution by coding theory and entropic chaos degree / K. Sato -- The prediction of botulinum toxin structure based on in silico and in vitro analysis / T. Suzuki and S. Miyazaki -- On the mechanism of D-wave high Tc superconductivity by the interplay of Jahn-Teller physics and Mott physics / H. Ushio, S. Matsuno and H. Kamimura.

  12. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  13. Evolving Strategies for the Incorporation of Bioinformatics within the Undergraduate Cell Biology Curriculum

    ERIC Educational Resources Information Center

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…

  14. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    ERIC Educational Resources Information Center

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  15. Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction

    PubMed Central

    2011-01-01

    Background: Many bioinformatics tools for RNA secondary structure analysis are based on a thermodynamic model of RNA folding. They predict a single, "optimal" structure by free energy minimization, they enumerate near-optimal structures, they compute base pair probabilities and dot plots, representative structures of different abstract shapes, or Boltzmann probabilities of structures and shapes. Although all programs refer to the same physical model, they implement it with considerable variation for different tasks, and little is known about the effects of heuristic assumptions and model simplifications used by the programs on the outcome of the analysis. Results: We extract four different models of the thermodynamic folding space which underlie the programs RNAFOLD, RNASHAPES, and RNASUBOPT. Their differences lie within the details of the energy model and the granularity of the folding space. We implement probabilistic shape analysis for all models, and introduce the shape probability shift as a robust measure of model similarity. Using four data sets derived from experimentally solved structures, we provide a quantitative evaluation of the model differences. Conclusions: We find that search space granularity affects the computed shape probabilities less than the over- or underapproximation of free energy by a simplified energy model. Still, the approximations perform similar enough to implementations of the full model to justify their continued use in settings where computational constraints call for simpler algorithms. On the side, we observe that the rarely used level 2 shapes, which predict the complete arrangement of helices, multiloops, internal loops and bulges, include the "true" shape in a rather small number of predicted high probability shapes. This calls for an investigation of new strategies to extract high probability members from the (very large) level 2 shape space of an RNA sequence. We provide implementations of all four models, written in a
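
    Illustrative note: the Boltzmann probabilities of structures and shapes mentioned in this record can be sketched as follows. This is a minimal sketch, not the code of RNAFOLD, RNASHAPES, or RNASUBOPT. The function name, the toy shape strings, and the toy free energies are assumptions made for illustration only. Each structure receives the weight exp(-E/RT), and the probability of a shape is the normalized sum of the weights of all structures that map to it.

```python
import math
from collections import defaultdict

# Gas constant times temperature in kcal/mol at 37 degrees C (310.15 K).
RT = 0.0019872 * 310.15


def shape_probabilities(structures, rt=RT):
    """Return {shape: probability} from (shape, free_energy) pairs.

    `structures` is a hypothetical input: one (shape label, free energy in
    kcal/mol) pair per enumerated secondary structure.
    """
    weights = [(shape, math.exp(-energy / rt)) for shape, energy in structures]
    partition = sum(weight for _, weight in weights)  # partition function Z
    probabilities = defaultdict(float)
    for shape, weight in weights:
        probabilities[shape] += weight / partition
    return dict(probabilities)


# Toy data (invented): two structures share the shape "[]", one has "[][]".
toy = [("[]", -3.0), ("[]", -2.5), ("[][]", -2.0)]
probs = shape_probabilities(toy)
```

    Because shape probabilities aggregate many structures, a shape can be highly probable even when none of its member structures is the minimum free energy structure.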

  16. Shared bioinformatics databases within the Unipro UGENE platform.

    PubMed

    Protsyuk, Ivan V; Grekhov, German A; Tiunov, Alexey V; Fursov, Mikhail Y

    2015-01-01

    Unipro UGENE is an open-source bioinformatics toolkit that integrates popular tools along with original instruments for molecular biologists within a unified user interface. Nowadays, most bioinformatics desktop applications, including UGENE, make use of a local data model while processing different types of data. Such an approach causes an inconvenience for scientists working cooperatively and relying on the same data. This refers to the need of making multiple copies of certain files for every workplace and maintaining synchronization between them in case of modifications. Therefore, we focused on delivering a collaborative work into the UGENE user experience. Currently, several UGENE installations can be connected to a designated shared database and users can interact with it simultaneously. Such databases can be created by UGENE users and be used at their discretion. Objects of each data type, supported by UGENE such as sequences, annotations, multiple alignments, etc., can now be easily imported from or exported to a remote storage. One of the main advantages of this system, compared to existing ones, is the almost simultaneous access of client applications to shared data regardless of their volume. Moreover, the system is capable of storing millions of objects. The storage itself is a regular database server so even an inexpert user is able to deploy it. Thus, UGENE may provide access to shared data for users located, for example, in the same laboratory or institution. UGENE is available at: http://ugene.net/download.html. PMID:26527191

  17. Microbial bioinformatics for food safety and production

    PubMed Central

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel

    2016-01-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput ‘omics’ technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety. PMID:26082168

  18. Mobyle: a new full web bioinformatics framework

    PubMed Central

    Néron, Bertrand; Ménager, Hervé; Maufrais, Corinne; Joly, Nicolas; Maupetit, Julien; Letort, Sébastien; Carrere, Sébastien; Tuffery, Pierre; Letondal, Catherine

    2009-01-01

    Motivation: For the biologist, running bioinformatics analyses involves a time-consuming management of data and tools. Users need support to organize their work, retrieve parameters and reproduce their analyses. They also need to be able to combine their analytic tools using a safe data flow software mechanism. Finally, given that scientific tools can be difficult to install, it is particularly helpful for biologists to be able to use these tools through a web user interface. However, providing a web interface for a set of tools raises the problem that a single web portal cannot offer all the existing and possible services: it is the user, again, who has to cope with data copy among a number of different services. A framework enabling portal administrators to build a network of cooperating services would therefore clearly be beneficial. Results: We have designed a system, Mobyle, to provide a flexible and usable Web environment for defining and running bioinformatics analyses. It embeds simple yet powerful data management features that allow the user to reproduce analyses and to combine tools using a hierarchical typing system. Mobyle offers invocation of services distributed over remote Mobyle servers, thus enabling a federated network of curated bioinformatics portals without the user having to learn complex concepts or to install sophisticated software. While being focused on the end user, the Mobyle system also addresses the need, for the bioinformatician, to automate remote services execution: PlayMOBY is a companion tool that automates the publication of BioMOBY web services, using Mobyle program definitions. Availability: The Mobyle system is distributed under the terms of the GNU GPLv2 on the project web site (http://bioweb2.pasteur.fr/projects/mobyle/). It is already deployed on three servers: http://mobyle.pasteur.fr, http://mobyle.rpbs.univ-paris-diderot.fr and http://lipm-bioinfo.toulouse.inra.fr/Mobyle. The PlayMOBY companion is distributed under the

  19. Critical Issues in Bioinformatics and Computing

    PubMed Central

    Kesh, Someswa; Raghupathi, Wullianallur

    2004-01-01

    This article provides an overview of the field of bioinformatics and its implications for the various participants. Next-generation issues facing developers (programmers), users (molecular biologists), and the general public (patients) who would benefit from the potential applications are identified. The goal is to create awareness and debate on the opportunities (such as career paths) and the challenges such as privacy that arise. A triad model of the participants' roles and responsibilities is presented along with the identification of the challenges and possible solutions. PMID:18066389

  20. Translational Bioinformatics: Past, Present, and Future.

    PubMed

    Tenenbaum, Jessica D

    2016-02-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline's brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field.

  1. Translational Bioinformatics: Past, Present, and Future

    PubMed Central

    Tenenbaum, Jessica D.

    2016-01-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field. PMID:26876718

  2. Multiobjective optimization in bioinformatics and computational biology.

    PubMed

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.
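
    Illustrative note: the key concept behind multiobjective optimization, Pareto dominance, can be sketched in a few lines. This is a minimal sketch assuming every objective is to be minimized. The function names and the toy points are hypothetical and are not taken from the review.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy candidates scored on two objectives (invented values).
candidates = [(1, 5), (2, 2), (3, 1), (3, 3), (4, 4)]
front = pareto_front(candidates)
```

    Here (3, 3) and (4, 4) are dominated by (2, 2), so the front keeps the three remaining trade-off solutions. The quadratic scan is adequate for small sets; larger problems typically use sorting-based methods.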

  3. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    NASA Technical Reports Server (NTRS)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis methods such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, which uses a new sigmoid-logarithmic transfer function and is based on the error backpropagation (EBP) algorithm, has been devised. Our results show that the trained new ANN can recognize low-fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  4. Comparative modeling: the state of the art and protein drug target structure prediction.

    PubMed

    Liu, Tianyun; Tang, Grace W; Capriotti, Emidio

    2011-07-01

    The goal of computational protein structure prediction is to provide three-dimensional (3D) structures with resolution comparable to experimental results. Comparative modeling, which predicts the 3D structure of a protein based on its sequence similarity to homologous structures, is the most accurate computational method for structure prediction. In the last two decades, significant progress has been made on comparative modeling methods. Using the large number of protein structures deposited in the Protein Data Bank (~65,000), automatic prediction pipelines are generating a tremendous number of models (~1.9 million) for sequences whose structures have not been experimentally determined. Accurate models are suitable for a wide range of applications, such as prediction of protein binding sites, prediction of the effect of protein mutations, and structure-guided virtual screening. In particular, comparative modeling has enabled structure-based drug design against protein targets with unknown structures. In this review, we describe the theoretical basis of comparative modeling, the available automatic methods and databases, and the algorithms to evaluate the accuracy of predicted structures. Finally, we discuss relevant applications in the prediction of important drug target proteins, focusing on the G protein-coupled receptor (GPCR) and protein kinase families.

  5. Teaching the ABCs of bioinformatics: a brief introduction to the Applied Bioinformatics Course

    PubMed Central

    2014-01-01

    With the development of the Internet and the growth of online resources, bioinformatics training for wet-lab biologists became necessary as a part of their education. This article describes a one-semester course ‘Applied Bioinformatics Course’ (ABC, http://abc.cbi.pku.edu.cn/) that the author has been teaching to biological graduate students at the Peking University and the Chinese Academy of Agricultural Sciences for the past 13 years. ABC is a hands-on practical course to teach students to use online bioinformatics resources to solve biological problems related to their ongoing research projects in molecular biology. With a brief introduction to the background of the course, detailed information about the teaching strategies of the course are outlined in the ‘How to teach’ section. The contents of the course are briefly described in the ‘What to teach’ section with some real examples. The author wishes to share his teaching experiences and the online teaching materials with colleagues working in bioinformatics education both in local and international universities. PMID:24008274

  6. Teaching the ABCs of bioinformatics: a brief introduction to the Applied Bioinformatics Course.

    PubMed

    Luo, Jingchu

    2014-11-01

    With the development of the Internet and the growth of online resources, bioinformatics training for wet-lab biologists became necessary as a part of their education. This article describes a one-semester course 'Applied Bioinformatics Course' (ABC, http://abc.cbi.pku.edu.cn/) that the author has been teaching to biological graduate students at the Peking University and the Chinese Academy of Agricultural Sciences for the past 13 years. ABC is a hands-on practical course to teach students to use online bioinformatics resources to solve biological problems related to their ongoing research projects in molecular biology. With a brief introduction to the background of the course, detailed information about the teaching strategies of the course are outlined in the 'How to teach' section. The contents of the course are briefly described in the 'What to teach' section with some real examples. The author wishes to share his teaching experiences and the online teaching materials with colleagues working in bioinformatics education both in local and international universities.

  7. The potential of translational bioinformatics approaches for pharmacology research.

    PubMed

    Li, Lang

    2015-10-01

    The field of bioinformatics has allowed the interpretation of massive amounts of biological data, ushering in the era of 'omics' to biomedical research. Its potential impact on pharmacology research is enormous and it has shown some emerging successes. A full realization of this potential, however, requires standardized data annotation for large health record databases and molecular data resources. Improved standardization will further stimulate the development of system pharmacology models, using translational bioinformatics methods. This new translational bioinformatics paradigm is highly complementary to current pharmacological research fields, such as personalized medicine, pharmacoepidemiology and drug discovery. In this review, I illustrate the application of translational bioinformatics to research in numerous pharmacology subdisciplines.

  8. Translational Bioinformatics: Linking the Molecular World to the Clinical World

    PubMed Central

    Altman, RB

    2014-01-01

    Translational bioinformatics represents the union of translational medicine and bioinformatics. Translational medicine moves basic biological discoveries from the research bench into the patient-care setting and uses clinical observations to inform basic biology. It focuses on patient care, including the creation of new diagnostics, prognostics, prevention strategies, and therapies based on biological discoveries. Bioinformatics involves algorithms to represent, store, and analyze basic biological data, including DNA sequence, RNA expression, and protein and small-molecule abundance within cells. Translational bioinformatics spans these two fields; it involves the development of algorithms to analyze basic molecular and cellular data with an explicit goal of affecting clinical care. PMID:22549287

  9. Data Compression Concepts and Algorithms and their Applications to Bioinformatics

    PubMed Central

    Nalbantoğlu, Ö. U.; Russell, D.J.; Sayood, K.

    2009-01-01

    Data compression at its base is concerned with how information is organized in data. Understanding this organization can lead to efficient ways of representing the information and hence data compression. In this paper we review the ways in which ideas and approaches fundamental to the theory and practice of data compression have been used in the area of bioinformatics. We look at how basic theoretical ideas from data compression, such as the notions of entropy, mutual information, and complexity have been used for analyzing biological sequences in order to discover hidden patterns, infer phylogenetic relationships between organisms and study viral populations. Finally, we look at how inferred grammars for biological sequences have been used to uncover structure in biological sequences. PMID:20157640
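
    Illustrative note: two of the ideas named in this record, entropy and compression-based similarity, can be sketched with the Python standard library. This is a minimal sketch, not the authors' method. The function names and the toy sequences are assumptions. The normalized compression distance (NCD) shown here uses zlib as a stand-in compressor, whereas dedicated DNA compressors would normally be preferred.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(sequence):
    """Zeroth-order Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ncd(x, y):
    """Normalized compression distance between two byte strings.

    Smaller values suggest that the sequences share more structure.
    """
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy sequences (invented): two different periodic patterns.
seq_a = b"ACGT" * 200
seq_b = b"TTGACCA" * 120
```

    A matrix of pairwise NCD values can serve as input to a standard clustering or tree-building method, which is one way compression has been used to infer relationships between sequences.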

  10. Bioinformatic Primer for Clinical and Translational Science

    PubMed Central

    Faustino, Randolph S.; Chiriac, Anca; Terzic, Andre

    2009-01-01

    The advent of high-throughput technologies has accelerated generation and expansion of genomic, transcriptomic, and proteomic data. Acquisition of high-dimensional datasets requires archival systems that permit efficiency of storage and retrieval, and so, multiple electronic repositories have been initiated and maintained to meet this demand. Bioinformatic science has evolved, from these intricate bodies of dynamically updated information and the tools to manage them, as a necessity to harness and decipher the inherent complexity of high-volume data. Large datasets are associated with a variable degree of stochastic noise that contributes to the balance of an ordered, multistable state with the capacity to evolve in response to stimulus, thus exhibiting a hallmark feature of biological criticality. In this context, the network theory has become an invaluable tool to map relationships that integrate discrete elements that collectively direct global function within a particular –omic category, and indeed, the prioritized focus on the functional whole of the genomic, transcriptomic, or proteomic strata over single molecules is a primary tenet of systems biology analyses. This new biology perspective allows inspection and prediction of disease conditions, not limited to a monogenic challenge, but as a combination of individualized molecular permutations acting in concert to effect a phenotypic outcome. Bioinformatic integration of multidimensional data within and between biological layers thus harbors the potential to identify unique biological signatures, providing an enabling platform for advances in clinical and translational science. PMID:19690627

  11. Tools and collaborative environments for bioinformatics research

    PubMed Central

    Giugno, Rosalba; Pulvirenti, Alfredo

    2011-01-01

    Advanced research requires intensive interaction among a multitude of actors, often possessing different expertise and usually working at a distance from each other. The field of collaborative research aims to establish suitable models and technologies to properly support these interactions. In this article, we first present the reasons for an interest of Bioinformatics in this context by also suggesting some research domains that could benefit from collaborative research. We then review the principles and some of the most relevant applications of social networking, with a special attention to networks supporting scientific collaboration, by also highlighting some critical issues, such as identification of users and standardization of formats. We then introduce some systems for collaborative document creation, including wiki systems and tools for ontology development, and review some of the most interesting biological wikis. We also review the principles of Collaborative Development Environments for software and show some examples in Bioinformatics. Finally, we present the principles and some examples of Learning Management Systems. In conclusion, we try to devise some of the goals to be achieved in the short term for the exploitation of these technologies. PMID:21984743

  12. Bioinformatics for cancer immunology and immunotherapy.

    PubMed

    Charoentong, Pornpimol; Angelova, Mihaela; Efremova, Mirjana; Gallasch, Ralf; Hackl, Hubert; Galon, Jerome; Trajanoski, Zlatko

    2012-11-01

    Recent mechanistic insights obtained from preclinical studies and the approval of the first immunotherapies have motivated an increasing number of academic investigators and pharmaceutical/biotech companies to further elucidate the role of immunity in tumor pathogenesis and to reconsider the role of immunotherapy. Additionally, technological advances (e.g., next-generation sequencing) are providing unprecedented opportunities to draw a comprehensive picture of the tumor genomics landscape and ultimately enable individualized treatment. However, the increasing complexity of the generated data and the plethora of bioinformatics methods and tools pose considerable challenges to both tumor immunologists and clinical oncologists. In this review, we describe current concepts and future challenges for the management and analysis of data for cancer immunology and immunotherapy. We first highlight publicly available databases with specific focus on cancer immunology including databases for somatic mutations and epitope databases. We then give an overview of the bioinformatics methods for the analysis of next-generation sequencing data (whole-genome and exome sequencing), epitope prediction tools as well as methods for integrative data analysis and network modeling. Mathematical models are powerful tools that can predict and explain important patterns in the genetic and clinical progression of cancer. Therefore, a survey of mathematical models for tumor evolution and tumor-immune cell interaction is included. Finally, we discuss future challenges for individualized immunotherapy and suggest how combined computational/experimental approaches can lead to new insights into the molecular mechanisms of cancer, improve diagnosis and prognosis of the disease, and pinpoint novel therapeutic targets.

  13. ExPASy: SIB bioinformatics resource portal.

    PubMed

    Artimo, Panu; Jonnalagedda, Manohar; Arnold, Konstantin; Baratin, Delphine; Csardi, Gabor; de Castro, Edouard; Duvaud, Séverine; Flegel, Volker; Fortier, Arnaud; Gasteiger, Elisabeth; Grosdidier, Aurélien; Hernandez, Céline; Ioannidis, Vassilios; Kuznetsov, Dmitry; Liechti, Robin; Moretti, Sébastien; Mostaguir, Khaled; Redaschi, Nicole; Rossier, Grégoire; Xenarios, Ioannis; Stockinger, Heinz

    2012-07-01

    ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth seamlessly access a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a 'decentralized' way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across 'selected' resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy.

  14. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

    PubMed

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

    2015-04-01

    This study aimed to explore the features of clustered regularly interspaced short palindromic repeats (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and the flanking sequences of upstream CRISPRs could be classified into the same group as those of the downstream CRISPRs. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
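
    Illustrative note: a "palindromic" DNA motif, in the sense used for CRISPR repeats and the leader-sequence motifs above, is one that equals its own reverse complement. The check can be sketched as follows. This is a minimal sketch with hypothetical function names and textbook toy sequences, not data from the study. Real CRISPR repeats are often only partially palindromic, so a practical search would tolerate mismatches.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA string."""
    return sequence.translate(COMPLEMENT)[::-1]


def is_palindromic(sequence):
    """True if the sequence reads the same on both strands (5' to 3')."""
    return sequence == reverse_complement(sequence)


def palindromic_windows(sequence, width):
    """Return (start, motif) for every perfectly palindromic window."""
    return [
        (start, sequence[start:start + width])
        for start in range(len(sequence) - width + 1)
        if is_palindromic(sequence[start:start + width])
    ]
```

    For example, GAATTC is palindromic in this sense, while GATTACA is not.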

  15. Development of Bioinformatics Pipeline for Analyzing Clinical Pediatric NGS Data

    PubMed Central

    Crowgey, Erin L.; Kolb, Anders; Wu, Cathy H.

    2015-01-01

    Using an Illumina exome sequencing dataset generated from pediatric Acute Myeloid Leukemia patients (AML; type FLT3/ITD+) a comprehensive bioinformatics pipeline was developed to aid in a better clinical understanding of the genetic data associated with the clinical phenotype. The pipeline starts with raw next generation sequencing reads and using both publicly available resources and custom scripts, analyzes the genomic data for variants associated with pediatric AML. By incorporating functional information such as Gene Ontology annotation and protein-protein interactions, the methodology prioritizes genomic variants and returns disease specific results and knowledge maps. Furthermore, it compares the somatic mutations at diagnosis with the somatic mutations at relapse and outputs variants and functional annotations that are specific for the relapse state. PMID:26306272
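
    Illustrative note: the final step described in this record, comparing somatic mutations at diagnosis with those at relapse, amounts to a set difference over variant keys. This is a minimal sketch under stated assumptions, not the authors' pipeline. The function name is hypothetical, and the variant tuples (chromosome, position, reference allele, alternate allele) are invented toy values.

```python
def relapse_specific_variants(diagnosis, relapse):
    """Return variants seen at relapse but not at diagnosis, sorted.

    Each variant is a (chromosome, position, ref, alt) tuple.
    """
    return sorted(set(relapse) - set(diagnosis))


# Toy call sets (invented coordinates, not real patient data).
diagnosis_calls = [("chr1", 100, "A", "G"), ("chr2", 250, "C", "T")]
relapse_calls = [
    ("chr1", 100, "A", "G"),
    ("chr3", 75, "G", "A"),
    ("chr2", 900, "T", "C"),
]
specific = relapse_specific_variants(diagnosis_calls, relapse_calls)
```

    In a real workflow the variant keys would come from normalized VCF records, since the same variant can otherwise be represented in more than one way.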

  16. Developing sustainable software solutions for bioinformatics by the "Butterfly" paradigm.

    PubMed

    Ahmed, Zeeshan; Zeeshan, Saman; Dandekar, Thomas

    2014-01-01

    Software design and sustainable software engineering are essential for the long-term development of bioinformatics software. Typical challenges in an academic environment are short-term contracts, island solutions, pragmatic approaches and loose documentation. Upcoming new challenges are big data, complex data sets, software compatibility and rapid changes in data representation. Our approach to cope with these challenges consists of iterative intertwined cycles of development ("Butterfly" paradigm) for key steps in scientific software engineering. User feedback is valued as well as software planning in a sustainable and interoperable way. Tool usage should be easy and intuitive. A middleware supports a user-friendly Graphical User Interface (GUI) as well as a database/tool development independently. We validated the approach of our own software development and compared the different design paradigms in various software solutions.

  17. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN

    PubMed Central

    2010-01-01

    Background Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. Results VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Conclusions Bioinformatics curation and ontological

  18. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    PubMed Central

    2011-01-01

    Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive
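
    The idea of pushing input items through chained transformations in batches of adjustable size can be sketched with the Python standard library alone. The code below does not use PaPy and does not reproduce its API; `batched`, `run_pipeline`, `transcribe` and `gc_content` are hypothetical names, and the sketch handles only a linear chain of stages rather than a general directed acyclic graph.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_pipeline(items, stages, batch_size=2, workers=2):
    """Lazily push items through `stages`, one batch at a time.

    Only one batch is in flight at once, so a larger batch_size gives more
    parallelism at the cost of holding more items in memory.
    """
    def apply_all(item):
        for stage in stages:
            item = stage(item)
        return item

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(items, batch_size):
            # pool.map preserves input order within the batch.
            yield from pool.map(apply_all, batch)


def transcribe(seq):
    """Toy stage: DNA to RNA."""
    return seq.replace("T", "U")


def gc_content(seq):
    """Toy stage: fraction of G and C bases."""
    return (seq.count("G") + seq.count("C")) / len(seq)


results = list(run_pipeline(["atgc", "ggcc", "atat"],
                            [str.upper, transcribe, gc_content]))
print(results)
```

    Because `run_pipeline` is a generator, nothing is computed until results are requested, which is the lazy-evaluation side of the trade-off mentioned in the abstract.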

  19. Is there room for ethics within bioinformatics education?

    PubMed

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  20. Assessment of a Bioinformatics across Life Science Curricula Initiative

    ERIC Educational Resources Information Center

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  1. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    ERIC Educational Resources Information Center

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  3. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    PubMed

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  4. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    PubMed

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education. PMID:21036947

  5. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    ERIC Educational Resources Information Center

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  6. Wrapping and interoperating bioinformatics resources using CORBA.

    PubMed

    Stevens, R; Miller, C

    2000-02-01

    Bioinformaticians seeking to provide services to working biologists are faced with the twin problems of distribution and diversity of resources. Bioinformatics databases are distributed around the world and exist in many kinds of storage forms, platforms and access paradigms. To provide adequate services to biologists, these distributed and diverse resources have to interoperate seamlessly within single applications. The Common Object Request Broker Architecture (CORBA) offers one technical solution to these problems. The key component of CORBA is its use of object orientation as an intermediate form to translate between different representations. This paper concentrates on an explanation of object orientation and how it can be used to overcome the problems of distribution and diversity by describing the interfaces between objects.

  7. Bioinformatics Analysis of Estrogen-Responsive Genes.

    PubMed

    Handel, Adam E

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  8. 1. COMPARATIVE VIEW OF THREE RESIDENTIAL STRUCTURES, 203 1/2 TO ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    1. COMPARATIVE VIEW OF THREE RESIDENTIAL STRUCTURES, 203 1/2 TO 205 WHITTIER, BUILT IN 1905, NORTH FRONTS LOOKING SOUTH - Joseph & Emma May Hoover House, 203 1/2 West Whittier Avenue, Altoona, Blair County, PA

  9. 4273π: Bioinformatics education on low cost ARM hardware

    PubMed Central

    2013-01-01

    Background Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. Results We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012–2013. Conclusions 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost. PMID:23937194

  10. Multilevel Structural Equation Models for the Analysis of Comparative Data on Educational Performance

    ERIC Educational Resources Information Center

    Goldstein, Harvey; Bonnet, Gerard; Rocher, Thierry

    2007-01-01

    The Programme for International Student Assessment comparative study of reading performance among 15-year-olds is reanalyzed using statistical procedures that allow the full complexity of the data structures to be explored. The article extends existing multilevel factor analysis and structural equation models and shows how this can extract richer…

  11. Comparative analysis of secondary structure of insect mitochondrial small subunit ribosomal RNA using maximum weighted matching.

    PubMed

    Page, R D

    2000-10-15

    Comparative analysis is the preferred method of inferring RNA secondary structure, but its use requires considerable expertise and manual effort. As the importance of secondary structure for accurate sequence alignment and phylogenetic analysis becomes increasingly realised, the need for secondary structure models for diverse taxonomic groups becomes more pressing. The number of available structures bears little relation to the relative diversity or importance of the different taxonomic groups. Insects, for example, comprise the largest group of animals and yet are very poorly represented in secondary structure databases. This paper explores the utility of maximum weighted matching (MWM) to help automate the process of comparative analysis by inferring secondary structure for insect mitochondrial small subunit (12S) rRNA sequences. By combining information on correlated changes in substitutions and helix dot plots, MWM can rapidly generate plausible models of secondary structure. These models can be further refined using standard comparative techniques. This paper presents a secondary structure model for insect 12S rRNA based on an alignment of 225 insect sequences and an alignment for 16 exemplar insect sequences. This alignment is used as a template for a web server that automatically generates secondary structures for insect sequences.
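
    To make the matching step concrete: given a score for each candidate base pair, a maximum weighted matching selects a set of pairs in which every position is used at most once and the total score is as large as possible. The sketch below finds such a matching by exhaustive search over a toy example; it is not the algorithm or the scoring used in the paper, the function name is hypothetical, and the integer weights are invented stand-ins for scores that would combine covariation and helix information.

```python
from functools import lru_cache


def max_weight_matching(positions, weights):
    """Exhaustive maximum weighted matching for small inputs.

    positions: iterable of sequence positions.
    weights:   dict mapping (i, j) with i < j to a pairing score.
    Returns (total_score, pairs). Exponential time: illustration only.
    """
    @lru_cache(maxsize=None)
    def best(remaining):
        if len(remaining) < 2:
            return 0, ()
        first, rest = remaining[0], remaining[1:]
        # Option 1: leave the first position unpaired.
        best_score, best_pairs = best(rest)
        # Option 2: pair it with each admissible partner.
        for idx, partner in enumerate(rest):
            w = weights.get((first, partner))
            if w is None:
                continue
            score, pairs = best(rest[:idx] + rest[idx + 1:])
            if w + score > best_score:
                best_score = w + score
                best_pairs = ((first, partner),) + pairs
        return best_score, best_pairs

    return best(tuple(sorted(positions)))


# Invented scores for a six-position toy sequence.
toy_weights = {(0, 5): 4, (1, 4): 3, (0, 4): 2, (1, 5): 1, (2, 3): 1}
score, pairs = max_weight_matching(range(6), toy_weights)
print(score, pairs)
```

    Note that a matching places no nesting constraint on the selected pairs, so crossing pairs are allowed whenever the weights favor them. Practical use on full rRNA alignments would call for a polynomial-time matching algorithm rather than this search.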

  12. Monte Carlo modelling of photodynamic therapy treatments comparing clustered three dimensional tumour structures with homogeneous tissue structures

    NASA Astrophysics Data System (ADS)

    Campbell, C. L.; Wood, K.; Brown, C. T. A.; Moseley, H.

    2016-07-01

    We explore the effects of three dimensional (3D) tumour structures on depth dependent fluence rates, photodynamic doses (PDD) and fluorescence images through Monte Carlo radiation transfer modelling of photodynamic therapy. The aim with this work was to compare the commonly used uniform tumour densities with non-uniform densities to determine the importance of including 3D models in theoretical investigations. It was found that fractal 3D models resulted in deeper penetration on average of therapeutic radiation and higher PDD. An increase in effective treatment depth of 1 mm was observed for one of the investigated fractal structures, when comparing to the equivalent smooth model. Wide field fluorescence images were simulated, revealing information about the relationship between tumour structure and the appearance of the fluorescence intensity. Our models indicate that the 3D tumour structure strongly affects the spatial distribution of therapeutic light, the PDD and the wide field appearance of surface fluorescence images.

  13. Comparative structural analysis of eukaryotic flagella and cilia from Chlamydomonas, Tetrahymena, and sea urchins.

    PubMed

    Pigino, Gaia; Maheshwari, Aditi; Bui, Khanh Huy; Shingyoji, Chikako; Kamimura, Shinji; Ishikawa, Takashi

    2012-05-01

    Although eukaryotic flagella and cilia all share the basic 9+2 microtubule-organization of their internal axonemes, and are capable of generating bending-motion, the waveforms, amplitudes, and velocities of the bending-motions are quite diverse. To explore the structural basis of this functional diversity of flagella and cilia, we here compare the axonemal structure of three different organisms with widely divergent bending-motions by electron cryo-tomography. We reconstruct the 3D structure of the axoneme of Tetrahymena cilia, and compare it with the axoneme of the flagellum of sea urchin sperm, as well as with the axoneme of Chlamydomonas flagella, which we analyzed previously. This comparative structural analysis defines the diversity of molecular architectures in these organisms, and forms the basis for future correlation with their different bending-motions. PMID:22406282

  15. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers

    PubMed Central

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2016-01-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025

  17. Thriving in multidisciplinary research: advice for new bioinformatics students.

    PubMed

    Auerbach, Raymond K

    2012-09-01

    The sciences have seen a large increase in demand for students in bioinformatics and multidisciplinary fields in general. Many new educational programs have been created to satisfy this demand, but navigating these programs requires a non-traditional outlook and emphasizes working in teams of individuals with distinct yet complementary skill sets. Written from the perspective of a current bioinformatics student, this article seeks to offer advice to prospective and current students in bioinformatics regarding what to expect in their educational program, how multidisciplinary fields differ from more traditional paths, and decisions that they will face on the road to becoming successful, productive bioinformaticists.

  18. Survey of MapReduce frame operation in bioinformatics.

    PubMed

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756
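
    As a reminder of the programming model behind such applications, the map, shuffle and reduce phases can be imitated in a single Python process. This is only a sketch of the pattern applied to k-mer counting, a task chosen here for illustration; it does not use Hadoop or any of its APIs, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(read, k):
    """Map: emit a (k-mer, 1) pair for every k-mer in one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run the three phases in sequence over a list of reads."""
    pairs = (pair for read in reads for pair in map_phase(read, k))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = count_kmers(["ACGT", "CGTA"], 3)
print(counts)
```

    In a real MapReduce framework the map and reduce calls would run on different cluster nodes and the shuffle would move data between them; here everything runs in one process so that the data flow is easy to follow.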

  19. Life comparative analysis of energy consumption and CO₂ emissions of different building structural frame types.

    PubMed

    Kim, Sangyong; Moon, Joon-Ho; Shin, Yoonseok; Kim, Gwang-Hee; Seo, Deok-Seok

    2013-01-01

    The objective of this research is to quantitatively measure and compare the environmental load and construction cost of different structural frame types. Construction cost also accounts for the costs of CO₂ emissions of input materials. The choice of structural frame type is a major consideration in construction, as this element represents about 33% of total building construction costs. In this research, four constructed buildings were analyzed, with these having either reinforced concrete (RC) or steel (S) structures. An input-output framework analysis was used to measure energy consumption and CO₂ emissions of input materials for each structural frame type. In addition, the CO₂ emissions cost was measured using the trading price of CO₂ emissions on the International Commodity Exchange. This research revealed that both energy consumption and CO₂ emissions were, on average, 26% lower with the RC structure than with the S structure, and the construction costs (including the CO₂ emissions cost) of the RC structure were about 9.8% lower, compared to the S structure. This research provides insights through which the construction industry will be able to respond to the carbon market, which is expected to continue to grow in the future.

  20. Evolution of web services in bioinformatics.

    PubMed

    Neerincx, Pieter B T; Leunissen, Jack A M

    2005-06-01

    Bioinformaticians have developed large collections of tools to make sense of the rapidly growing pool of molecular biological data. Biological systems tend to be complex and in order to understand them, it is often necessary to link many data sets and use more than one tool. Therefore, bioinformaticians have experimented with several strategies to try to integrate data sets and tools. Owing to the lack of standards for data sets and the interfaces of the tools this is not a trivial task. Over the past few years building services with web-based interfaces has become a popular way of sharing the data and tools that have resulted from many bioinformatics projects. This paper discusses the interoperability problem and how web services are being used to try to solve it, resulting in the evolution of tools with web interfaces from HTML/web form-based tools not suited for automatic workflow generation to a dynamic network of XML-based web services that can easily be used to create pipelines.

  1. Bioinformatic tools for microRNA dissection

    PubMed Central

    Akhtar, Most Mauluda; Micolucci, Luigina; Islam, Md Soriful; Olivieri, Fabiola; Procopio, Antonio Domenico

    2016-01-01

    Recently, microRNAs (miRNAs) have emerged as important elements of gene regulatory networks. MiRNAs are endogenous single-stranded non-coding RNAs (∼22-nt long) that regulate gene expression at the post-transcriptional level. Through pairing with mRNA, miRNAs can down-regulate gene expression by inhibiting translation or stimulating mRNA degradation. In some cases they can also up-regulate the expression of a target gene. MiRNAs influence a variety of cellular pathways that range from development to carcinogenesis. The involvement of miRNAs in several human diseases, particularly cancer, makes them potential diagnostic and prognostic biomarkers. Recent technological advances, especially high-throughput sequencing, have led to an exponential growth in the generation of miRNA-related data. A number of bioinformatic tools and databases have been devised to manage this growing body of data. We analyze 129 miRNA tools that are being used in diverse areas of miRNA research, to assist investigators in choosing the most appropriate tools for their needs. PMID:26578605

  2. Bacterial bioinformatics: pathogenesis and the genome.

    PubMed

    Paine, Kelly; Flower, Darren R

    2002-07-01

    As the number of completed microbial genome sequences continues to grow, there is a pressing need for the exploitation of this wealth of data through a synergistic interaction between the well-established science of bacteriology and the emergent discipline of bioinformatics. Antibiotic resistance and pathogenicity in virulent bacteria have become an increasing problem, with even the strongest drugs useless against some species, such as multi-drug resistant Enterococcus faecium and Mycobacterium tuberculosis. The global spread of Human Immunodeficiency Virus (HIV) and Acquired Immune Deficiency Syndrome (AIDS) has contributed to the re-emergence of tuberculosis and the threat from new and emergent diseases. To address these problems, bacterial pathogenicity requires redefinition as Koch's postulates become obsolete. This review discusses how the use of bacterial genomic information, and the in silico tools available at present, may aid in determining the definition of a current pathogen. The combination of both fields should provide a rapid and efficient way of assisting in the future development of antimicrobial therapies. PMID:12125816

  3. Bioinformatic tools for microRNA dissection.

    PubMed

    Akhtar, Most Mauluda; Micolucci, Luigina; Islam, Md Soriful; Olivieri, Fabiola; Procopio, Antonio Domenico

    2016-01-01

    Recently, microRNAs (miRNAs) have emerged as important elements of gene regulatory networks. MiRNAs are endogenous single-stranded non-coding RNAs (~22-nt long) that regulate gene expression at the post-transcriptional level. Through pairing with mRNA, miRNAs can down-regulate gene expression by inhibiting translation or stimulating mRNA degradation. In some cases they can also up-regulate the expression of a target gene. MiRNAs influence a variety of cellular pathways that range from development to carcinogenesis. The involvement of miRNAs in several human diseases, particularly cancer, makes them potential diagnostic and prognostic biomarkers. Recent technological advances, especially high-throughput sequencing, have led to an exponential growth in the generation of miRNA-related data. A number of bioinformatic tools and databases have been devised to manage this growing body of data. We analyze 129 miRNA tools that are being used in diverse areas of miRNA research, to assist investigators in choosing the most appropriate tools for their needs.

  4. Comparative analysis of seismic response characteristics of pile-soil-structure interaction system

    NASA Astrophysics Data System (ADS)

    Kong, Desen; Luan, Maotian; Wang, Weiming

    2006-01-01

    The earthquake-resistant performance of a pile-soil-structure interaction system is a relatively complicated and important issue in civil engineering practice. In this paper, a computational model and computation procedures for pile-supported structures, which can duly consider the pile-soil interaction effect, are established by the finite element method. Numerical implementation is made in the time domain. A simplified approximation for the seismic response analysis of pile-soil-structure systems is briefly presented. Then a comparative study is performed for an engineering example with numerical results computed respectively by the finite element method and the simplified method. Through comparative analysis, it is shown that the results obtained by the simplified method agree well with those achieved by the finite element method. The numerical results and findings will offer instructive guidelines for earthquake-resistant analysis and design of pile-supported structures.

  5. Bioinformatical approaches to characterize intrinsically disordered/unstructured proteins.

    PubMed

    Dosztányi, Zsuzsanna; Mészáros, Bálint; Simon, István

    2010-03-01

    Intrinsically disordered/unstructured proteins exist without a stable three-dimensional (3D) structure as highly flexible conformational ensembles. The available genome sequences revealed that these proteins are surprisingly common and their frequency reaches high proportions in eukaryotes. Due to their vital role in various biological processes including signaling and regulation and their involvement in various diseases, disordered proteins and protein segments are the focus of many biochemical, molecular biological, pathological and pharmaceutical studies. These proteins are difficult to study experimentally because of the lack of unique structure in the isolated form. Their amino acid sequence, however, is available, and can be used for their identification and characterization by bioinformatic tools, analogously to globular proteins. In this review, we first present a small survey of current methods to identify disordered proteins or protein segments, focusing on those that are publicly available as web servers. In more detail we also discuss approaches that predict disordered regions and specific regions involved in protein binding by modeling the physical background of protein disorder. In our review we argue that the heterogeneity of disordered segments needs to be taken into account for a better understanding of protein disorder.

  6. A comparative overview of modal testing and system identification for control of structures

    NASA Technical Reports Server (NTRS)

    Juang, J.-N.; Pappa, R. S.

    1988-01-01

    A comparative overview is presented of the disciplines of modal testing used in structural engineering and system identification used in control theory. A list of representative references from both areas is given, and the basic methods are described briefly. Recent progress on the interaction of modal testing and control disciplines is discussed. It is concluded that combined efforts of researchers in both disciplines are required for unification of modal testing and system identification methods for control of flexible structures.

  7. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promoter region. In this work we present the advantages and disadvantages of applying different bioinformatics programs to localize transcription factor binding sites in the upstream regions of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promoter regions. The results have been compared and the possible functions of chosen motifs have been described.
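
    As a rough illustration of the kind of motif search these programs perform, the sketch below scans an upstream sequence for exact consensus matches on both strands. The motif table, the function names and the example sequence are assumptions made for this sketch; they are not taken from the study, and the real tools rely on curated motif databases and more flexible matching.

```python
import re

# Illustrative consensus motifs (assumed for this sketch, not from the study).
MOTIFS = {
    "TATA-box": "TATAAA",
    "CAAT-box": "CCAAT",
    "G-box": "CACGTG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_promoter(seq, motifs=MOTIFS):
    """Return (motif name, 0-based start, strand) for every exact match."""
    seq = seq.upper()
    hits = []
    for name, pattern in motifs.items():
        # A lookahead reports overlapping occurrences as well.
        for m in re.finditer(f"(?={pattern})", seq):
            hits.append((name, m.start(), "+"))
        rc = reverse_complement(pattern)
        if rc != pattern:  # skip palindromic motifs to avoid double counting
            for m in re.finditer(f"(?={rc})", seq):
                hits.append((name, m.start(), "-"))
    return sorted(hits, key=lambda h: h[1])


# Hypothetical upstream fragment, used only to exercise the function.
upstream = "GGCACGTGCCTATAAAGGATTGGCC"
for hit in scan_promoter(upstream):
    print(hit)
```

    Exact string matching is the simplest possible choice here; position weight matrices or degenerate IUPAC patterns would be the natural next step when comparing against the output of dedicated tools.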

  8. [Bioinformatic research of the family of PEX11, peroxisome proliferous factor in fungus].

    PubMed

    Zhang, Xin; Jiang, Hua; Wang, Yan-Li; Zhang, Zhen; Mao, Xue-Qin; Chai, Rong-Yao; Qiu, Hai-Ping; Du, Xin-Fa; Wang, Jiao-Yu; Sun, Guo-Chang

    2012-05-01

    The family members of PEX11 are key factors involved in regulation of peroxisome proliferation. Sixty-six PEX11p candidates of the PEX11 gene family from 26 representative fungal species were obtained and analyzed by bioinformatic strategies. In most filamentous fungi, 2 or 3 potential PEX11ps were found, in contrast with 1 or 2 in yeast species. Compared with other fungal species, the Ascomycetes tend to have more PEX11ps, and even 5 in several individuals. The data of phylogenetic analysis and protein structure indicated that all of the PEX11ps were divided into 3 groups: I, II, and III. The members of group I and group III existed in most species, while those in group II were found only in Pezizomycotina. By MEME analysis, 5-6 conserved motifs were found in each PEX11p. Among them, motif 8 in the C-terminus was the most conserved, indicating that this motif probably plays a key role in maintaining the proper function of PEX11p.

  9. RNA:(guanine-N2) methyltransferases RsmC/RsmD and their homologs revisited – bioinformatic analysis and prediction of the active site based on the uncharacterized Mj0882 protein structure

    PubMed Central

    2002-01-01

    Background Escherichia coli guanine-N2 (m2G) methyltransferases (MTases) RsmC and RsmD modify nucleosides G1207 and G966 of 16S rRNA. They possess a common MTase domain in the C-terminus and a variable region in the N-terminus. Their C-terminal domain is related to the YbiN family of hypothetical MTases, but nothing is known about the structure or function of the N-terminal domain. Results Using a combination of sequence database searches and fold recognition methods it has been demonstrated that the N-termini of RsmC and RsmD are related to each other and that they represent a "degenerated" version of the C-terminal MTase domain. Novel members of the YbiN family from Archaea and Eukaryota were also identified. It is inferred that YbiN and both domains of RsmC and RsmD are closely related to a family of putative MTases from Gram-positive bacteria and Archaea, typified by the Mj0882 protein from M. jannaschii (1dus in PDB). Based on the results of sequence analysis and structure prediction, the residues involved in cofactor binding, target recognition and catalysis were identified, and the mechanism of the guanine-N2 methyltransfer reaction was proposed. Conclusions Using the known Mj0882 structure, a comprehensive analysis of sequence-structure-function relationships in the family of genuine and putative m2G MTases was performed. The results provide novel insight into the mechanism of m2G methylation and will serve as a platform for experimental analysis of numerous uncharacterized N-MTases. PMID:11929612

  10. Can we integrate bioinformatics data on the Internet?

    PubMed

    Martin, A C

    2001-09-01

    The NETTAB (Network Tools and Applications in Biology) 2001 Workshop entitled 'CORBA and XML: towards a bioinformatics-integrated network environment' was held at the Advanced Biotechnology Centre, Genoa, Italy, 17-18 May 2001.

  11. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    PubMed Central

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  12. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    PubMed

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  13. Bioinformatics opportunities for identification and study of medicinal plants

    PubMed Central

    Sharma, Vivekanand

    2013-01-01

    Plants have been used as a source of medicine since historic times and several commercially important drugs are of plant-based origin. The traditional approach towards discovery of plant-based drugs often involves a significant amount of time and expenditure. These labor-intensive approaches have struggled to keep pace with the rapid development of high-throughput technologies. In the era of high volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role. This has generally been the case in the context of drug designing and discovery. However, there has been limited attention to date to the potential application of bioinformatics approaches that can leverage plant-based knowledge. Here, we review bioinformatics studies that have contributed to medicinal plants research. In particular, we highlight areas in medicinal plant research where the application of bioinformatics methodologies may result in quicker and potentially cost-effective leads toward finding plant-based remedies. PMID:22589384

  14. The potential of translational bioinformatics approaches for pharmacology research

    PubMed Central

    Li, Lang

    2015-01-01

    The field of bioinformatics has allowed the interpretation of massive amounts of biological data, ushering in the era of ‘omics’ to biomedical research. Its potential impact on pharmacology research is enormous and it has shown some emerging successes. A full realization of this potential, however, requires standardized data annotation for large health record databases and molecular data resources. Improved standardization will further stimulate the development of system pharmacology models, using translational bioinformatics methods. This new translational bioinformatics paradigm is highly complementary to current pharmacological research fields, such as personalized medicine, pharmacoepidemiology and drug discovery. In this review, I illustrate the application of transformational bioinformatics to research in numerous pharmacology subdisciplines. PMID:25753093

  15. A web services choreography scenario for interoperating bioinformatics applications

    PubMed Central

    de Knikker, Remko; Guo, Youjun; Li, Jin-long; Kwan, Albert KH; Yip, Kevin Y; Cheung, David W; Cheung, Kei-Hoi

    2004-01-01

    Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD) that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH) category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates with these web

  16. Whale song analyses using bioinformatics sequence analysis approaches

    NASA Astrophysics Data System (ADS)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools (DNA/protein sequence alignment and alignment-free methods) are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, master's thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure, and the standardized Euclidean distance and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree qualitatively with earlier classifications derived by human observation. These methods further quantify the similarities among themes and could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission by quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.]
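
    Two of the metrics named above can be sketched briefly. The code below treats a theme as a string of unit labels and computes a plain Smith-Waterman local alignment score and the angle between word-frequency vectors. The scoring values, the word length k and the example themes are assumptions for illustration; the "repeated procedure" and the standardized Euclidean distance used in the study are not reproduced here.

```python
import math
from collections import Counter


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score of two unit sequences (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def word_counts(seq, k=2):
    """Count overlapping words of length k in a sequence."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def angle_distance(a, b, k=2):
    """Angle in radians between the k-word frequency vectors of a and b."""
    ca, cb = word_counts(a, k), word_counts(b, k)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    if norm == 0:
        raise ValueError("both sequences must contain at least one k-word")
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


# Hypothetical themes written as strings of unit labels.
theme_1, theme_2 = "ABABC", "ABABD"
print(smith_waterman_score(theme_1, theme_2))
print(angle_distance(theme_1, theme_2))
```

    A matrix of such pairwise values over all themes is the kind of input a hierarchical clustering routine would need to produce the dendrograms described in the abstract.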

  17. [Bioinformatics studies on photosynthetic system genes in cyanobacteria and chloroplasts].

    PubMed

    Shi, Ding-Ji; Zhang, Chao; Li, Shi-Ming; Li, Ci-Shan; Zhang, Peng-Peng; Yang, Ming-Li

    2004-06-01

    This study used BLAST to compare the sequence homology of genes encoding photosynthetic system proteins in cyanobacteria (Synechocystis sp. PCC6803, Nostoc sp. PCC7120) with those of chloroplasts (from Marchantia polymorpha, Nicotiana tabacum, Oryza sativa, Euglena gracilis, Pinus thunbergii, Zea mays, Odontella sinensis, Cyanophora paradoxa, Porphyra purpurea and Arabidopsis thaliana). The gene sequences of Synechocystis sp. PCC6803 were taken as the reference (100%), and the homology of the others was measured against them. Among the genes for photosystem I, psaC homology was the highest (90.14%) and psaJ the lowest (52.24%). The highest were psbD (83.71%) for photosystem II, atpB (79.58%) for ATP synthase and petB (81.66%) for the cytochrome b6/f complex. The lowest were psbN (49.70%) for photosystem II, atpF (26.69%) for ATP synthase and petA (55.27%) for the cytochrome b6/f complex. The paper also discusses why the homology of particular gene sequences was the highest or the lowest. No comparable report has been published, and this bioinformatics research may provide some evidence for the origin and evolution of chloroplasts.
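
    The percentages above are similarity values relative to a reference sequence. As a minimal sketch of how such a figure can be computed once two sequences are aligned, the function below counts identical columns over the alignment length. The function name and the toy alignment are assumptions for illustration; the abstract does not state exactly which BLAST statistic was reported.

```python
def percent_identity(aligned_ref, aligned_query):
    """Percent of alignment columns in which both sequences share a residue."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for r, q in zip(aligned_ref, aligned_query)
        if r == q and r != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_ref)


# Toy alignment (hypothetical); "-" marks a gap in the reference.
reference = "ATGGCTA-CG"
query = "ATGACTAACG"
print(percent_identity(reference, query))
```

    Dividing by the alignment length, gaps included, is one common convention; dividing by the length of the shorter sequence gives different values, so the denominator should always be stated alongside the result.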

  18. E-Learning as a new tool in bioinformatics teaching.

    PubMed

    Saravanan, Vijayakumar; Shanmughavel, Piramanayagam

    2007-11-01

    In recent years, virtual learning has grown rapidly. Universities, colleges, and secondary schools are now delivering training and education over the internet. Besides this, the resources available over the WWW are huge, and understanding the various techniques employed in the field of Bioinformatics is increasingly complex for students during implementation. Here, we discuss its importance in developing and delivering an educational system in Bioinformatics based on an e-learning environment.

  19. E-Learning as a new tool in bioinformatics teaching

    PubMed Central

    Saravanan, Vijayakumar; Shanmughavel, Piramanayagam

    2007-01-01

    In recent years, virtual learning has grown rapidly. Universities, colleges, and secondary schools are now delivering training and education over the internet. Besides this, the resources available over the WWW are huge, and understanding the various techniques employed in the field of Bioinformatics is increasingly complex for students during implementation. Here, we discuss its importance in developing and delivering an educational system in Bioinformatics based on an e-learning environment. PMID:18292800

  20. The 2015 Bioinformatics Open Source Conference (BOSC 2015)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J. A.; Lapp, Hilmar

    2016-01-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included “Data Science;” “Standards and Interoperability;” “Open Science and Reproducibility;” “Translational Bioinformatics;” “Visualization;” and “Bioinformatics Open Source Project Updates”. In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled “Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community,” that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  1. The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis

    PubMed Central

    Rampp, Markus; Soddemann, Thomas; Lederer, Hermann

    2006-01-01

    We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’). PMID:16844980

  2. Expanding our understanding of sequence-function relationships of type II polyketide biosynthetic gene clusters: bioinformatics-guided identification of Frankiamicin A from Frankia sp. EAN1pec.

    PubMed

    Ogasawara, Yasushi; Yackley, Benjamin J; Greenberg, Jacob A; Rogelj, Snezna; Melançon, Charles E

    2015-01-01

    A large and rapidly increasing number of unstudied "orphan" natural product biosynthetic gene clusters are being uncovered in sequenced microbial genomes. An important goal of modern natural products research is to be able to accurately predict natural product structures and biosynthetic pathways from these gene cluster sequences. This requires both development of bioinformatic methods for global analysis of these gene clusters and experimental characterization of select products produced by gene clusters with divergent sequence characteristics. Here, we conduct global bioinformatic analysis of all available type II polyketide gene cluster sequences and identify a conserved set of gene clusters with unique ketosynthase α/β sequence characteristics in the genomes of Frankia species, a group of Actinobacteria with underexploited natural product biosynthetic potential. Through LC-MS profiling of extracts from several Frankia species grown under various conditions, we identified Frankia sp. EAN1pec as producing a compound with spectral characteristics consistent with the type II polyketide produced by this gene cluster. We isolated the compound, a pentangular polyketide which we named frankiamicin A, and elucidated its structure by NMR and labeled precursor feeding. We also propose biosynthetic and regulatory pathways for frankiamicin A based on comparative genomic analysis and literature precedent, and conduct bioactivity assays of the compound. Our findings provide new information linking this set of Frankia gene clusters with the compound they produce, and our approach has implications for accurate functional prediction of the many other type II polyketide clusters present in bacterial genomes. PMID:25837682

  3. Expanding our Understanding of Sequence-Function Relationships of Type II Polyketide Biosynthetic Gene Clusters: Bioinformatics-Guided Identification of Frankiamicin A from Frankia sp. EAN1pec

    PubMed Central

    Ogasawara, Yasushi; Yackley, Benjamin J.; Greenberg, Jacob A.; Rogelj, Snezna; Melançon, Charles E.

    2015-01-01

    A large and rapidly increasing number of unstudied “orphan” natural product biosynthetic gene clusters are being uncovered in sequenced microbial genomes. An important goal of modern natural products research is to be able to accurately predict natural product structures and biosynthetic pathways from these gene cluster sequences. This requires both development of bioinformatic methods for global analysis of these gene clusters and experimental characterization of select products produced by gene clusters with divergent sequence characteristics. Here, we conduct global bioinformatic analysis of all available type II polyketide gene cluster sequences and identify a conserved set of gene clusters with unique ketosynthase α/β sequence characteristics in the genomes of Frankia species, a group of Actinobacteria with underexploited natural product biosynthetic potential. Through LC-MS profiling of extracts from several Frankia species grown under various conditions, we identified Frankia sp. EAN1pec as producing a compound with spectral characteristics consistent with the type II polyketide produced by this gene cluster. We isolated the compound, a pentangular polyketide which we named frankiamicin A, and elucidated its structure by NMR and labeled precursor feeding. We also propose biosynthetic and regulatory pathways for frankiamicin A based on comparative genomic analysis and literature precedent, and conduct bioactivity assays of the compound. Our findings provide new information linking this set of Frankia gene clusters with the compound they produce, and our approach has implications for accurate functional prediction of the many other type II polyketide clusters present in bacterial genomes. PMID:25837682

  4. Gender Differences in Structured Risk Assessment: Comparing the Accuracy of Five Instruments

    ERIC Educational Resources Information Center

    Coid, Jeremy; Yang, Min; Ullrich, Simone; Zhang, Tianqiang; Sizmur, Steve; Roberts, Colin; Farrington, David P.; Rogers, Robert D.

    2009-01-01

    Structured risk assessment should guide clinical risk management, but it is uncertain which instrument has the highest predictive accuracy among men and women. In the present study, the authors compared the Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991, 2003); the Historical, Clinical, Risk Management-20 (HCR-20; C. D. Webster, K. S.…

  5. Linguistic Structure and Non-linguistic Cognition: English and Russian Blues Compared.

    ERIC Educational Resources Information Center

    Laws, Glynis; And Others

    1995-01-01

    Investigates the influence of linguistic structure on non-linguistic cognition by comparing Russian and English behavior on tasks involving the color blue. Russians, who differentiate this region into "dark blue" and "light blue," were expected to separate blues more often than English subjects for whom the colors belong to one lexical category.…

  6. The Company That Words Keep: Comparing the Statistical Structure of Child- versus Adult-Directed Language

    ERIC Educational Resources Information Center

    Hills, Thomas

    2013-01-01

    Does child-directed language differ from adult-directed language in ways that might facilitate word learning? Associative structure (the probability that a word appears with its free associates), contextual diversity, word repetitions and frequency were compared longitudinally across six language corpora, with four corpora of language directed at…

  7. Comparing Religious Education in Canadian and Australian Catholic High Schools: Identifying Some Key Structural Issues

    ERIC Educational Resources Information Center

    Rymarz, Richard

    2013-01-01

    Religious education (RE) in Catholic high schools in Australia and Canada is compared by examining some of the underlying structural factors that shape the delivery of RE. It is argued that in Canadian Catholic schools RE is diminished by three factors that distinguish it from the Australian experience. These are: the level and history of…

  8. Estimating, Testing, and Comparing Specific Effects in Structural Equation Models: The Phantom Model Approach

    ERIC Educational Resources Information Center

    Macho, Siegfried; Ledermann, Thomas

    2011-01-01

    The phantom model approach for estimating, testing, and comparing specific effects within structural equation models (SEMs) is presented. The rationale underlying this novel method consists in representing the specific effect to be assessed as a total effect within a separate latent variable model, the phantom model that is added to the main…

  9. Comparative genomics reveals 104 candidate structured RNAs from bacteria, archaea, and their metagenomes

    PubMed Central

    2010-01-01

    Background Structured noncoding RNAs perform many functions that are essential for protein synthesis, RNA processing, and gene regulation. Structured RNAs can be detected by comparative genomics, in which homologous sequences are identified and inspected for mutations that conserve RNA secondary structure. Results By applying a comparative genomics-based approach to genome and metagenome sequences from bacteria and archaea, we identified 104 candidate structured RNAs and inferred putative functions for many of these. Twelve candidate metabolite-binding RNAs were identified, three of which were validated, including one reported herein that binds the coenzyme S-adenosylmethionine. Newly identified cis-regulatory RNAs are implicated in photosynthesis or nitrogen regulation in cyanobacteria, purine and one-carbon metabolism, stomach infection by Helicobacter, and many other physiological processes. A candidate riboswitch termed crcB is represented in both bacteria and archaea. Another RNA motif may control gene expression from 3'-untranslated regions of mRNAs, which is unusual for bacteria. Many noncoding RNAs that likely act in trans are also revealed, and several of the noncoding RNA candidates are found mostly or exclusively in metagenome DNA sequences. Conclusions This work greatly expands the variety of highly structured noncoding RNAs known to exist in bacteria and archaea and provides a starting point for biochemical and genetic studies needed to validate their biologic functions. Given the sustained rate of RNA discovery over several similar projects, we expect that far more structured RNAs remain to be discovered from bacterial and archaeal organisms. PMID:20230605
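
    The idea of inspecting homologous sequences for mutations that conserve secondary structure can be sketched as a covariation check. Given an alignment and a consensus structure in dot-bracket notation, the code below reports, for each paired column, how often a canonical or wobble pair is kept and how many distinct pair types occur. The function names, the toy alignment and the structure are assumptions for illustration, not the pipeline used in the study.

```python
# Watson-Crick and G-U wobble pairs are treated as structure-preserving.
CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def paired_columns(structure):
    """Return sorted (i, j) column pairs from a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def covariation_report(alignment, structure):
    """Summarize base-pair conservation for each paired column."""
    if any(len(s) != len(structure) for s in alignment):
        raise ValueError("every sequence must match the structure length")
    report = []
    for i, j in paired_columns(structure):
        observed = [(s[i], s[j]) for s in alignment]
        canonical = [p for p in observed if p in CANONICAL]
        report.append({
            "columns": (i, j),
            "fraction_paired": len(canonical) / len(observed),
            # More than one pair type with full pairing suggests compensatory changes.
            "distinct_pairs": len(set(canonical)),
        })
    return report


# Toy alignment of three hypothetical homologs sharing one hairpin.
alignment = ["GCAAAGC", "CGAAACG", "GCAUAGU"]
structure = "((...))"
for row in covariation_report(alignment, structure):
    print(row)
```

    Columns where pairing is fully retained while the paired bases themselves differ between sequences are the signal of interest, since they are hard to explain without selection on the structure.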

  10. ModeRNA: a tool for comparative modeling of RNA 3D structure

    PubMed Central

    Rother, Magdalena; Rother, Kristian; Puton, Tomasz; Bujnicki, Janusz M.

    2011-01-01

    RNA is a large group of functionally important biomacromolecules. In striking analogy to proteins, the function of RNA depends on its structure and dynamics, which in turn is encoded in the linear sequence. However, while there are numerous methods for computational prediction of protein three-dimensional (3D) structure from sequence, with comparative modeling being the most reliable approach, there are very few such methods for RNA. Here, we present ModeRNA, a software tool for comparative modeling of RNA 3D structures. As an input, ModeRNA requires a 3D structure of a template RNA molecule, and a sequence alignment between the target to be modeled and the template. It must be emphasized that a good alignment is required for successful modeling, and for large and complex RNA molecules the development of a good alignment usually requires manual adjustments of the input data based on previous expertise of the respective RNA family. ModeRNA can model post-transcriptional modifications, a functionally important feature analogous to post-translational modifications in proteins. ModeRNA can also model DNA structures or use them as templates. It is equipped with many functions for merging fragments of different nucleic acid structures into a single model and analyzing their geometry. Windows and UNIX implementations of ModeRNA with comprehensive documentation and a tutorial are freely available. PMID:21300639

  11. Structural complexity of DNA sequence.

    PubMed

    Liou, Cheng-Yuan; Tseng, Shen-Han; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results. PMID:23662161

  12. Structural Complexity of DNA Sequence

    PubMed Central

    Liou, Cheng-Yuan; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results. PMID:23662161

  13. Standardizing the next generation of bioinformatics software development with BioHDF (HDF5).

    PubMed

    Mason, Christopher E; Zumbo, Paul; Sanders, Stephan; Folk, Mike; Robinson, Dana; Aydt, Ruth; Gollery, Martin; Welsh, Mark; Olson, N Eric; Smith, Todd M

    2010-01-01

    Next Generation Sequencing technologies are limited by the lack of standard bioinformatics infrastructures that can reduce data storage, increase data processing performance, and integrate diverse information. HDF technologies address these requirements and have a long history of use in data-intensive science communities. They include general data file formats, libraries, and tools for working with the data. Compared to emerging standards, such as the SAM/BAM formats, HDF5-based systems demonstrate significantly better scalability, can support multiple indexes, store multiple data types, and are self-describing. For these reasons, HDF5 and its BioHDF extension are well suited for implementing data models to support the next generation of bioinformatics applications. PMID:20865556

  14. Proteomic and bioinformatic analyses of spinal cord injury-induced skeletal muscle atrophy in rats

    PubMed Central

    WEI, ZHI-JIAN; ZHOU, XIAN-HU; FAN, BAO-YOU; LIN, WEI; REN, YI-MING; FENG, SHI-QING

    2016-01-01

    Spinal cord injury (SCI) may result in skeletal muscle atrophy. Identifying diagnostic biomarkers and effective targets for treatment is an important challenge in clinical work. The aim of the present study is to elucidate potential biomarkers and therapeutic targets for SCI-induced muscle atrophy (SIMA) using proteomic and bioinformatic analyses. The protein samples from rat soleus muscle were collected at different time points following SCI injury and separated by two-dimensional gel electrophoresis and compared with the sham group. The identities of these protein spots were analyzed by mass spectrometry (MS). MS demonstrated that 20 proteins associated with muscle atrophy were differentially expressed. Bioinformatic analyses indicated that SIMA changed the expression of proteins associated with cellular, developmental, immune system and metabolic processes, biological adhesion and localization. The results of the present study may be beneficial in understanding the molecular mechanisms of SIMA and elucidating potential biomarkers and targets for the treatment of muscle atrophy. PMID:27177391

  15. [Bioinformatics analysis of the expansin gene family in rice].

    PubMed

    Shi, Yang; Xu, Xiao; Li, Haoyang; Xu, Qian; Xu, Jichen

    2014-08-01

    Expansin refers to a family of nonenzymatic proteins found in the plant cell wall with important roles in plant cell growth, developmental processes, and resistance to stress. Whole rice genome sequencing revealed that it contains 58 expansin genes, which belong to 4 subfamilies (A (34), B (19), LA (4) and LB (1)). All the genes were located on 10 of the 12 rice chromosomes, where several subfamily members clustered. The expansin genes ranged from 687 bp to 1128 bp in size. Sequence alignment showed that all expansins had three structural domains with two conserved amino acids, cysteine in the N-terminus and tryptophan in the C-terminus. The amino acid identity of members among different subfamilies was less than 35%, while that within the same subfamily was more than 35%. Most genes of the A subfamily had 1 or 2 introns, while genes of the B, LA and LB subfamilies had 3, 4 and 4 introns, respectively. Statistical analysis of codon usage showed that expansins in rice have 26 high-frequency codons, which are more biased than those in other species. These bioinformatics findings will be helpful for the further study of the function and evolution of expansin genes.
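    As a rough illustration of the codon-usage statistics mentioned above, the sketch below tallies relative codon frequencies in a coding sequence. The abstract does not say which measure was used to call a codon "high-frequency", so the function and the toy sequence are assumptions, not the authors' method.

```python
from collections import Counter


def codon_frequencies(cds):
    """Return the relative frequency of each codon in a coding sequence.

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[k:k + 3] for k in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical toy sequence (not an expansin gene): ATG GCC GCC GCG TAA
toy_cds = "ATGGCCGCCGCGTAA"
freqs = codon_frequencies(toy_cds)
for codon, freq in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(codon, round(freq, 2))
```

    Comparing such frequencies across synonymous codons, and across species, is one simple way to look for the kind of bias the abstract reports.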

  16. Databases, models, and algorithms for functional genomics: a bioinformatics perspective.

    PubMed

    Singh, Gautam B; Singh, Harkirat

    2005-02-01

    A variety of patterns have been observed on the DNA and protein sequences that serve as control points for gene expression and cellular functions. Owing to the vital role of such patterns discovered on biological sequences, they are generally cataloged and maintained within internationally shared databases. Furthermore, the variability in a family of observed patterns is often represented using computational models in order to facilitate their search within an uncharacterized biological sequence. As the biological data is comprised of a mosaic of sequence-level motifs, it is important to unravel the synergies of macromolecular coordination utilized in cell-specific differential synthesis of proteins. This article provides an overview of the various pattern representation methodologies and surveys the pattern databases available to molecular biologists. Our aim is to describe the principles behind the computational modeling and analysis techniques utilized in bioinformatics research, with the objective of providing insight necessary to better understand and effectively utilize the available databases and analysis tools. We also provide a detailed review of DNA sequence level patterns responsible for structural conformations within the Scaffold or Matrix Attachment Regions (S/MARs).

  17. Bioinformatic Characterization of Glycyl Radical Enzyme-Associated Bacterial Microcompartments

    PubMed Central

    Zarzycki, Jan; Erbilgin, Onur

    2015-01-01

    Bacterial microcompartments (BMCs) are proteinaceous organelles encapsulating enzymes that catalyze sequential reactions of metabolic pathways. BMCs are phylogenetically widespread; however, only a few BMCs have been experimentally characterized. Among them are the carboxysomes and the propanediol- and ethanolamine-utilizing microcompartments, which play diverse metabolic and ecological roles. The substrate of a BMC is defined by its signature enzyme. In catabolic BMCs, this enzyme typically generates an aldehyde. Recently, it was shown that the most prevalent signature enzymes encoded by BMC loci are glycyl radical enzymes, yet little is known about the function of these BMCs. Here we characterize the glycyl radical enzyme-associated microcompartment (GRM) loci using a combination of bioinformatic analyses and active-site and structural modeling to show that the GRMs comprise five subtypes. We predict distinct functions for the GRMs, including the degradation of choline, propanediol, and fuculose phosphate. This is the first family of BMCs for which identification of the signature enzyme is insufficient for predicting function. The distinct GRM functions are also reflected in differences in shell composition and apparently different assembly pathways. The GRMs are the counterparts of the vitamin B12-dependent propanediol- and ethanolamine-utilizing BMCs, which are frequently associated with virulence. This study provides a comprehensive foundation for experimental investigations of the diverse roles of GRMs. Understanding this plasticity of function within a single BMC family, including characterization of differences in permeability and assembly, can inform approaches to BMC bioengineering and the design of therapeutics. PMID:26407889

  18. Structures, properties, and functions of the stings of honey bees and paper wasps: a comparative study

    PubMed Central

    Zhao, Zi-Long; Zhao, Hong-Ping; Ma, Guo-Jun; Wu, Cheng-Wei; Yang, Kai; Feng, Xi-Qiao

    2015-01-01

    Through natural selection, many animal organs with similar functions have evolved different macroscopic morphologies and microscopic structures. Here, we comparatively investigate the structures, properties and functions of honey bee stings and paper wasp stings. Their elegant structures were systematically observed. To examine how they penetrate different materials, we performed penetration–extraction tests and slow motion analyses of their insertion process. In comparison, the barbed stings of honey bees are relatively difficult to withdraw from fibrous tissues (e.g. skin), while the removal of paper wasp stings is easier due to their different structures and insertion skills. The similarities and differences of the two kinds of stings are summarized on the basis of the experiments and observations. PMID:26002929

  19. Comparative Application of Capacity Models for Seismic Vulnerability Evaluation of Existing RC Structures

    SciTech Connect

    Faella, C.; Lima, C.; Martinelli, E.; Nigro, E.

    2008-07-08

    Seismic vulnerability assessment of existing buildings is one of the most common tasks in which Structural Engineers are currently engaged. Since it is often a preliminary step to approach the issue of how to retrofit non-seismic designed and detailed structures, it plays a key role in the successful choice of the most suitable strengthening technique. In this framework, the basic information for both seismic assessment and retrofitting is related to the formulation of capacity models for structural members. Plenty of proposals, often contradictory from the quantitative standpoint, are currently available within the technical and scientific literature for defining the structural capacity in terms of force and displacements, possibly with reference to different parameters representing the seismic response. The present paper briefly reviews some of the models for capacity of RC members and compares them with reference to two case studies assumed as representative of a wide class of existing buildings.

  20. Analyzing gene expression profiles in dilated cardiomyopathy via bioinformatics methods

    PubMed Central

    Wang, Liming; Zhu, L.; Luan, R.; Wang, L.; Fu, J.; Wang, X.; Sui, L.

    2016-01-01

    Dilated cardiomyopathy (DCM) is characterized by ventricular dilatation, and it is a common cause of heart failure and cardiac transplantation. This study aimed to explore potential DCM-related genes and their underlying regulatory mechanism using methods of bioinformatics. The gene expression profiles of GSE3586 were downloaded from Gene Expression Omnibus database, including 15 normal samples and 13 DCM samples. The differentially expressed genes (DEGs) were identified between normal and DCM samples using Limma package in R language. Pathway enrichment analysis of DEGs was then performed. Meanwhile, the potential transcription factors (TFs) and microRNAs (miRNAs) of these DEGs were predicted based on their binding sequences. In addition, DEGs were mapped to the cMap database to find the potential small molecule drugs. A total of 4777 genes were identified as DEGs by comparing gene expression profiles between DCM and control samples. DEGs were significantly enriched in 26 pathways, such as lymphocyte TarBase pathway and androgen receptor signaling pathway. Furthermore, potential TFs (SP1, LEF1, and NFAT) were identified, as well as potential miRNAs (miR-9, miR-200 family, and miR-30 family). Additionally, small molecules like isoflupredone and trihexyphenidyl were found to be potential therapeutic drugs for DCM. The identified DEGs (PRSS12 and FOXG1), potential TFs, as well as potential miRNAs, might be involved in DCM. PMID:27737314

  1. Ab Initio Protein Tertiary Structure Prediction: Comparative-Genetic Algorithm with Graph Theoretical Methods

    SciTech Connect

    Gregurick, S. K.

    2001-04-20

    During the period from September 1, 1998 until September 1, 2000 I was awarded a Sloan/DOE postdoctoral fellowship to work in collaboration with Professor John Moult at the Center for Advanced Research in Biotechnology (CARB). Our research project, "Ab Initio Protein Tertiary Structure Prediction and a Comparative Genetic algorithm", yielded promising initial results. In short, the project is designed to predict the native fold, or native tertiary structure, of a given protein by inputting only the primary sequence of the protein (one or three letter code). The algorithm is based on a general learning, or evolutionary algorithm and is called Genetic Algorithm (GAS). In our particular application of GAS, we search for native folds, or lowest energy structures, using two different descriptions for the interactions of the atoms and residues in a given protein sequence. One potential energy function is based on a free energy description, while the other function is a threading potential derived by Moult and Samudrala. This modified genetic algorithm was loosely termed a Comparative Genetic Algorithm and was designed to search for native folded structures on both potential energy surfaces, simultaneously. We tested the algorithm on a series of peptides ranging from 11 to 15 residues in length, which are thought to be independent folding units and thereby will fold to native structures independent of the larger protein environment. Our initial results indicated a modest increase in accuracy, as compared to a standard Genetic Algorithm. We are now in the process of improving the algorithm to increase the sensitivity to other inputs, such as secondary structure requirements. The project did not involve additional students and as of yet, the work has not been published.
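    The idea of a genetic algorithm that searches two energy surfaces at once can be sketched with a toy example. Everything below is hypothetical: the two energy functions are simple stand-ins (not the free energy description or the Moult and Samudrala threading potential), and summing them is only one assumed way of combining the surfaces, since the report does not say how this was done.

```python
import random


def energy_a(angles):
    # Hypothetical stand-in for the first surface: favors angles near -60 degrees.
    return sum((a + 60.0) ** 2 for a in angles) / 1000.0


def energy_b(angles):
    # Hypothetical stand-in for the second surface: penalizes abrupt
    # changes between neighboring angles.
    return sum(abs(x - y) for x, y in zip(angles, angles[1:])) / 100.0


def combined_energy(angles):
    # Assumption: both surfaces are scored together by simple summation.
    return energy_a(angles) + energy_b(angles)


def run_ga(n_angles=12, pop_size=30, generations=50, seed=0):
    """Toy genetic algorithm over lists of torsion-like angles (degrees)."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
        for _ in range(pop_size)
    ]
    history = []  # best combined energy per generation
    for _ in range(generations):
        population.sort(key=combined_energy)
        history.append(combined_energy(population[0]))
        survivors = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_angles)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            k = rng.randrange(n_angles)           # mutate a single angle
            child[k] = max(-180.0, min(180.0, child[k] + rng.gauss(0.0, 15.0)))
            children.append(child)
        population = survivors + children
    population.sort(key=combined_energy)
    history.append(combined_energy(population[0]))
    return population[0], history


best, history = run_ga()
print("initial best energy:", round(history[0], 3))
print("final best energy:", round(history[-1], 3))
```

    Because the best candidates always survive into the next generation, the best energy in this sketch can never get worse from one generation to the next.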

  2. A comparative study of Whi5 and retinoblastoma proteins: from sequence and structure analysis to intracellular networks

    PubMed Central

    Hasan, Md Mehedi; Brocca, Stefania; Sacco, Elena; Spinelli, Michela; Papaleo, Elena; Lambrughi, Matteo; Alberghina, Lilia; Vanoni, Marco

    2014-01-01

    Cell growth and proliferation require a complex series of tightly regulated and well-orchestrated events. Accordingly, proteins governing such events are evolutionarily conserved, even among distant organisms. By contrast, more unusual is the case of “core functions” exerted by functionally analogous proteins that are not homologous and do not share any kind of structural similarity. This is the case of proteins regulating the G1/S transition in higher eukaryotes (the retinoblastoma tumor suppressor, Rb) and in budding yeast (Whi5). The interaction landscape of Rb and Whi5 is quite large, with more than one hundred proteins interacting either genetically or physically with each protein. The Whi5 interactome has been used to construct a concept map of Whi5 function and regulation. Comparison of physical and genetic interactors of Rb and Whi5 highlights a significant core of conserved, common functionalities associated with the interactors, indicating that structure and function of the network, rather than individual proteins, are conserved during evolution. A combined bioinformatics and biochemical approach has shown that the whole Whi5 protein is highly disordered, except for a small region containing the protein family signature. The comparison with Whi5 homologs from Saccharomycetales has prompted the hypothesis of a modular organization of structural disorder, with most evolutionarily conserved regions alternating with highly variable ones. The finding of a consensus sequence points to the conservation of a specific phosphorylation rhythm along with two disordered sequence motifs, probably acting as phosphorylation-dependent seeds in Whi5 folding/unfolding. Thus, the widely disordered Whi5 appears to act as a hierarchical “date hub” that has evolutionarily assayed an original way of modular organization before being supplanted by the globular, multi-domain structured Rb, more suitable to cover the role of a “party hub”. PMID

  3. Molecular docking of Glycine max and Medicago truncatula ureases with urea; bioinformatics approaches.

    PubMed

    Filiz, Ertugrul; Vatansever, Recep; Ozyigit, Ibrahim Ilker

    2016-03-01

    Urease (EC 3.5.1.5) is a nickel-dependent metalloenzyme catalyzing the hydrolysis of urea into ammonia and carbon dioxide. It is present in many bacteria, fungi, yeasts and plants. Most species, with few exceptions, use nickel metalloenzyme urease to hydrolyze urea, which is one of the most commonly used nitrogen fertilizers in plant growth; its enzymatic hydrolysis is therefore of vital importance in agricultural practices. Considering the essentiality and importance of urea and urease activity in most plants, this study aimed to comparatively investigate the ureases of two important legume species, Glycine max (soybean) and Medicago truncatula (barrel medic), from the Fabaceae family. With additional plant species, primary and secondary structures of 37 plant ureases were comparatively analyzed using various bioinformatics tools. A structure based phylogeny was constructed using predicted 3D models of G. max and M. truncatula, whose crystallographic structures are not available, along with three additional solved urease structures from Canavalia ensiformis (PDB: 4GY7), Bacillus pasteurii (PDB: 4UBP) and Klebsiella aerogenes (PDB: 1FWJ). In addition, urease structures of these species were docked with urea to analyze the binding affinities, interacting amino acids and atom distances in urease-urea complexes. Furthermore, mutable amino acids which could potentially affect the protein active site, stability and flexibility as well as overall protein stability were analyzed in urease structures of G. max and M. truncatula. Plant ureases demonstrated similar physico-chemical properties with 833-878 amino acid residues and 89.39-90.91 kDa molecular weight with mainly acidic (5.15-6.10 pI) nature. Four protein domain structures such as urease gamma, urease beta, urease alpha and amidohydro 1 characterized the plant ureases. Secondary structure of plant ureases also demonstrated conserved protein architecture, with predominantly α-helix and random coil structures. In

  4. Bioinformatics analysis of circulating cell-free DNA sequencing data.

    PubMed

    Chan, Landon L; Jiang, Peiyong

    2015-10-01

    The discovery of cell-free DNA molecules in plasma has opened up numerous opportunities in noninvasive diagnosis. Cell-free DNA molecules have become increasingly recognized as promising biomarkers for detection and management of many diseases. The advent of next generation sequencing has provided unprecedented opportunities to scrutinize the characteristics of cell-free DNA molecules in plasma in a genome-wide fashion and at single-base resolution. Consequently, clinical applications of circulating cell-free DNA analysis have not only revolutionized noninvasive prenatal diagnosis but also facilitated cancer detection and monitoring toward an era of blood-based personalized medicine. With the remarkably increasing throughput and lowering cost of next generation sequencing, bioinformatics analysis becomes increasingly demanding to understand the large amount of data generated by these sequencing platforms. In this Review, we highlight the major bioinformatics algorithms involved in the analysis of cell-free DNA sequencing data. Firstly, we briefly describe the biological properties of these molecules and provide an overview of the general bioinformatics approach for the analysis of cell-free DNA. Then, we discuss the specific upstream bioinformatics considerations concerning the analysis of sequencing data of circulating cell-free DNA, followed by further detailed elaboration on each key clinical situation in noninvasive prenatal diagnosis and cancer management where downstream bioinformatics analysis is heavily involved. We also discuss bioinformatics analysis as well as clinical applications of the newly developed massively parallel bisulfite sequencing of cell-free DNA. Finally, we offer our perspectives on the future development of bioinformatics in noninvasive diagnosis.

  5. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics.

    PubMed

    de Matos, Paula; Cham, Jennifer A; Cao, Hong; Alcántara, Rafael; Rowland, Francis; Lopez, Rodrigo; Steinbeck, Christoph

    2013-03-20

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users' requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, 'canvas sort' card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience.

  6. A comparative study of electronic structure and bonding in transition metal monocarbides

    NASA Astrophysics Data System (ADS)

    Soni, Pooja; Pagare, Gitanjali; Sanyal, Sankar P.; Rajagopalan, M.

    2012-07-01

    The structural, electronic, elastic and bonding properties of four transition metal carbides, ScC, YC (group III), VC and NbC (group V), have been investigated systematically using the first principles density functional theory (DFT). The full potential linearized augmented plane wave (FP-LAPW) method with the generalized gradient approximation (GGA) for the exchange correlation has been used for the calculation of the total energy. The ground state properties, such as equilibrium lattice constant, bulk modulus, are computed and compared with theoretical and experimental data. The electronic and bonding patterns of the two groups of compounds have been analyzed quantitatively and compared with the available data. It is clear from band structures that all the four transition metal monocarbides are metallic in nature. Analysis of elastic constants reveals that the carbides of group III are ductile in nature while those of group V are brittle.

  7. The leader peptide of mutacin 1140 has distinct structural components compared to related class I lantibiotics.

    PubMed

    Escano, Jerome; Stauffer, Byron; Brennan, Jacob; Bullock, Monica; Smith, Leif

    2014-12-01

    Lantibiotics are ribosomally synthesized peptide antibiotics composed of an N-terminal leader peptide that promotes the core peptide's interaction with the post translational modification (PTM) enzymes. Following PTMs, mutacin 1140 is transported out of the cell and the leader peptide is cleaved to yield the antibacterial peptide. Mutacin 1140 leader peptide is structurally unique compared to other class I lantibiotic leader peptides. Herein, we further our understanding of the structural differences of mutacin 1140 leader peptide with regard to other class I leader peptides. We have determined that the length of the leader peptide is important for the biosynthesis of mutacin 1140. We have also determined that mutacin 1140 leader peptide contains a novel four amino acid motif compared to related lantibiotics. PTM enzyme recognition of the leader peptide appears to be evolutionarily distinct from related class I lantibiotics. Our study on mutacin 1140 leader peptide provides a basis for future studies aimed at understanding its interaction with the PTM enzymes.

  8. Credibility Analysis of Putative Disease-Causing Genes Using Bioinformatics

    PubMed Central

    Abel, Olubunmi; Powell, John F.; Andersen, Peter M.; Al-Chalabi, Ammar

    2013-01-01

    Background Genetic studies are challenging in many complex diseases, particularly those with limited diagnostic certainty, low prevalence or of old age. The result is that genes may be reported as disease-causing with varying levels of evidence, and in some cases, the data may be so limited as to be indistinguishable from chance findings. When there are large numbers of such genes, an objective method for ranking the evidence is useful. Using the neurodegenerative and complex disease amyotrophic lateral sclerosis (ALS) as a model, and the disease-specific database ALSoD, the objective is to develop a method using publicly available data to generate a credibility score for putative disease-causing genes. Methods Genes with at least one publication suggesting involvement in adult onset familial ALS were collated following an exhaustive literature search. SQL was used to generate a score by extracting information from the publications and combined with a pathogenicity analysis using bioinformatics tools. The resulting score allowed us to rank genes in order of credibility. To validate the method, we compared the objective ranking with a rank generated by ALS genetics experts. Spearman's Rho was used to compare rankings generated by the different methods. Results The automated method ranked ALS genes in the following order: SOD1, TARDBP, FUS, ANG, SPG11, NEFH, OPTN, ALS2, SETX, FIG4, VAPB, DCTN1, TAF15, VCP, DAO. This compared very well to the ranking of ALS genetics experts, with Spearman's Rho of 0.69 (P = 0.009). Conclusion We have presented an automated method for scoring the level of evidence for a gene being disease-causing. In developing the method we have used the model disease ALS, but it could equally be applied to any disease in which there is genotypic uncertainty. PMID:23755159
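    The rank comparison described above can be illustrated with a minimal sketch of Spearman's Rho for two tie-free rankings. The automated order below reuses the first five genes listed in the abstract; the second ordering is a hypothetical example, not the experts' actual ranking, and the function name is an assumption.

```python
def spearman_rho(order_a, order_b):
    """Spearman's Rho for two orderings of the same items, without ties.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between an item's rank in the two orderings.
    """
    n = len(order_a)
    if n < 2 or len(order_b) != n or len(set(order_a)) != n:
        raise ValueError("need two tie-free orderings of at least two items")
    if set(order_a) != set(order_b):
        raise ValueError("both orderings must contain the same items")
    rank_b = {item: rank for rank, item in enumerate(order_b, start=1)}
    d_squared = sum(
        (rank - rank_b[item]) ** 2
        for rank, item in enumerate(order_a, start=1)
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# First five genes of the automated ranking reported in the abstract.
automated = ["SOD1", "TARDBP", "FUS", "ANG", "SPG11"]
# Hypothetical second ranking, for illustration only.
hypothetical_expert = ["SOD1", "FUS", "TARDBP", "ANG", "SPG11"]

print(spearman_rho(automated, hypothetical_expert))  # 0.9
```

    A value near 1 means the two rankings largely agree; the tie-free formula used here would need to be replaced by a rank-correlation computation that handles ties if genes can share a score.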

  9. Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads.

    PubMed

    Moncunill, Valentí; Gonzalez, Santi; Beà, Sílvia; Andrieux, Lise O; Salaverria, Itziar; Royo, Cristina; Martinez, Laura; Puiggròs, Montserrat; Segura-Wang, Maia; Stütz, Adrian M; Navarro, Alba; Royo, Romina; Gelpí, Josep L; Gut, Ivo G; López-Otín, Carlos; Orozco, Modesto; Korbel, Jan O; Campo, Elias; Puente, Xose S; Torrents, David

    2014-11-01

    The development of high-throughput sequencing technologies has advanced our understanding of cancer. However, characterizing somatic structural variants in tumor genomes is still challenging because current strategies depend on the initial alignment of reads to a reference genome. Here, we describe SMUFIN (somatic mutation finder), a single program that directly compares sequence reads from normal and tumor genomes to accurately identify and characterize a range of somatic sequence variation, from single-nucleotide variants (SNV) to large structural variants at base pair resolution. Performance tests on modeled tumor genomes showed average sensitivity of 92% and 74% for SNVs and structural variants, with specificities of 95% and 91%, respectively. Analyses of aggressive forms of solid and hematological tumors revealed that SMUFIN identifies breakpoints associated with chromothripsis and chromoplexy with high specificity. SMUFIN provides an integrated solution for the accurate, fast and comprehensive characterization of somatic sequence variation in cancer. PMID:25344728
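    For readers unfamiliar with the performance measures quoted above, the sketch below computes sensitivity and specificity from a set of true variants and a set of called variants. The abstract does not give the exact definitions or evaluation procedure used for SMUFIN, so the textbook definitions, the function names, and the toy sets here are all assumptions.

```python
def confusion_counts(truth, calls, candidates):
    """Count TP, FN, FP, TN for called variants against a truth set.

    `candidates` is the full set of sites that were evaluated.
    """
    tp = len(truth & calls)
    fn = len(truth - calls)
    fp = len(calls - truth)
    tn = len(candidates - truth - calls)
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    # Fraction of true variants that were detected.
    return tp / (tp + fn)


def specificity(tn, fp):
    # Fraction of non-variant sites that were correctly left uncalled.
    return tn / (tn + fp)


# Hypothetical toy data: 20 evaluated sites, 5 true variants, 5 calls.
candidates = set(range(20))
truth = {1, 2, 3, 4, 5}
calls = {2, 3, 4, 5, 9}

tp, fn, fp, tn = confusion_counts(truth, calls, candidates)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", round(specificity(tn, fp), 3))
```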

  10. Comparing the factor structure of the Wisconsin Schizotypy Scales and the Schizotypal Personality Questionnaire.

    PubMed

    Gross, Georgina M; Mellin, Juliann; Silvia, Paul J; Barrantes-Vidal, Neus; Kwapil, Thomas R

    2014-10-01

    Schizotypy is a multidimensional construct that captures the expression of schizophrenic symptoms and impairment from subclinical levels to full-blown psychosis. The present study examined the comparability of the factor structure of 2 leading psychometric measures of schizotypy: the Wisconsin Schizotypy Scales (WSS) and the Schizotypal Personality Questionnaire (SPQ). Both the SPQ and WSS purportedly capture the multidimensional structure of schizotypy; however, whether they are measuring comparable factors has not been empirically demonstrated. This study provided support for a 2-factor model with positive and negative factors underlying the WSS; however, contrary to previous findings, the best fit for the SPQ was for a 4-factor model using confirmatory factor analysis, and a 2-factor model using exploratory factor analysis. The WSS factors were relatively distinct, whereas those underlying the SPQ showed high overlap. The WSS positive and SPQ cognitive-perceptual factors appeared to tap comparable constructs. However, the WSS negative and SPQ interpersonal factors appeared to tap somewhat different constructs based on their correlation and their patterns of associations with other schizotypy dimensions and the Five-Factor Model, suggesting that the SPQ interpersonal factor may not adequately tap negative or deficit schizotypy. Although the SPQ offers the advantage over the WSS of having a disorganization factor, it is not clear that this SPQ factor is actually distinct from positive schizotypy. Existing measures should be used with caution and new measures based on a priori theories are necessary to further understand the factor structure of schizotypy.

  12. Comparative 3D genome structure analysis of the fission and the budding yeast.

    PubMed

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. Remarkably, despite those differences, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species.

  13. ModBase, a database of annotated comparative protein structure models, and associated resources.

    PubMed

    Pieper, Ursula; Webb, Benjamin M; Barkan, David T; Schneidman-Duhovny, Dina; Schlessinger, Avner; Braberg, Hannes; Yang, Zheng; Meng, Elaine C; Pettersen, Eric F; Huang, Conrad C; Datta, Ruchira S; Sampathkumar, Parthasarathy; Madhusudhan, Mallur S; Sjölander, Kimmen; Ferrin, Thomas E; Burley, Stephen K; Sali, Andrej

    2011-01-01

    ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains 10,355,444 reliable models for domains in 2,421,920 unique protein sequences. ModBase allows users to update comparative models on demand, and request modeling of additional sequences through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are available through the ModBase interface as well as the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the SALIGN server for multiple sequence and structure alignment (http://salilab.org/salign), the ModEval server for predicting the accuracy of protein structure models (http://salilab.org/modeval), the PCSS server for predicting which peptides bind to a given protein (http://salilab.org/pcss) and the FoXS server for calculating and fitting Small Angle X-ray Scattering profiles (http://salilab.org/foxs). PMID:21097780

  14. Population genetic structure of economically important Tortricidae (Lepidoptera) in South Africa: a comparative analysis.

    PubMed

    Timm, A E; Geertsema, H; Warnich, L

    2010-08-01

    Comparative studies of the population genetic structures of agricultural pests can elucidate the factors by which their population levels are affected, which is useful for designing pest management programs. This approach was used to provide insight into the six Tortricidae of major economic importance in South Africa. The population genetic structure of the carnation worm E. acerbella and the false codling moth T. leucotreta, analyzed using amplified fragment length polymorphism (AFLP) analysis, is presented here for the first time. These results were compared with those obtained previously for the codling moth Cydia pomonella, the oriental fruit moth Grapholita molesta, the litchi moth Cryptophlebia peltastica and the macadamia nut borer T. batrachopa. Locally adapted populations were detected over local geographic areas for all species. No significant differences were found among population genetic structures as a result of population history (whether native or introduced), although host range (whether oligophagous or polyphagous) had a small but significant effect. It is concluded that factors such as dispersal ability and agricultural practices have the most important effects on genetically structuring populations of the economically important Tortricidae in South Africa. PMID:19941674

  15. Vignettes: diverse library staff offering diverse bioinformatics services*

    PubMed Central

    Osterbur, David L.; Alpi, Kristine; Canevari, Catharine; Corley, Pamela M.; Devare, Medha; Gaedeke, Nicola; Jacobs, Donna K.; Kirlew, Peter; Ohles, Janet A.; Vaughan, K.T.L.; Wang, Lili; Wu, Yongchun; Geer, Renata C.

    2006-01-01

    Objectives: The paper gives examples of the bioinformatics services provided in a variety of different libraries by librarians with a broad range of educational background and training. Methods: Two investigators sent an email inquiry to attendees of the “National Center for Biotechnology Information's (NCBI) Introduction to Molecular Biology Information Resources” or “NCBI Advanced Workshop for Bioinformatics Information Specialists (NAWBIS)” courses. The thirty-five-item questionnaire addressed areas such as educational background, library setting, types and numbers of users served, and bioinformatics training and support services provided. Answers were compiled into program vignettes. Discussion: The bioinformatics support services addressed in the paper are based in libraries with academic and clinical settings. Services have been established through different means: in collaboration with biology faculty as part of formal courses, through teaching workshops in the library, through one-on-one consultations, and by other methods. Librarians with backgrounds from art history to doctoral degrees in genetics have worked to establish these programs. Conclusion: Successful bioinformatics support programs can be established in libraries in a variety of different settings and by staff with a variety of different backgrounds and approaches. PMID:16888664

  16. An agent-based multilayer architecture for bioinformatics grids.

    PubMed

    Bartocci, Ezio; Cacciagrano, Diletta; Cannata, Nicola; Corradini, Flavio; Merelli, Emanuela; Milanesi, Luciano; Romano, Paolo

    2007-06-01

    Due to the huge volume and complexity of biological data available today, a fundamental component of biomedical research is now in silico analysis. This includes modelling and simulation of biological systems and processes, as well as automated bioinformatics analysis of high-throughput data. The quest for bioinformatics resources (including databases, tools, and knowledge) therefore becomes extremely important. Bioinformatics itself is in rapid evolution, and dedicated Grid cyberinfrastructures already offer easier access to and sharing of resources. Furthermore, the concept of the Grid is progressively interleaving with those of Web Services, semantics, and software agents. Agent-based systems can play a key role in learning, planning, interaction, and coordination. Agents also constitute a natural paradigm for engineering simulations of complex systems such as molecular ones. We present here an agent-based, multilayer architecture for bioinformatics Grids. It is intended to support both the execution of complex in silico experiments and the simulation of biological systems. In the architecture a pivotal role is assigned to an "alive" semantic index of resources, which is also expected to facilitate users' awareness of the bioinformatics domain.

  17. Derivation of rules for comparative protein modeling from a database of protein structure alignments.

    PubMed Central

    Sali, A.; Overington, J. P.

    1994-01-01

    We describe a database of protein structure alignments as well as methods and tools that use this database to improve comparative protein modeling. The current version of the database contains 105 alignments of similar proteins or protein segments. The database comprises 416 entries, 78,495 residues, 1,233 equivalent entry pairs, and 230,396 pairs of equivalent alignment positions. At present, the main application of the database is to improve comparative modeling by satisfaction of spatial restraints implemented in the program MODELLER (Sali A, Blundell TL, 1993, J Mol Biol 234:779-815). To illustrate the usefulness of the database, the restraints on the conformation of a disulfide bridge provided by an equivalent disulfide bridge in a related structure are derived from the alignments; the prediction success of the disulfide dihedral angle classes is increased to approximately 80%, compared to approximately 55% for modeling that relies on the stereochemistry of disulfide bridges alone. The second example of the use of the database is the derivation of the probability density function for comparative modeling of the cis/trans isomerism of the proline residues; the prediction success is increased from 0% to 82.9% for cis-proline and from 93.3% to 96.2% for trans-proline. The database is available via electronic mail. PMID:7833817

  19. Effect of vegetation structure on subcanopy solar radiation: a comparative study

    NASA Astrophysics Data System (ADS)

    Anand, A.; Dubayah, R.; Hofton, M. A.

    2012-12-01

    The vertical structure of the vegetation canopy influences the spatial variability of the radiation regime under forest canopies. Transmittance profiles and the subcanopy radiation regime are compared for two structurally different forest sites, based on ray tracing and principles of radiative transfer using Lidar data. Medium-footprint waveform Lidar data from the Laser Vegetation Imaging Sensor (LVIS) were collected from the sites in Sierra National Forest (SNF), California and Smithsonian Environmental Research Center (SERC), Maryland in 2008 and 2003, respectively. Sites in both forest areas have varying vegetation structure, with SNF sites representing mixed conifers whereas the sites in SERC represent eastern broadleaf trees. The Lidar waveform is processed to derive canopy gap probability as a function of height, which is used to derive transmittance profiles and solar radiation as a function of canopy height using a 3-D light transmittance model. Geostatistics is applied to compare how the vertical and horizontal distribution of solar radiation under the canopy varies with vertical canopy structure attributes such as foliage density, canopy cover and canopy height. This comparison is expected to increase knowledge of the effects of vegetation structure on the radiation regime under forest canopies.

  20. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    PubMed

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc. PMID:27353064

  1. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    PubMed

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that they may encode

  2. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis

    PubMed Central

    Noar, Roslyn D.; Daub, Margaret E.

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that they may encode

  3. Comparative evaluation of structured oil systems: Shellac oleogel, HPMC oleogel, and HIPE gel

    PubMed Central

    Patel, Ashok R; Dewettinck, Koen

    2015-01-01

    In lipid-based food products, fat crystals are used as building blocks for creating a crystalline network that can trap liquid oil into a 3D gel-like structure which in turn is responsible for the desirable mouth feel and texture properties of the food products. However, the recent ban on the use of trans-fat in the US, coupled with the increasing concerns about the negative health effects of saturated fat consumption, has resulted in an increased interest in the area of identifying alternative ways of structuring edible oils using non-fat-based building blocks. In this paper, we give a brief account of three alternative approaches where oil structuring was carried out using wax crystals (shellac), polymer strands (hydrophilic cellulose derivative), and emulsion droplets as structurants. These building blocks resulted in three different types of oleogels that showed distinct rheological properties and temperature functionalities. The three approaches are compared in terms of the preparation process (ease of processing), properties of the formed systems (microstructure, rheological gel strength, temperature response, effect of water incorporation, and thixotropic recovery), functionality, and associated limitations of the structured systems. The comparative evaluation is made such that new researchers starting their work in the area of oil structuring can use this discussion as a general guideline. Practical applications: Various aspects of oil binding for three different building blocks were studied in this work. The practical significance of this study includes (i) information on the preparation process and the concentrations of structuring agents required for efficient gelation and (ii) information on the response of oleogels to temperature, applied shear, and presence of water. This information can be very useful for selecting the type of structuring agents keeping the final applications in mind. For detailed information on the actual edible applications

  4. Developing expertise in bioinformatics for biomedical research in Africa

    PubMed Central

    Karikari, Thomas K.; Quansah, Emmanuel; Mohamed, Wael M.Y.

    2015-01-01

    Research in bioinformatics has a central role in helping to advance biomedical research. However, its introduction to Africa has been met with some challenges (such as inadequate infrastructure, training opportunities, research funding, human resources, biorepositories and databases) that have contributed to the slow pace of development in this field across the continent. Fortunately, recent improvements in areas such as research funding, infrastructural support and capacity building are helping to develop bioinformatics into an important discipline in Africa. These contributions are leading to the establishment of world-class research facilities, biorepositories, training programmes, scientific networks and funding schemes to improve studies into disease and health in Africa. With increased contribution from all stakeholders, these developments could be further enhanced. Here, we discuss how the recent developments are contributing to the advancement of bioinformatics in Africa. PMID:26767162

  5. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    PubMed

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data are becoming available. Knowledge of the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example of the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied to the identification of cis-sequences in any set of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771
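
    The idea of an "overrepresented motif" in the workflow above can be sketched in a few lines. The chapter's actual software package is not named in this abstract, so what follows is a generic stand-in: it counts, for each k-mer, how many promoters in a coregulated (foreground) set contain it, and compares that fraction with a background set. The function names, the pseudocount, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        # A set, so a k-mer repeated within one promoter is counted once.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enrichment(foreground, background, k, pseudo=1.0):
    """Ratio of foreground to background presence fractions per k-mer.

    The pseudocount keeps the ratio finite when a k-mer is absent
    from the background set.
    """
    fg = kmer_presence(foreground, k)
    bg = kmer_presence(background, k)
    scores = {}
    for kmer, n_fg in fg.items():
        fg_frac = (n_fg + pseudo) / (len(foreground) + pseudo)
        bg_frac = (bg.get(kmer, 0) + pseudo) / (len(background) + pseudo)
        scores[kmer] = fg_frac / bg_frac
    return scores


# Toy promoter fragments (hypothetical, not real Arabidopsis sequences).
coregulated = ["TTACGTGGA", "CCACGTGTT", "GACGTGAAA"]
background = ["TTTTAAAAC", "GGGCCCTTT", "ATATATATA"]

scores = enrichment(coregulated, background, k=5)
best = max(scores, key=scores.get)
print(best, scores[best])
```

    A real analysis would use full promoter regions, a genome-wide background, and a statistical test with multiple-testing correction rather than a raw ratio.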

  6. Embracing the Future: Bioinformatics for High School Women

    NASA Astrophysics Data System (ADS)

    Zales, Charlotte Rappe; Cronin, Susan J.

    Sixteen high school women participated in a 5-week residential summer program designed to encourage female and minority students to choose careers in scientific fields. Students gained expertise in bioinformatics through problem-based learning in a complex learning environment of content instruction, speakers, labs, and trips. Innovative hands-on activities filled the program. Students learned biological principles in context and sophisticated bioinformatics tools for processing data. Students additionally mastered a variety of information-searching techniques. Students completed creative individual and group projects, demonstrating the successful integration of biology, information technology, and bioinformatics. Discussions with female scientists allowed students to see themselves in similar roles. Summer residential aspects fostered an atmosphere in which students matured in interacting with others and in their views of diversity.

  7. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    PubMed

    Attwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy; paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  8. Overview of commonly used bioinformatics methods and their applications.

    PubMed

    Kapetanovic, Izet M; Rosenfeld, Simon; Izmirlian, Grant

    2004-05-01

    Bioinformatics, in its broad sense, involves the application of computer processes to solve biological problems. A wide range of computational tools is needed to effectively and efficiently process the large amounts of data being generated as a result of recent technological innovations in biology and medicine. A number of computational tools have been developed or adapted to deal with the experimental riches of complex and multivariate data and the transition from data collection to information or knowledge. These include a wide variety of clustering and classification algorithms, including self-organized maps (SOM), artificial neural networks (ANN), support vector machines (SVM), fuzzy logic, and even hyphenated techniques such as neuro-fuzzy networks. These bioinformatics tools are being evaluated and applied in various medical areas including early detection, risk assessment, classification, and prognosis of cancer. The goal of these efforts is to develop and identify bioinformatics methods with optimal sensitivity, specificity, and predictive capabilities.

  9. An Analytic Comparison of Educational Systems: Overview of Purposes, Policies, Structures and Outcomes. Comparative Overview/Comparative Assessment.

    ERIC Educational Resources Information Center

    Hurn, Christopher J.; Burn, Barbara B.

    This comparative evaluation of the differing educational systems in North America, Europe, the USSR, and Japan examines the goals and values of these systems. It is pointed out that Americans value equality, practicality, and utility and that they are both individualistic and suspicious of government authority. Contrasts between these values and…

  10. Sequence database versioning for command line and Galaxy bioinformatics servers

    PubMed Central

    Dooley, Damion M.; Petkau, Aaron J.; Van Domselaar, Gary; Hsiao, William W.L.

    2016-01-01

    Motivation: There are various reasons for rerunning bioinformatics tools and pipelines on sequencing data, including reproducing a past result, validation of a new tool or workflow using a known dataset, or tracking the impact of database changes. For identical results to be achieved, regularly updated reference sequence databases must be versioned and archived. Database administrators have tried to fill the requirements by supplying users with one-off versions of databases, but these are time consuming to set up and are inconsistent across resources. Disk storage and data backup performance has also discouraged maintaining multiple versions of databases since databases such as NCBI nr can consume 50 Gb or more disk space per version, with growth rates that parallel Moore's law. Results: Our end-to-end solution combines our own Kipper software package—a simple key-value large file versioning system—with BioMAJ (software for downloading sequence databases), and Galaxy (a web-based bioinformatics data processing platform). Available versions of databases can be recalled and used by command-line and Galaxy users. The Kipper data store format makes publishing curated FASTA databases convenient since in most cases it can store a range of versions into a file marginally larger than the size of the latest version. Availability and implementation: Kipper v1.0.0 and the Galaxy Versioned Data tool are written in Python and released as free and open source software available at https://github.com/Public-Health-Bioinformatics/kipper and https://github.com/Public-Health-Bioinformatics/versioned_data, respectively; detailed setup instructions can be found at https://github.com/Public-Health-Bioinformatics/versioned_data/blob/master/doc/setup.md Contact: Damion.Dooley@Bccdc.Ca or William.Hsiao@Bccdc.Ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26656932
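
    The claim that many versions can be stored in a file only marginally larger than the latest version follows from recording each key-value pair once, together with the range of versions in which it was present. The sketch below illustrates that general idea only; it is not Kipper's file format or API, and the class and method names (VersionedStore, commit, checkout) are hypothetical.

```python
class VersionedStore:
    """Toy key-value store that keeps every committed version recallable.

    Each (key, value) pair is stored once with the version spans in which
    it was live, so unchanged records cost nothing extra per new version.
    Values must not be None in this sketch.
    """

    def __init__(self):
        self._rows = {}   # (key, value) -> list of [created, deleted_or_None]
        self.latest = 0

    def commit(self, records):
        """Record a full snapshot {key: value} as the next version number."""
        self.latest += 1
        version = self.latest
        # Close the open span of any pair that changed or disappeared.
        for (key, value), spans in self._rows.items():
            if spans[-1][1] is None and records.get(key) != value:
                spans[-1][1] = version
        # Open a span for any pair that is new (or has reappeared).
        for key, value in records.items():
            spans = self._rows.setdefault((key, value), [])
            if not spans or spans[-1][1] is not None:
                spans.append([version, None])
        return version

    def checkout(self, version):
        """Rebuild the snapshot that was committed as `version`."""
        snapshot = {}
        for (key, value), spans in self._rows.items():
            for created, deleted in spans:
                if created <= version and (deleted is None or version < deleted):
                    snapshot[key] = value
        return snapshot


# Hypothetical FASTA-like records keyed by sequence identifier.
store = VersionedStore()
v1 = store.commit({"seq1": "ACGT", "seq2": "GGCC"})
v2 = store.commit({"seq1": "ACGA", "seq3": "TTAA"})
print(store.checkout(v1))
print(store.checkout(v2))
```

    Because only changed records add rows, a reference database that changes slowly between releases grows far more slowly than it would if each release were archived in full.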

  11. A comparative study of current and magnetic structures of Weibel and filamentation instabilities

    NASA Astrophysics Data System (ADS)

    Huynh, Cong Tuan; Ryu, Chang-Mo

    2014-09-01

    A comparative study of the Weibel instability (WI) driven by anisotropic temperature and the filamentation instability (FI) driven by counterstreaming plasmas is made using a 2D particle-in-cell code. Under comparable initial conditions, the linear growth rates of the WI and the FI are almost the same, as the theory predicts, but in the nonlinear phase, the maximum and nonlinearly saturated magnetic fields generated by the WI are always smaller than those generated by the FI. It is noted that in the initial linear growth phase, the WI and the FI both have center-filled currents, but in the nonlinear phase, the WI and the FI develop different types of current structures such that the WI maintains a center-filled current structure, whereas the FI develops a hollow current structure. Significant particle acceleration around the drift velocity is observed for the FI, whereas it is almost absent in the WI, which indicates that the enhanced velocity of the electrons due to particle acceleration is related to the hollow current production in the FI.

  12. Comparability of a three-dimensional structure in biopharmaceuticals using spectroscopic methods.

    PubMed

    Pérez Medina Martínez, Víctor; Abad-Javier, Mario E; Romero-Díaz, Alexis J; Villaseñor-Ortega, Francisco; Pérez, Néstor O; Flores-Ortiz, Luis F; Medina-Rivero, Emilio

    2014-01-01

    Protein structure depends on weak interactions and covalent bonds, like disulfide bridges, established according to the environmental conditions. Here, we present the validation of two spectroscopic methodologies for the measurement of free and unoxidized thiols, as an attribute of structural integrity, using 5,5'-dithionitrobenzoic acid (DTNB) and DyLight Maleimide (DLM) as derivatizing agents. These methods were used to compare Rituximab and Etanercept products from different manufacturers. Physicochemical comparability was demonstrated for Rituximab products as DTNB showed no statistical differences under native, denaturing, and denaturing-reducing conditions, with Student's t-test P values of 0.6233, 0.4022, and 0.1475, respectively. While for Etanercept products no statistical differences were observed under native (P = 0.0758) and denaturing conditions (P = 0.2450), denaturing-reducing conditions revealed cysteine contents of 98% and 101%, towards the theoretical value of 58, for the evaluated products from different Etanercept manufacturers. DLM supported equality between Rituximab products under native (P = 0.7499) and denaturing conditions (P = 0.8027), but showed statistical differences among Etanercept products under native conditions (P < 0.001). DLM suggested that Infinitam has fewer exposed thiols than Enbrel, although DTNB method, circular dichroism (CD), fluorescence (TCSPC), and activity (TNF α neutralization) showed no differences. Overall, this data revealed the capabilities and drawbacks of each thiol quantification technique and their correlation with protein structure.

  13. Comparability of a Three-Dimensional Structure in Biopharmaceuticals Using Spectroscopic Methods

    PubMed Central

    Abad-Javier, Mario E.; Romero-Díaz, Alexis J.; Villaseñor-Ortega, Francisco; Pérez, Néstor O.; Flores-Ortiz, Luis F.

    2014-01-01

    Protein structure depends on weak interactions and covalent bonds, like disulfide bridges, established according to the environmental conditions. Here, we present the validation of two spectroscopic methodologies for the measurement of free and unoxidized thiols, as an attribute of structural integrity, using 5,5′-dithionitrobenzoic acid (DTNB) and DyLight Maleimide (DLM) as derivatizing agents. These methods were used to compare Rituximab and Etanercept products from different manufacturers. Physicochemical comparability was demonstrated for Rituximab products as DTNB showed no statistical differences under native, denaturing, and denaturing-reducing conditions, with Student's t-test P values of 0.6233, 0.4022, and 0.1475, respectively. While for Etanercept products no statistical differences were observed under native (P = 0.0758) and denaturing conditions (P = 0.2450), denaturing-reducing conditions revealed cysteine contents of 98% and 101%, towards the theoretical value of 58, for the evaluated products from different Etanercept manufacturers. DLM supported equality between Rituximab products under native (P = 0.7499) and denaturing conditions (P = 0.8027), but showed statistical differences among Etanercept products under native conditions (P < 0.001). DLM suggested that Infinitam has fewer exposed thiols than Enbrel, although DTNB method, circular dichroism (CD), fluorescence (TCSPC), and activity (TNFα neutralization) showed no differences. Overall, this data revealed the capabilities and drawbacks of each thiol quantification technique and their correlation with protein structure. PMID:24963443

  14. Zebra: a web server for bioinformatic analysis of diverse protein families.

    PubMed

    Suplatov, Dmitry; Kirilin, Evgeny; Takhaveev, Vakil; Svedas, Vytas

    2014-01-01

    During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology - understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in the common laboratory practice to study the structure-function relationship in proteins and develop novel protein engineering strategies. This paper describes Zebra web server - a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and analyzed studying the structure-function relationship. Zebra results are provided in two ways - (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra .
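
    The definition of subfamily-specific positions given above (conserved within functional subfamilies, different between them) can be illustrated with a small sketch. It is not the Zebra algorithm, which according to the abstract relies on statistical analysis to rank positions; the function name, the strict all-or-nothing conservation criterion, and the toy alignment are all assumptions made for illustration.

```python
from typing import Dict, List


def subfamily_specific_positions(subfamilies: Dict[str, List[str]]) -> List[int]:
    """Return 0-based alignment columns that are strictly conserved inside each
    subfamily but carry different residues in at least two subfamilies."""
    lengths = {len(seq) for seqs in subfamilies.values() for seq in seqs}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    (length,) = lengths
    positions = []
    for col in range(length):
        # Set of residues observed in this column, one set per subfamily.
        per_family = [{seq[col] for seq in seqs} for seqs in subfamilies.values()]
        if all(len(residues) == 1 for residues in per_family):
            consensus = {next(iter(residues)) for residues in per_family}
            if len(consensus) > 1:
                positions.append(col)
    return positions


# Toy alignment (hypothetical sequences, not taken from any real protein family).
toy = {
    "subfamily_A": ["MKSAD", "MRSAD", "MKSAD"],
    "subfamily_B": ["MKTAE", "MRTAE", "MKTAE"],
}
print(subfamily_specific_positions(toy))  # prints [2, 4]
```

    Column 0 is conserved across the whole family and column 1 varies inside each subfamily, so neither qualifies; columns 2 and 4 are the only ones that separate the two subfamilies.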

  15. Bioinformatic scaling of allosteric interactions in biomedical isozymes

    NASA Astrophysics Data System (ADS)

    Phillips, J. C.

    2016-09-01

    Allosteric (long-range) interactions can be surprisingly strong in proteins of biomedical interest. Here we use bioinformatic scaling to connect prior results on nonsteroidal anti-inflammatory drugs to promising new drugs that inhibit cancer cell metabolism. Many parallel features are apparent, which explain how even one amino acid mutation, remote from active sites, can alter medical results. The enzyme twins involved are cyclooxygenase (aspirin) and isocitrate dehydrogenase (IDH). The IDH results are accurate to 1% and are overdetermined by adjusting a single bioinformatic scaling parameter. It appears that the final stage in optimizing protein functionality may involve leveling of the hydrophobic limits of the arms of conformational hydrophilic hinges.

  16. Genomics and bioinformatics resources for translational science in Rosaceae.

    PubMed

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  17. Comparative metabolomics and structural characterizations illuminate colibactin pathway-dependent small molecules.

    PubMed

    Vizcaino, Maria I; Engel, Philipp; Trautman, Eric; Crawford, Jason M

    2014-07-01

    The gene cluster responsible for synthesis of the unknown molecule "colibactin" has been identified in mutualistic and pathogenic Escherichia coli. The pathway endows its producer with a long-term persistence phenotype in the human bowel, a probiotic activity used in the treatment of ulcerative colitis, and a carcinogenic activity under host inflammatory conditions. To date, functional small molecules from this pathway have not been reported. Here we implemented a comparative metabolomics and targeted structural network analyses approach to identify a catalog of small molecules dependent on the colibactin pathway from the meningitis isolate E. coli IHE3034 and the probiotic E. coli Nissle 1917. The structures of 10 pathway-dependent small molecules are proposed based on structural characterizations and network relationships. The network will provide a roadmap for the structural and functional elucidation of a variety of other small molecules encoded by the pathway. From the characterized small molecule set, in vitro bacterial growth inhibitory and mammalian CNS receptor antagonist activities are presented. PMID:24932672

  18. Comparative analysis of the friction stir welded aluminum-magnesium alloy joint grain structure

    NASA Astrophysics Data System (ADS)

    Zaikina, A. A.; Sizova, O. V.; Novitskaya, O. S.

    2015-10-01

    A comparative test of the friction stir welded aluminum-magnesium alloy joint microstructure for plates of different thickness was carried out. By finding the structuring regularities in the weld nugget zone, which is the strongest zone of the weld, the effects of temperature-deformation conditions on the metal structure refinement mechanism under friction stir welding can be determined. In this research, friction stir welded rolled plates of an AMg5M alloy, 5 and 8 mm thick, were investigated. Fine-structure images of the nugget zone were used to identify and measure subgrains and to locate the second phase. Optical microscopy showed that a fine-grained structure developed in the nugget zone. The grain size was 5 μm regardless of the thickness of the plates. In the 5.0 mm thick sample the grains were coaxial, while in the 8.0 mm thick sample the grains were elongated at a certain angle to the tool travel direction.

  19. A comparative structure-function analysis of active-site inhibitors of Vibrio cholerae cholix toxin.

    PubMed

    Lugo, Miguel R; Merrill, A Rod

    2015-09-01

    Cholix toxin from Vibrio cholerae is a novel mono-ADP-ribosyltransferase (mART) toxin that shares structural and functional properties with Pseudomonas aeruginosa exotoxin A and Corynebacterium diphtheriae diphtheria toxin. Herein, we have used the high-resolution X-ray structure of full-length cholix toxin in the apo form, NAD(+) bound, and 10 structures of the cholix catalytic domain (C-domain) complexed with several strong inhibitors of toxin enzyme activity (NAP, PJ34, and the P-series) to study the binding mode of the ligands. A pharmacophore model based on the active pose of NAD(+) was compared with the active conformation of the inhibitors, which revealed a cationic feature in the side chain of the inhibitors that may determine the active pose. Moreover, a conformational search was conducted for the missing coordinates of one of the main active-site loops (R-loop). The resulting structural models were used to evaluate the interaction energies and for 3D-QSAR modeling. Implications for a rational drug design approach for mART toxins were derived.

  20. Structural violence in long-term, residential care for older people: comparing Canada and Scandinavia.

    PubMed

    Banerjee, Albert; Daly, Tamara; Armstrong, Pat; Szebehely, Marta; Armstrong, Hugh; Lafrance, Stirling

    2012-02-01

    Canadian frontline careworkers are six times more likely to experience daily physical violence than their Scandinavian counterparts. This paper draws on a comparative survey of residential careworkers serving older people across three Canadian provinces (Manitoba, Nova Scotia, Ontario) and four countries that follow a Scandinavian model of social care (Denmark, Finland, Norway, Sweden) conducted between 2005 and 2006. Ninety percent of Canadian frontline careworkers experienced physical violence from residents or their relatives and 43 percent reported physical violence on a daily basis. Canadian focus groups conducted in 2007 reveal violence was often normalized as an inevitable part of elder-care. We use the concept of "structural violence" (Galtung, 1969) to raise questions about the role that systemic and organizational factors play in setting the context for violence. Structural violence refers to indirect forms of violence that are built into social structures and that prevent people from meeting their basic needs or fulfilling their potential. We applied the concept to long-term residential care and found that the poor quality of the working conditions and inadequate levels of support experienced by Canadian careworkers constitute a form of structural violence. Working conditions are detrimental to careworkers' physical and mental health, and prevent careworkers from providing the quality of care they are capable of providing and understand to be part of their job. These conditions may also contribute to the physical violence workers experience, and further investigation is warranted. PMID:22204839

  1. Comparing posttraumatic stress disorder's symptom structure between deployed and nondeployed veterans.

    PubMed

    Engdahl, Ryan M; Elhai, Jon D; Richardson, J Don; Frueh, B Christopher

    2011-03-01

    We tested two empirically validated 4-factor models of posttraumatic stress disorder (PTSD) symptoms using the PTSD Checklist: King, Leskin, King, and Weathers' (1998) model including reexperiencing, avoidance, emotional numbing, and hyperarousal factors, and Simms, Watson, and Doebbeling's (2002) model including reexperiencing, avoidance, dysphoria, and hyperarousal. Our aim was to determine which fit better in two groups of military veterans: peacekeepers previously deployed to a war zone (deployed group) and those trained for peacekeeping operations who were not deployed (nondeployed group). We compared the groups using multigroup confirmatory factor analysis. Adequate model fit was demonstrated among the nondeployed group, with no significant difference between King et al.'s (1998) model (separating avoidance and numbing) and Simms et al.'s (2002) similar model involving a dysphoria factor. A better fitting factor structure consistent with Simms et al.'s (2002) model was found in the deployed group. Comprehensive measurement invariance testing demonstrated significant differences between the deployed and nondeployed groups on all structural parameters, except observed variable intercepts (thus indicating similarities only in PTSD item severity). These findings add to researchers' understanding of PTSD's factor structure, given the revision of PTSD that will appear in the forthcoming 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 2010)--namely, that the factor structure may be quite different between groups with and without exposure to major traumatic events. PMID:21171785

  2. Comparing the organisational structure of the preoperative assessment clinic at eight university hospitals.

    PubMed

    Edward, G M; Biervliet, J D; Hollmann, M W; Schlack, W S; Preckel, B

    2008-01-01

    The preoperative assessment clinic (PAC) has been implemented in most major hospitals. However, there is no uniformity in the way PACs are organised. We compared the organisational structure of the PACs from all eight university hospitals in The Netherlands, looking at the following variables: number of patients visiting the PAC, staffing of the PAC, opening hours, scheduling, and additional preoperative diagnostic testing. The number of patients seen yearly varies from 7,000 to 13,500. In all clinics, the preoperative assessment was performed by anaesthetists and residents. In five PACs, preoperative assessment was also performed by physician assistants or nurse practitioners. Opening hours varied. Consultations are by appointment, 'walk-in', or a combination of these two. In four clinics additional testing is performed at the PAC itself. This study shows that the organisational structure of the PAC at similar university hospitals varies greatly; this can have important implications when designing a benchmarking process.

  3. Comparative analysis of structural transformations of two bituminous coals with different maximum fluidity during carbonization

    SciTech Connect

    Valentina Zubkova; Victor Prezhdo; Andrzej Strojwas

    2007-06-15

    The variation of the volume of two bituminous coals with different maximum fluidity (MF) values has been determined using carbonization tests, and the quality of coke obtained has been examined using scanning electron microscopy (SEM) micrographs. The structural and chemical changes in bituminous coals at the pre-plastic stage during carbonization were studied using X-ray diffraction (XRD) and Fourier transform infrared (FTIR) techniques and compared to the changes in their electric and dielectric parameters. It was observed that the structural and chemical transformations occurred in the disordered phase of both coals in different ways. These differences are attributed to the different redistributions of hydrogen between the radicals generated in the aliphatic and aromatic parts of the macromolecule fragments. 42 refs., 12 figs., 2 tabs.

  4. Comparative studies on the antigenic structures of five subspecies of Oncomelania snails by immunoelectrophoresis.

    PubMed

    Iwanaga, Y

    1997-03-01

    To approach the biochemical relationships of five subspecies of Oncomelania snails, antigenic structures among the subspecies were compared using immunoelectrophoresis. The results obtained are summarized as follows: 1) For five subspecies of Oncomelania hupensis snails (Oncomelania hupensis hupensis, O.h. nosophora, O.h. formosana, O.h. chiui and O.h. quadrasi), 23-24 precipitin bands were observed between the antigens and their homologous antisera, while 18-22 bands were observed in the heterologous reactions. 2) For each subspecies, residual bands observed after the absorption procedure demonstrated the presence of antigens unique to each subspecies except O.h. chiui. Based on the immunological antigenic structures among the Oncomelania subspecies, it is suggested that the O.h. nosophora and O.h. hupensis forms are a closely related group, while the O.h. formosana, O.h. chiui and O.h. quadrasi forms are another group.

  5. Comparative study of two structures of shunt active filter suppressing particular harmonics

    NASA Astrophysics Data System (ADS)

    Benchaita, L.; Salem Nia, A.; Saadate, S.

    1998-07-01

    This paper deals with the study of shunt active filters used for suppressing particular harmonics generated by nonlinear loads in utility distribution power systems. Both structures of shunt active filter, the voltage source active filter (VSAF) and the current source active filter (CSAF), are considered. The analytical study of the identification of specific harmonics in a given spectrum is presented first. For simulation as well as experimentation, the nonlinear load is a conventional three-phase thyristor rectifier, and harmonics 5 and 7 are selected to be eliminated by the active filter. The whole system, consisting of the ac power supply network, the SCR rectifier and the shunt active filter (VSAF/CSAF), is then simulated. The simulation results are discussed and the efficiencies of the two kinds of active filter are compared. Finally, for the first structure, the VSAF, the simulation results are confirmed by experimental tests carried out with a fully digital control active power filter developed in our laboratory.
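
    To make the choice of harmonics 5 and 7 concrete, the sketch below extracts individual harmonics from one period of a load current with a single-bin discrete Fourier transform. It is not the identification method of the paper, which the abstract does not detail. The waveform is an assumption: the textbook idealised line current of a six-pulse rectifier (constant dc current, no commutation overlap), and the function names are hypothetical.

```python
import math
from typing import List


def line_current(theta: float) -> float:
    """Idealised six-pulse rectifier line current (assumed waveform):
    +1 for 120 degrees, 0 for 60, -1 for 120, 0 for 60."""
    deg = math.degrees(theta) % 360.0
    if 30.0 <= deg < 150.0:
        return 1.0
    if 210.0 <= deg < 330.0:
        return -1.0
    return 0.0


def harmonic_amplitude(samples: List[float], h: int) -> float:
    """Amplitude of harmonic h from one full period of samples (single DFT bin)."""
    n = len(samples)
    real = sum(x * math.cos(2.0 * math.pi * h * k / n) for k, x in enumerate(samples))
    imag = sum(x * math.sin(2.0 * math.pi * h * k / n) for k, x in enumerate(samples))
    return 2.0 * math.hypot(real, imag) / n


N = 3600  # samples per fundamental period, taken at interval midpoints
samples = [line_current(2.0 * math.pi * (k + 0.5) / N) for k in range(N)]
fundamental = harmonic_amplitude(samples, 1)
for h in (3, 5, 7, 11):
    ratio = harmonic_amplitude(samples, h) / fundamental
    print(f"harmonic {h}: {ratio:.3f} of the fundamental")
```

    For this idealised waveform the third harmonic vanishes and harmonics 5, 7 and 11 come out close to 1/5, 1/7 and 1/11 of the fundamental, so the 5th and 7th are the largest components an active filter would have to cancel.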

  6. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    PubMed

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R; Domozych, David S; Popper, Zoë A; Showalter, Allan M

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
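
    The two sequence signatures named in this abstract (Ser followed by three to five Pro residues, and Tyr-X-Tyr) lend themselves to a short regular-expression sketch. It is not the pipeline used by the authors; the function name and the toy sequence are hypothetical, and a run of more than five prolines is simply matched on its first five. The last lines check that the per-class counts quoted in the abstract add up to the stated total.

```python
import re
from typing import Dict

# Ser followed by three to five Pro residues, as described in the abstract.
SP_REPEAT = re.compile(r"SP{3,5}")
# Tyr-X-Tyr, where X is any amino acid; the lookahead lets overlapping motifs count.
YXY_MOTIF = re.compile(r"(?=Y.Y)")


def count_motifs(protein: str) -> Dict[str, int]:
    """Count SP(3-5) repeats and Tyr-X-Tyr motifs in a one-letter protein sequence."""
    seq = protein.upper()
    return {
        "SP3-5": len(SP_REPEAT.findall(seq)),
        "YXY": len(YXY_MOTIF.findall(seq)),
    }


# Toy sequence (hypothetical, not a real extensin).
toy = "MASPPPPKYVYSPPPPPTYKYSPPPH"
print(count_motifs(toy))  # prints {'SP3-5': 3, 'YXY': 2}

# Per-class counts of newly identified EXTs as quoted in the abstract.
NEW_EXT_COUNTS = {
    "classical": 87,
    "short": 97,
    "LRX": 61,
    "PERK": 75,
    "FH": 54,
    "long chimeric": 38,
    "other chimeric": 346,
}
assert sum(NEW_EXT_COUNTS.values()) == 758
```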

  7. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    PubMed Central

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  9. [Comparative analysis of the genetic structure of Red Polish cattle in Poland and the Ukraine].

    PubMed

    Oblap, R V; Zvezhkhovski, L; Ivanchenko, E V; Glazko, V I

    2002-01-01

    A comparative analysis of the genetic structure of two groups of Red Polish cattle, bred in Poland and Ukraine, was made. Six molecular-genetic markers (kappa-casein, beta-lactoglobulin, leptin, myostatin, growth hormone, and pituitary-specific transcription factor Pit-I) were tested by PCR-RFLP. No significant differences between the considered intrabreed groups were found. A high frequency of some alleles (Csn kappa B, Blg B, and Gh L) related to important productivity traits was observed. Rare alleles were found in some genes. The obtained results are evidence of the unique characteristics of the investigated breed.

  10. A comparative study of the inner ear structures of artiodactyls and early cetaceans

    SciTech Connect

    Klingshirn, M.A.; Luo, Z.

    1994-12-31

    It has been suggested that the order Cetacea (whales and porpoises) is closely related to artiodactyls, even-hoofed ungulate mammals such as the pig and cow. Paleontological and molecular data strongly support this concept of phylogenetic relationships. In a study of DNA sequences of two mitochondrial ribosomal gene segments of cetaceans, the artiodactyls were found to be the most closely related to cetaceans. These well-accepted studies on the phylogenetic affinities of artiodactyls and cetaceans led us to conduct a comparative study of the bony structure of the inner ear of these two taxa.

  11. Comparing two iteration algorithms of Broyden electron density mixing through an atomic electronic structure computation

    NASA Astrophysics Data System (ADS)

    Man-Hong, Zhang

    2016-05-01

    By performing the electronic structure computation of a Si atom, we compare two iteration algorithms of Broyden electron density mixing from the literature. One was proposed by Johnson and implemented in the well-known VASP code. The other was given by Eyert. We solve the Kohn-Sham equation by a conventional outward/inward integration of the differential equation, connecting the two parts of the solution at the classical turning points, which differs from the matrix eigenvalue method used in the VASP code. Compared to Johnson’s algorithm, the one proposed by Eyert needs fewer total iterations. Project supported by the National Natural Science Foundation of China (Grant No. 61176080).

  12. Globin genes transcriptional switching, chromatin structure and linked lessons to epigenetics in cancer: a comparative overview.

    PubMed

    Guerrero, Georgina; Delgado-Olguín, Paul; Escamilla-Del-Arenal, Martín; Furlan-Magaril, Mayra; Rebollar, Eria; De La Rosa-Velázquez, Inti A; Soto-Reyes, Ernesto; Rincón-Arano, Héctor; Valdes-Quezada, Christian; Valadez-Graham, Viviana; Recillas-Targa, Félix

    2007-07-01

    At the present time research situates differential regulation of gene expression in an increasingly complex scenario based on interplay between genetic and epigenetic information networks, which need to be highly coordinated. Here we describe in a comparative way relevant concepts and models derived from studies on the chicken alpha- and beta-globin group of genes. We discuss models for globin switching and mechanisms for coordinated transcriptional activation. A comparative overview of globin genes chromatin structure, based on their genomic domain organization and epigenetic components is presented. We argue that the results of those studies and their integrative interpretation may contribute to our understanding of epigenetic abnormalities, from beta-thalassemias to human cancer. Finally we discuss the interdependency of genetic-epigenetic components and the need of their mutual consideration in order to visualize the regulation of gene expression in a more natural context and consequently better understand cell differentiation, development and cancer.

  13. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    SciTech Connect

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. 
    Moreover, effectively integrating bioinformatics into…

  14. Comparative Analysis of Data Structures for Storing Massive TINs in a DBMS

    NASA Astrophysics Data System (ADS)

    Kumar, K.; Ledoux, H.; Stoter, J.

    2016-06-01

    Point cloud data are an important source for 3D geoinformation. Modern 3D data acquisition and processing techniques such as airborne laser scanning and multi-beam echosounding generate billions of 3D points for an area of just a few square kilometers. With the size of point clouds exceeding the billion mark for even a small area, there is a need for their efficient storage and management. These point clouds are sometimes associated with attributes and constraints as well. Storing billions of 3D points is currently possible, as confirmed by the initial implementations in Oracle Spatial SDO PC and the PostgreSQL Point Cloud extension. But to analyse and extract useful information from point clouds, we need more than just points, i.e., we require the surface defined by these points in space. There are different ways to represent surfaces in GIS, including grids, TINs, boundary representations, etc. In this study, we investigate database solutions for the storage and management of massive TINs. The classical (face and edge based) and compact (star based) data structures are discussed at length with reference to their structure, advantages and limitations in handling massive triangulations, and are compared with the current solution of PostGIS Simple Feature. The main test dataset is the TIN generated from the third national elevation model of the Netherlands (AHN3), with a point density of over 10 points/m². PostgreSQL/PostGIS DBMS is used for storing the generated TIN. The data structures are tested with the generated TIN models to account for their geometry, topology, storage, indexing, and loading time in a database. Our study is useful in identifying the limitations of the existing data structures for storing massive TINs and what is required to optimise these structures for managing massive triangulations in a database.
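
    To make the contrast between face-based and star-based storage concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "star based" means each vertex stores the counter-clockwise ordered list of its neighbours (its link), so that triangles are implicit rather than stored as separate records. The function names `build_stars` and `triangles_around` are hypothetical, and the input is assumed to be a list of consistently counter-clockwise oriented vertex-index triples.

```python
from collections import defaultdict


def build_stars(triangles):
    """Convert a face list into per-vertex ordered links (assumed star layout).

    Returns {vertex: (link, closed)} where `link` is the counter-clockwise
    ordered list of neighbours and `closed` is True for interior vertices
    (the link is a cycle) and False for boundary vertices (the link is a path).
    """
    # succ[v][p] == q means that (v, p, q) is a counter-clockwise triangle.
    succ = defaultdict(dict)
    for a, b, c in triangles:
        succ[a][b] = c
        succ[b][c] = a
        succ[c][a] = b

    stars = {}
    for v, nxt in succ.items():
        targets = set(nxt.values())
        # On the boundary, one neighbour is never a successor: start there.
        starts = [p for p in nxt if p not in targets]
        closed = not starts
        start = starts[0] if starts else min(nxt)
        link = [start]
        while link[-1] in nxt and nxt[link[-1]] != start:
            link.append(nxt[link[-1]])
        stars[v] = (link, closed)
    return stars


def triangles_around(v, stars):
    """Recover the triangles incident to v from its stored link."""
    link, closed = stars[v]
    pairs = list(zip(link, link[1:]))
    if closed:
        pairs.append((link[-1], link[0]))
    return [(v, p, q) for p, q in pairs]


if __name__ == "__main__":
    # Two triangles sharing the edge 0-2 (a square split along a diagonal).
    stars = build_stars([(0, 1, 2), (0, 2, 3)])
    print(stars[0])
    print(triangles_around(0, stars))
```

    In a face-based layout every triangle is one record of three vertex references. In this sketch the same triangles are reconstructed from one record per vertex, which illustrates the kind of trade-off between storage size and reconstruction cost that such a comparison has to weigh.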

  15. Structural and compositional changes in erythrocyte membrane of obese compared to normal-weight adolescents.

    PubMed

    Perona, Javier S; González-Jiménez, Emilio; Aguilar-Cordero, María J; Sureda, Antonio; Barceló, Francisca

    2013-12-01

    Unhealthy dietary habits are key determinants of obesity in adolescents. Assuming that dietary fat profile influences membrane lipid composition, the aim of this study was to analyze structural changes in the erythrocyte membrane of obese compared to normal-weight adolescents. The study was conducted in a group of 11 obese and 11 normal-weight adolescent subjects. The lipid profile, lipid peroxidation and acetylcholinesterase enzyme (AChE) activity were analyzed by conventional methods. The structural properties of reconstituted erythrocyte membrane were characterized by X-ray diffraction. Erythrocyte membrane from obese adolescents had a lipid profile characterized by a higher cholesterol/phospholipid ratio, an increase in saturated fatty acid and a decrease in monounsaturated and n-6 polyunsaturated fatty acid concentrations. Differences in lipid content were associated with changes in the structural properties of reconstituted membranes and the oxidative damage of erythrocyte membrane. The lower oxidative level shown in the obese group (0.15 ± 0.04 vs. 0.20 ± 0.06 nmol/mg for conjugated diene concentrations and 2.43 ± 0.25 vs. 2.83 ± 0.31 nmol/mg protein for malondialdehyde levels) was related to a lower unsaturation index. These changes in membrane structural properties were accompanied by a lower AChE activity (1.64 ± 0.13 vs. 1.91 ± 0.24 nmol AChE/[min mg protein]) in the obese group. The consequences of unhealthy dietary habits in adolescents are reflected in the membrane structural properties and may influence membrane-associated protein activities and functions.

  16. Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations

    PubMed Central

    Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W.

    2016-01-01

    In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing, where parallelization techniques are gaining in importance due to the pressing issue of large datasets generated by, e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in choosing a suitable technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures. PMID:26904094
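
    As an illustration of what transforming a serial application into a parallel one involves for a k-mer task, here is a minimal Python sketch. It is not the paper's code and does not use OpenMP, PluTo-SICA, PPCG, or OpenACC; it swaps in plain chunk-wise decomposition to show one typical problem, namely that chunks must overlap by k - 1 characters so that k-mers spanning a chunk boundary are counted exactly once. The function names and the `mapper` parameter are hypothetical.

```python
from collections import Counter
from functools import partial


def count_kmers(seq, k):
    """Serial reference: count every length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def split_with_overlap(seq, k, n_chunks):
    """Split seq so that each k-mer start position falls in exactly one chunk."""
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return []
    step = -(-n_kmers // n_chunks)  # ceiling division
    # Each chunk carries k - 1 extra characters to cover boundary k-mers.
    return [seq[start:start + step + k - 1] for start in range(0, n_kmers, step)]


def count_kmers_chunked(seq, k, n_chunks, mapper=map):
    """Count k-mers chunk by chunk and merge the partial counts.

    `mapper` defaults to the built-in map (serial). A parallel map with the
    same call shape, such as the map method of a
    concurrent.futures.ProcessPoolExecutor, can be passed instead.
    """
    total = Counter()
    chunks = split_with_overlap(seq, k, n_chunks)
    for partial_counts in mapper(partial(count_kmers, k=k), chunks):
        total.update(partial_counts)  # Counter.update adds counts
    return total


if __name__ == "__main__":
    seq = "ACGTACGTAC"
    print(count_kmers(seq, 3) == count_kmers_chunked(seq, 3, n_chunks=4))
```

    The merge step is the serial part of this decomposition; for small inputs the cost of distributing chunks and merging results can outweigh any gain, in the same spirit as the data migration overhead mentioned in the abstract.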

  17. Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations.

    PubMed

    Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W

    2016-01-01

    In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing, where parallelization techniques are gaining in importance due to the pressing issue of large datasets generated by, e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in choosing a suitable technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures. PMID:26904094

  18. Comparative study of local structure of two cyanobiphenyl liquid crystals by molecular dynamics method

    SciTech Connect

    Gerts, Egor D.; Komolkin, Andrei V.; Burmistrov, Vladimir A.; Alexandriysky, Victor V.; Dvinskikh, Sergey V.

    2014-08-21

    Fully-atomistic molecular dynamics simulations were carried out on two similar cyanobiphenyl nematogens, HO-6OCB and 7OCB, in order to study the effects of hydrogen bonds on the local structure of liquid crystals. The comparable length of these two molecules provides more evident results on the effects of hydrogen bonding. The analysis of radial and cylindrical distribution functions clearly shows the differences in the local structure of the two mesogens. The simulations showed that anti-parallel alignment is preferable for HO-6OCB. Hydrogen bonds between OH-groups are observed for 51% of HO-6OCB molecules, while hydrogen bonding between CN- and OH-groups occurs for only 16% of molecules. The lifetimes of H-bonds differ due to the different mobility of molecular fragments (50 ps for N⋅⋅⋅H–O and 41 ps for O⋅⋅⋅H–O). Although the standard Optimized Potentials for Liquid Simulations - All-Atom force field cannot reproduce some experimental parameters quantitatively (order parameters are overestimated, diffusion coefficients are not reproduced well), the comparison of relative simulated results for the pair of mesogens is nevertheless consistent with the same relative experimental parameters. Thus, the comparative study of simulated and experimental results for the pair of similar liquid crystals can still be considered plausible.
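
    For readers unfamiliar with the radial distribution function mentioned above, the following is a minimal Python sketch of how g(r) can be estimated from one set of particle positions. It is not the analysis code used in the study. It assumes a cubic periodic box with the minimum-image convention and `r_max` no larger than half the box length; the function name and parameters are hypothetical, and in practice the positions would be taken from simulation frames and averaged over them.

```python
import math


def radial_distribution(positions, box, r_max, n_bins):
    """Estimate g(r) for points in a cubic periodic box of edge length `box`.

    Pair distances are histogrammed and each bin is divided by the number
    of pairs an ideal gas of the same density would place in that shell.
    """
    n = len(positions)
    dr = r_max / n_bins
    hist = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for axis in range(3):
                d = positions[i][axis] - positions[j][axis]
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            r = math.sqrt(d2)
            if r < r_max:
                # Each pair contributes once for i and once for j.
                hist[min(int(r / dr), n_bins - 1)] += 2
    density = n / box ** 3
    g = []
    for b in range(n_bins):
        shell = 4.0 / 3.0 * math.pi * (((b + 1) * dr) ** 3 - (b * dr) ** 3)
        g.append(hist[b] / (n * density * shell))
    return g


if __name__ == "__main__":
    # Two particles that are 1 unit apart across the periodic boundary.
    print(radial_distribution([(0.5, 0, 0), (9.5, 0, 0)], 10.0, 5.0, 5))
```

    A cylindrical distribution function would follow the same pattern but bin the separation vector by its components parallel and perpendicular to a chosen axis, such as the nematic director.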

  19. Gender differences in structured risk assessment: comparing the accuracy of five instruments.

    PubMed

    Coid, Jeremy; Yang, Min; Ullrich, Simone; Zhang, Tianqiang; Sizmur, Steve; Roberts, Colin; Farrington, David P; Rogers, Robert D

    2009-04-01

    Structured risk assessment should guide clinical risk management, but it is uncertain which instrument has the highest predictive accuracy among men and women. In the present study, the authors compared the Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991, 2003); the Historical, Clinical, Risk Management-20 (HCR-20; C. D. Webster, K. S. Douglas, D. Eaves, & S. D. Hart, 1997); the Risk Matrix 2000-Violence (RM2000[V]; D. Thornton et al., 2003); the Violence Risk Appraisal Guide (VRAG; V. L. Quinsey, G. T. Harris, M. E. Rice, & C. A. Cormier, 1998); the Offenders Group Reconviction Scale (OGRS; J. B. Copas & P. Marshall, 1998; R. Taylor, 1999); and the total previous convictions among prisoners, prospectively assessed prerelease. The authors compared predischarge measures with subsequent offending and ranked the instruments using multivariate regression. Most instruments demonstrated significant but moderate predictive ability. The OGRS ranked highest for violence among men, and the PCL-R and HCR-20 H subscale ranked highest for violence among women. The OGRS and total previous acquisitive convictions demonstrated greatest accuracy in predicting acquisitive offending among men and women. Actuarial instruments requiring no training to administer performed as well as personality assessment and structured risk assessment and were superior among men for violence.

  20. The leader peptide of mutacin 1140 has distinct structural components compared to related class I lantibiotics

    PubMed Central

    Escano, Jerome; Stauffer, Byron; Brennan, Jacob; Bullock, Monica; Smith, Leif

    2014-01-01

    Lantibiotics are ribosomally synthesized peptide antibiotics composed of an N-terminal leader peptide that promotes the core peptide's interaction with the post translational modification (PTM) enzymes. Following PTMs, mutacin 1140 is transported out of the cell and the leader peptide is cleaved to yield the antibacterial peptide. Mutacin 1140 leader peptide is structurally unique compared to other class I lantibiotic leader peptides. Herein, we further our understanding of the structural differences of mutacin 1140 leader peptide with regard to other class I leader peptides. We have determined that the length of the leader peptide is important for the biosynthesis of mutacin 1140. We have also determined that mutacin 1140 leader peptide contains a novel four amino acid motif compared to related lantibiotics. PTM enzyme recognition of the leader peptide appears to be evolutionarily distinct from related class I lantibiotics. Our study on mutacin 1140 leader peptide provides a basis for future studies aimed at understanding its interaction with the PTM enzymes. PMID:25400246

  1. Physcomitrella HMGA-type proteins display structural differences compared to their higher plant counterparts

    SciTech Connect

    Lyngaard, Carina; Stemmer, Christian; Stensballe, Allan; Graf, Manuela; Gorr, Gilbert; Decker, Eva; Grasser, Klaus D.

    2008-10-03

    High mobility group (HMG) proteins of the HMGA family are chromatin-associated proteins that act as architectural factors in nucleoprotein structures involved in gene transcription. To date, HMGA-type proteins have been studied in various higher plant species, but not in lower plants. We have identified two HMGA-type proteins, HMGA1 and HMGA2, encoded in the genome of the moss model Physcomitrella patens. Compared to higher plant HMGA proteins, the two Physcomitrella proteins display some structural differences. Thus, the moss HMGA proteins have six (rather than four) AT-hook DNA-binding motifs and their N-terminal domain lacks similarity to linker histone H1. HMGA2 is expressed in moss protonema and it localises to the cell nucleus. Typical of HMGA proteins, HMGA2 interacts preferentially with A/T-rich DNA, when compared with G/C-rich DNA. In cotransformation assays in Physcomitrella protoplasts, HMGA2 stimulated reporter gene expression. In summary, our data show that functional HMGA-type proteins occur in Physcomitrella.

  2. Exalign: a new method for comparative analysis of exon-intron gene structures.

    PubMed

    Pavesi, Giulio; Zambelli, Federico; Caggese, Corrado; Pesole, Graziano

    2008-05-01

    The evolution of genes is usually studied and reconstructed at the sequence level, that is, by comparing and aligning their genomic, transcript or protein sequences. However, including the exon-intron structure of genes in the analysis can provide further useful information, for example to draw reliable phylogenetic relationships left unsolved by traditional sequence-based evolutionary studies, or to shed further light on patterns of intron gain and loss. In spite of this, no tool specifically devised for this task is currently available. In this work we present Exalign, an algorithm designed to retrieve, compare and search for the exon-intron structure of existing gene annotations, which has been implemented in a software tool freely accessible through a web interface as well as available for download. We present different applications of our method, from the reconstruction of the evolutionary history of homologous gene families to the detection of previously unknown cases of intron loss in human and rodents, and, remarkably, two never reported intron gain events in human and mouse. The web interface for accessing Exalign is available at http://www.pesolelab.it/exalign/ or http://www.beacon.unimi.it/exalign/

  3. The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs

    PubMed Central

    2002-01-01

    Background: Comparative analysis of RNA sequences is the basis for the detailed and accurate predictions of RNA structure and the determination of phylogenetic relationships for organisms that span the entire phylogenetic tree. Underlying these accomplishments are very large, well-organized, and processed collections of RNA sequences. This data, starting with the sequences organized into a database management system and aligned to reveal their higher-order structure, and patterns of conservation and variation for organisms that span the phylogenetic tree, has been collected and analyzed. This type of information can be fundamental for and have an influence on the study of phylogenetic relationships, RNA structure, and the melding of these two fields. Results: We have prepared a large web site that disseminates our comparative sequence and structure models and data. The four major types of comparative information and systems available for the three ribosomal RNAs (5S, 16S, and 23S rRNA), transfer RNA (tRNA), and two of the catalytic intron RNAs (group I and group II) are: (1) Current Comparative Structure Models; (2) Nucleotide Frequency and Conservation Information; (3) Sequence and Structure Data; and (4) Data Access Systems. Conclusions: This online RNA sequence and structure information, the result of extensive analysis, interpretation, data collection, and computer program and web development, is accessible at our Comparative RNA Web (CRW) Site http://www.rna.icmb.utexas.edu. In the future, more data and information will be added to these existing categories, new categories will be developed, and additional RNAs will be studied and presented at the CRW Site. PMID:11869452

  4. Bioinformatics Education—Perspectives and Challenges out of Africa

    PubMed Central

    Adebiyi, Ezekiel F.; Alzohairy, Ahmed M.; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J.; Panji, Sumir; Patterton, Hugh-G.

    2015-01-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  5. An International Bioinformatics Infrastructure to Underpin the Arabidopsis Community

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering C...

  6. Broad issues to consider for library involvement in bioinformatics*

    PubMed Central

    Geer, Renata C.

    2006-01-01

    Background: The information landscape in biological and medical research has grown far beyond literature to include a wide variety of databases generated by research fields such as molecular biology and genomics. The traditional role of libraries to collect, organize, and provide access to information can expand naturally to encompass these new data domains. Methods: This paper discusses the current and potential role of libraries in bioinformatics using empirical evidence and experience from eleven years of work in user services at the National Center for Biotechnology Information. Findings: Medical and science libraries over the last decade have begun to establish educational and support programs to address the challenges users face in the effective and efficient use of a plethora of molecular biology databases and retrieval and analysis tools. As more libraries begin to establish a role in this area, the issues they face include assessment of user needs and skills, identification of existing services, development of plans for new services, recruitment and training of specialized staff, and establishment of collaborations with bioinformatics centers at their institutions. Conclusions: Increasing library involvement in bioinformatics can help address information needs of a broad range of students, researchers, and clinicians and ultimately help realize the power of bioinformatics resources in making new biological discoveries. PMID:16888662

  7. Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics.

    PubMed

    Verheggen, Kenneth; Maddelein, Davy; Hulstaert, Niels; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2016-03-01

    The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries. PMID:26510693

  8. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    ERIC Educational Resources Information Center

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  9. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    ERIC Educational Resources Information Center

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  10. A quick guide for building a successful bioinformatics community.

    PubMed

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-02-01

    "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB).

  11. Bioinformatics for Undergraduates: Steps toward a Quantitative Bioscience Curriculum

    ERIC Educational Resources Information Center

    Chapman, Barbara S.; Christmann, James L.; Thatcher, Eileen F.

    2006-01-01

    We describe an innovative bioinformatics course developed under grants from the National Science Foundation and the California State University Program in Research and Education in Biotechnology for undergraduate biology students. The project has been part of a continuing effort to offer students classroom experiences focused on principles and…

  12. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    EPA Science Inventory

    A Bioinformatic Strategy to Rapidly Characterize cDNA Libraries

    G. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.
    1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  13. An evaluation of ontology exchange languages for bioinformatics.

    PubMed

    McEntire, R; Karp, P; Abernethy, N; Benton, D; Helt, G; DeJongh, M; Kent, R; Kosky, A; Lewis, S; Hodnett, D; Neumann, E; Olken, F; Pathak, D; Tarczy-Hornoch, P; Toldo, L; Topaloglou, T

    2000-01-01

    Ontologies are specifications of the concepts in a given field, and of the relationships among those concepts. The development of ontologies for molecular-biology information and the sharing of those ontologies within the bioinformatics community are central problems in bioinformatics. If the bioinformatics community is to share ontologies effectively, ontologies must be exchanged in a form that uses standardized syntax and semantics. This paper reports on an effort among the authors to evaluate alternative ontology-exchange languages, and to recommend one or more languages for use within the larger bioinformatics community. The study selected a set of candidate languages, and defined a set of capabilities that the ideal ontology-exchange language should satisfy. The study scored the languages according to the degree to which they satisfied each capability. In addition, the authors performed several ontology-exchange experiments with the two languages that received the highest scores: OML and Ontolingua. The result of those experiments, and the main conclusion of this study, was that the frame-based semantic model of Ontolingua is preferable to the conceptual graph model of OML, but that the XML-based syntax of OML is preferable to the Lisp-based syntax of Ontolingua. PMID:10977085

  14. OILing the way to machine understandable bioinformatics resources.

    PubMed

    Stevens, Robert; Goble, Carole; Horrocks, Ian; Bechhofer, Sean

    2002-06-01

    The complex questions and analyses posed by biologists, as well as the diverse data resources they develop, require the fusion of evidence from different, independently developed, and heterogeneous resources. The web, as an enabler for interoperability, has been an excellent mechanism for data publication and transportation. Successful exchange and integration of information, however, depends on a shared language for communication (a terminology) and a shared understanding of what the data means (an ontology). Without this kind of understanding, semantic heterogeneity remains a problem for both humans and machines. One means of dealing with heterogeneity in bioinformatics resources is through terminology founded upon an ontology. Bioinformatics resources tend to be rich in human readable and understandable annotation, with each resource using its own terminology. These resources are machine readable, but not machine understandable. Ontologies have a role in increasing this machine understanding, reducing the semantic heterogeneity between resources and thus promoting the flexible and reliable interoperation of bioinformatics resources. This paper describes an approach derived from the semantic web [a machine understandable world-wide web (WWW)], the ontology inference layer (OIL), as a solution for semantic bioinformatics resources. The nature of the heterogeneity problems is presented along with a description of how metadata from domain ontologies can be used to alleviate these problems. A companion paper in this issue gives an example of the development of a bio-ontology using OIL.

  15. Incorporation of Bioinformatics Exercises into the Undergraduate Biochemistry Curriculum

    ERIC Educational Resources Information Center

    Feig, Andrew L.; Jabri, Evelyn

    2002-01-01

    The field of bioinformatics is developing faster than most biochemistry textbooks can adapt. Supplementing the undergraduate biochemistry curriculum with data-mining exercises is an ideal way to expose the students to the common databases and tools that take advantage of this vast repository of biochemical information. An integrated collection of…

  16. Comparative Evaluation for Brain Structural Connectivity Approaches: Towards Integrative Neuroinformatics Tool for Epilepsy Clinical Research.

    PubMed

    Yang, Sheng; Tatsuoka, Curtis; Ghosh, Kaushik; Lacuey-Lecumberri, Nuria; Lhatoo, Samden D; Sahoo, Satya S

    2016-01-01

    Recent advances in brain fiber tractography algorithms and diffusion Magnetic Resonance Imaging (MRI) data collection techniques are providing new approaches to study brain white matter connectivity, which plays an important role in complex neurological disorders such as epilepsy. Epilepsy affects approximately 50 million persons worldwide and it is often described as a disorder of the cortical network organization. There is growing recognition of the need to better understand the role of brain structural networks in the onset and propagation of seizures in epilepsy using high resolution non-invasive imaging technologies. In this paper, we perform a comparative evaluation of two techniques to compute structural connectivity, namely probabilistic fiber tractography and statistics derived from fractional anisotropy (FA), using diffusion MRI data from a patient with a rare case of medically intractable insular epilepsy. The results of our evaluation demonstrate that probabilistic fiber tractography provides a more accurate map of structural connectivity and may help address inherent complexities of neural fiber layout in the brain, such as fiber crossings. This work provides an initial result towards building an integrative informatics tool for neuroscience that can be used to accurately characterize the role of fiber tract connectivity in neurological disorders such as epilepsy. PMID:27570685

  17. Functional silicene and stanene nanoribbons compared to graphene: electronic structure and transport

    NASA Astrophysics Data System (ADS)

    van den Broek, B.; Houssa, M.; Iordanidou, K.; Pourtois, G.; Afanas'ev, V. V.; Stesmans, A.

    2016-03-01

    Since the advent of graphene, other 2D materials have garnered interest; notably the single element materials silicene, germanene, and stanene. We investigate the ballistic current-voltage (I-V) characteristics of armchair silicene and stanene nanoribbons (AXNRs with X = Si, Sn) using a combination of density functional theory and non-equilibrium Green’s functions. The impact of out-of-plane electric field and in-plane uniaxial strain on the ribbon geometries, electronic structure, and (I-V)s is considered and contrasted with graphene. Since silicene and stanene are sp2/sp3 buckled layers, the electronic structure can be tuned by an electric field that breaks the sublattice symmetry, an effect absent in graphene. This decreases the current by ~50% for Sn, since it has the largest buckling. Uniaxial straining of the ballistic channel affects the AXNR electronic structure in multiple ways: it changes the bandgap and associated effective carrier mass, and creates a local buckling distortion at the lead-channel interface which induces an interface dipole. Due to the increasing sp3 hybridization character with increasing element mass, large reconstructions rectify the strained systems, an effect absent in sp2 bonded graphene. This results in a smaller strain effect on the current: a decrease of 20% for Sn at 15% tensile strain compared to a ~75% decrease for C.

  18. Comparative Evaluation for Brain Structural Connectivity Approaches: Towards Integrative Neuroinformatics Tool for Epilepsy Clinical Research

    PubMed Central

    Yang, Sheng; Tatsuoka, Curtis; Ghosh, Kaushik; Lacuey-Lecumberri, Nuria; Lhatoo, Samden D.; Sahoo, Satya S.

    2016-01-01

    Recent advances in brain fiber tractography algorithms and diffusion Magnetic Resonance Imaging (MRI) data collection techniques are providing new approaches to study brain white matter connectivity, which plays an important role in complex neurological disorders such as epilepsy. Epilepsy affects approximately 50 million persons worldwide and it is often described as a disorder of the cortical network organization. There is growing recognition of the need to better understand the role of brain structural networks in the onset and propagation of seizures in epilepsy using high resolution non-invasive imaging technologies. In this paper, we perform a comparative evaluation of two techniques to compute structural connectivity, namely probabilistic fiber tractography and statistics derived from fractional anisotropy (FA), using diffusion MRI data from a patient with a rare case of medically intractable insular epilepsy. The results of our evaluation demonstrate that probabilistic fiber tractography provides a more accurate map of structural connectivity and may help address inherent complexities of neural fiber layout in the brain, such as fiber crossings. This work provides an initial result towards building an integrative informatics tool for neuroscience that can be used to accurately characterize the role of fiber tract connectivity in neurological disorders such as epilepsy. PMID:27570685

  19. Comparative Structural and Functional Analysis of Bunyavirus and Arenavirus Cap-Snatching Endonucleases.

    PubMed

    Reguera, Juan; Gerlach, Piotr; Rosenthal, Maria; Gaudon, Stephanie; Coscia, Francesca; Günther, Stephan; Cusack, Stephen

    2016-06-01

    Segmented negative strand RNA viruses of the arena-, bunya- and orthomyxovirus families uniquely carry out viral mRNA transcription by the cap-snatching mechanism. This involves cleavage of host mRNAs close to their capped 5' end by an endonuclease (EN) domain located in the N-terminal region of the viral polymerase. We present the structure of the cap-snatching EN of Hantaan virus, a bunyavirus belonging to hantavirus genus. Hantaan EN has an active site configuration, including a metal co-ordinating histidine, and nuclease activity similar to the previously reported La Crosse virus and Influenza virus ENs (orthobunyavirus and orthomyxovirus respectively), but is more active in cleaving a double stranded RNA substrate. In contrast, Lassa arenavirus EN has only acidic metal co-ordinating residues. We present three high resolution structures of Lassa virus EN with different bound ion configurations and show in comparative biophysical and biochemical experiments with Hantaan, La Crosse and influenza ENs that the isolated Lassa EN is essentially inactive. The results are discussed in the light of EN activation mechanisms revealed by recent structures of full-length influenza virus polymerase. PMID:27304209

  20. Comparative studies on photonic band structures of diamond and hexagonal diamond using the multiple scattering method

    NASA Astrophysics Data System (ADS)

    Chen, Hui; Zhang, Weiyi; Wang, Zhenlin

    2004-02-01

    Photonic band structures are investigated for both diamond and hexagonal diamond crystals composed of dielectric spheres, and absolute photonic band gaps (PBGs) are found in both cases. In agreement with both Karathanos and Moroz's calculations, a large PBG occurs between the eighth and ninth bands in diamond crystal, but a PBG in hexagonal diamond crystal is found to occur between the sixteenth and seventeenth bands because of the doubling of dielectric spheres in the primitive cell. To explore the physical mechanism of how the photonic band gap might be broadened, we have compared the electric field distributions (|E|^2) of the 'valence' and 'conduction' band edges. Results show that the field intensity for the 'conduction' band is located in the inner core of the sphere while that of the 'valence' band is concentrated in the outer shell. With this motivation, double-layer spheres are designed to enhance the corresponding photonic band gaps; the PBG is increased by 35% for the diamond structure, and 14% for the hexagonal diamond structure.

  1. Comparative Structural and Functional Analysis of Bunyavirus and Arenavirus Cap-Snatching Endonucleases

    PubMed Central

    Reguera, Juan; Gerlach, Piotr; Rosenthal, Maria; Gaudon, Stephanie; Coscia, Francesca; Günther, Stephan; Cusack, Stephen

    2016-01-01

    Segmented negative strand RNA viruses of the arena-, bunya- and orthomyxovirus families uniquely carry out viral mRNA transcription by the cap-snatching mechanism. This involves cleavage of host mRNAs close to their capped 5′ end by an endonuclease (EN) domain located in the N-terminal region of the viral polymerase. We present the structure of the cap-snatching EN of Hantaan virus, a bunyavirus belonging to hantavirus genus. Hantaan EN has an active site configuration, including a metal co-ordinating histidine, and nuclease activity similar to the previously reported La Crosse virus and Influenza virus ENs (orthobunyavirus and orthomyxovirus respectively), but is more active in cleaving a double stranded RNA substrate. In contrast, Lassa arenavirus EN has only acidic metal co-ordinating residues. We present three high resolution structures of Lassa virus EN with different bound ion configurations and show in comparative biophysical and biochemical experiments with Hantaan, La Crosse and influenza ENs that the isolated Lassa EN is essentially inactive. The results are discussed in the light of EN activation mechanisms revealed by recent structures of full-length influenza virus polymerase. PMID:27304209

  2. The company that words keep: comparing the statistical structure of child- versus adult-directed language.

    PubMed

    Hills, Thomas

    2013-06-01

    Does child-directed language differ from adult-directed language in ways that might facilitate word learning? Associative structure (the probability that a word appears with its free associates), contextual diversity, word repetitions and frequency were compared longitudinally across six language corpora, with four corpora of language directed at children aged 1.0 to 5.0, and two adult-directed corpora representing spoken and written language. Statistics were adjusted relative to shuffled corpora. Child-directed language was found to be more associative, repetitive and consistent than adult-directed language. Moreover, these statistical properties of child-directed language better predicted word acquisition than the same statistics in adult-directed language. Word frequency and repetitions were the best predictors within word classes (nouns, verbs, adjectives and function words). For all word classes combined, associative structure, contextual diversity and word repetitions best predicted language acquisition. These results support the hypothesis that child-directed language is structured in ways that facilitate language acquisition.

  3. Thermodynamic and structural insights into nanocomposites engineering by comparing two materials assembly techniques for graphene.

    PubMed

    Zhu, Jian; Zhang, Huanan; Kotov, Nicholas A

    2013-06-25

    Materials assembled by layer-by-layer (LBL) assembly and vacuum-assisted flocculation (VAF) have similarities, but a systematic study of their comparative advantages and disadvantages is missing. Such a study is needed from both practical and fundamental perspectives aiming at a better understanding of structure-property relationships of nanocomposites and purposeful engineering of materials with unique properties. Layered composites from polyvinyl alcohol (PVA) and reduced graphene (RG) are made by both techniques. We comparatively evaluate their structure, mechanical, and electrical properties. LBL and VAF composites demonstrate clear differences at atomic and nanoscale structural levels but reveal similarities in micrometer and submicrometer organization. Epitaxial crystallization and suppression of phase transition temperatures are more pronounced for PVA in LBL than for VAF composites. Mechanical properties are virtually identical for both assemblies at high RG contents. We conclude that mechanical properties in layered RG assemblies are largely determined by the thermodynamic state of PVA at the polymer/nanosheet interface rather than the nanometer scale differences in RG packing. High and nearly identical values of toughness for LBL and VAF composites reaching 6.1 MJ/m^3 observed for thermodynamically optimal composition confirm this conclusion. Their toughness is the highest among all other layered assemblies from RG, cellulose, clay, etc. Electrical conductivity, however, is more than 10× higher for LBL than for VAF composites for the same RG contents. Electrical properties are largely determined by the tunneling barrier between RG sheets and therefore strongly dependent on atomic/nanoscale organization. These findings open the door for application-oriented methods of materials engineering using both types of layered assemblies.

  4. Comparing standardized coefficients in structural equation modeling: a model reparameterization approach.

    PubMed

    Kwan, Joyce L Y; Chan, Wai

    2011-09-01

    We propose a two-stage method for comparing standardized coefficients in structural equation modeling (SEM). At stage 1, we transform the original model of interest into the standardized model by model reparameterization, so that the model parameters appearing in the standardized model are equivalent to the standardized parameters of the original model. At stage 2, we impose appropriate linear equality constraints on the standardized model and use a likelihood ratio test to make statistical inferences about the equality of standardized coefficients. Unlike other existing methods for comparing standardized coefficients, the proposed method does not require specific modeling features (e.g., specification of nonlinear constraints), which are available only in certain SEM software programs. Moreover, this method allows researchers to compare two or more standardized coefficients simultaneously in a standard and convenient way. Three real examples are given to illustrate the proposed method, using EQS, a popular SEM software program. Results show that the proposed method performs satisfactorily for testing the equality of standardized coefficients.
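
    The stage 2 test described in this record can be sketched as a generic likelihood ratio test. The example below is a minimal illustration under stated assumptions, not the authors' EQS procedure: it assumes a single equality constraint (two standardized coefficients set equal, so one degree of freedom), and the function name and log-likelihood values are hypothetical.

```python
import math


def lr_test_one_constraint(loglik_free, loglik_constrained):
    """Likelihood ratio test for ONE equality constraint (df = 1).

    loglik_free:        maximized log-likelihood of the unconstrained model
    loglik_constrained: maximized log-likelihood with the constraint imposed
    Returns (test statistic, p-value).
    """
    # The statistic is twice the drop in log-likelihood; clamp tiny
    # negative values that can arise from numerical optimization noise.
    stat = max(2.0 * (loglik_free - loglik_constrained), 0.0)
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lr_test_one_constraint(-1250.40, -1253.10)
print(f"LR statistic = {stat:.2f}, p = {p:.4f}")
```

    Testing several coefficients at once would use a chi-square distribution with as many degrees of freedom as there are constraints, which needs a general chi-square routine rather than the one-degree-of-freedom shortcut used here.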

  5. Productivity and salinity structuring of the microplankton revealed by comparative freshwater metagenomics.

    PubMed

    Eiler, Alexander; Zaremba-Niedzwiedzka, Katarzyna; Martínez-García, Manuel; McMahon, Katherine D; Stepanauskas, Ramunas; Andersson, Siv G E; Bertilsson, Stefan

    2014-09-01

    Little is known about the diversity and structuring of freshwater microbial communities beyond the patterns revealed by tracing their distribution in the landscape with common taxonomic markers such as the ribosomal RNA. To address this gap in knowledge, metagenomes from temperate lakes were compared to selected marine metagenomes. Taxonomic analyses of rRNA genes in these freshwater metagenomes confirm the previously reported dominance of a limited subset of uncultured lineages of freshwater bacteria, whereas Archaea were rare. Diversification into marine and freshwater microbial lineages was also reflected in phylogenies of functional genes, and there were also significant differences in functional beta-diversity. The pathways and functions that accounted for these differences are involved in osmoregulation, active transport, carbohydrate and amino acid metabolism. Moreover, predicted genes orthologous to active transporters and recalcitrant organic matter degradation were more common in microbial genomes from oligotrophic versus eutrophic lakes. This comparative metagenomic analysis allowed us to formulate a general hypothesis that oceanic microorganisms, compared with freshwater-dwelling ones, invest more in metabolism of amino acids and that strategies of carbohydrate metabolism differ significantly between marine and freshwater microbial communities. PMID:24118837

  6. Bioclipse: an open source workbench for chemo- and bioinformatics

    PubMed Central

    Spjuth, Ola; Helmus, Tobias; Willighagen, Egon L; Kuhn, Stefan; Eklund, Martin; Wagener, Johannes; Murray-Rust, Peter; Steinbeck, Christoph; Wikberg, Jarl ES

    2007-01-01

    Background: There is a need for software applications that provide users with a complete and extensible toolkit for chemo- and bioinformatics accessible from a single workbench. Commercial packages are expensive and closed source, hence they do not allow end users to modify algorithms and add custom functionality. Existing open source projects are more focused on providing a framework for integrating existing, separately installed bioinformatics packages, rather than providing user-friendly interfaces. No open source chemoinformatics workbench has previously been published, and no successful attempts have been made to integrate chemo- and bioinformatics into a single framework. Results: Bioclipse is an advanced workbench for resources in chemo- and bioinformatics, such as molecules, proteins, sequences, spectra, and scripts. It provides 2D-editing, 3D-visualization, file format conversion, calculation of chemical properties, and much more; all fully integrated into a user-friendly desktop application. Editing supports standard functions such as cut and paste, drag and drop, and undo/redo. Bioclipse is written in Java and based on the Eclipse Rich Client Platform with a state-of-the-art plugin architecture. This gives Bioclipse an advantage over other systems as it can easily be extended with functionality in any desired direction. Conclusion: Bioclipse is a powerful workbench for bio- and chemoinformatics as well as an advanced integration platform. The rich functionality, intuitive user interface, and powerful plugin architecture make Bioclipse the most advanced and user-friendly open source workbench for chemo- and bioinformatics. Bioclipse is released under Eclipse Public License (EPL), an open source license which sets no constraints on external plugin licensing; it is totally open for both open source plugins as well as commercial ones. Bioclipse is freely available at . PMID:17316423

  7. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    ERIC Educational Resources Information Center

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  8. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    ERIC Educational Resources Information Center

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  9. Design and Implementation of an Interdepartmental Bioinformatics Program across Life Science Curricula

    ERIC Educational Resources Information Center

    Miskowski, Jennifer A.; Howard, David R.; Abler, Michael L.; Grunwald, Sandra K.

    2007-01-01

    Over the past 10 years, there has been a technical revolution in the life sciences leading to the emergence of a new discipline called bioinformatics. In response, bioinformatics-related topics have been incorporated into various undergraduate courses along with the development of new courses solely focused on bioinformatics. This report describes…

  10. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    ERIC Educational Resources Information Center

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  11. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    ERIC Educational Resources Information Center

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  12. Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool

    PubMed Central

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N.; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. PMID:25742008
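
    The figures quoted for the HapMap 3 reference can be checked with a short calculation. The sketch below is an illustration, not the authors' analysis: the 2x2 agreement table is reconstructed from the counts in the abstract (415 SNPs, 201 reported missing by SNAP, 152 of those present in the Illumina file, no false positives), which is an assumption, and the resulting kappa is not a value reported in the abstract. The function name is hypothetical.

```python
def cohen_kappa_2x2(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for two binary raters from a 2x2 table of counts."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n
    b_yes = (both_yes + a_no_b_yes) / n
    # Agreement expected by chance from the two raters' marginal rates.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1.0:
        return 1.0  # degenerate table: both raters always give one label
    return (observed - expected) / (1 - expected)


total = 415            # SNPs queried
snap_missing = 201     # reported as missing by SNAP
wrongly_missing = 152  # of those, present in the Illumina file

# Rater A = SNAP, rater B = Illumina file; "yes" = SNP is on the chip.
both_yes = total - snap_missing           # 214, since no false positives
both_no = snap_missing - wrongly_missing  # 49

# The quoted 36.6% corresponds to 152 out of all 415 SNPs.
print(f"false negatives / all SNPs = {wrongly_missing / total:.1%}")
print(f"kappa = {cohen_kappa_2x2(both_yes, 0, wrongly_missing, both_no):.2f}")
```

    Under this reconstructed table, kappa comes out at roughly 0.25, which falls in the range conventionally labelled "fair" and is consistent with the abstract's description of the HapMap 3 result.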

  13. Should we have blind faith in bioinformatics software? Illustrations from the SNAP web-based tool.

    PubMed

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any 'false positive' SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen's Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation.

  14. Comparing neural networks: a benchmark on growing neural gas, growing cell structures, and fuzzy ARTMAP.

    PubMed

    Heinke, D; Hamker, F H

    1998-01-01

    This article compares the performance of some recently developed incremental neural networks with the well-known multilayer perceptron (MLP) on real-world data. The incremental networks are fuzzy ARTMAP (FAM), growing neural gas (GNG) and growing cell structures (GCS). The real-world datasets consist of four different datasets posing different challenges to the networks in terms of complexity of decision boundaries, overlapping between classes, and size of the datasets. The performance of the networks on the datasets is reported with respect to measure classification error, number of training epochs, and sensitivity toward variation of parameters. Statistical evaluations are applied to examine the significance of the results. The overall performance ranks in the following descending order: GNG, GCS, MLP, FAM. PMID:18255809

  15. Comparative toxicity and structure-activity in Chlorella and Tetrahymena: Monosubstituted phenols

    SciTech Connect

    Jaworska, J.S.; Schultz, T.W.

    1991-07-01

    The relative toxicity of selected monosubstituted phenols has been assessed by Kramer and Truemper in the Chlorella vulgaris assay. The authors examined population growth inhibition of this simple green algae under short-term static conditions for 33 derivatives. However, efforts to develop a strong predictive quantitative structure-activity relationship (QSAR) met with limited success because they modeled across modes of toxic action or segregated derivatives such as positional isomers (i.e., ortho-, meta-, para-). In an effort to further their understanding of the relationships of ecotoxic effects of phenols, the authors have evaluated the same derivatives reported by Kramer and Truemper in the Tetrahymena pyriformis population growth assay, compared the responses in both systems and developed QSARs for the Chlorella vulgaris data based on mechanisms of action.

  16. Comparing of Normal Stress Distribution in Static and Dynamic Soil-Structure Interaction Analyses

    SciTech Connect

    Kholdebarin, Alireza; Massumi, Ali; Davoodi, Mohammad; Tabatabaiefar, Hamid Reza

    2008-07-08

    It is important to consider the vertical component of earthquake loading and inertia force in soil-structure interaction analyses. In most circumstances, design engineers are primarily concerned about the analysis of behavior of foundations subjected to earthquake-induced forces transmitted from the bedrock. In this research, a single rigid foundation with designated geometrical parameters located on sandy-clay soil has been modeled in FLAC software with the Finite Difference Method and subjected to three different vertical components of earthquake records. In these cases, it is important to evaluate the effect of the footing on the underlying soil and to consider normal stress in the soil with and without the footing. The distribution of normal stress under the footing in static and dynamic states has been studied and compared. This comparison indicated that the increase in normal stress under the footing caused by the vertical component of ground excitations decreased dynamic vertical settlement in comparison with the static state.

  17. Comparative Study of 3-Dimensional Woven Joint Architectures for Composite Spacecraft Structures

    NASA Technical Reports Server (NTRS)

    Jones, Justin S.; Polis, Daniel L.; Rowles, Russell R.; Segal, Kenneth N.

    2011-01-01

    The National Aeronautics and Space Administration (NASA) Exploration Systems Mission Directorate initiated an Advanced Composite Technology (ACT) Project through the Exploration Technology Development Program in order to support the polymer composite needs for future heavy lift launch architectures. As an example, the large composite structural applications on Ares V inspired the evaluation of advanced joining technologies, specifically 3D woven composite joints, which could be applied to segmented barrel structures needed for autoclave cured barrel segments due to autoclave size constraints. Implementation of these 3D woven joint technologies may offer enhancements in damage tolerance without sacrificing weight. However, baseline mechanical performance data is needed to properly analyze the joint stresses and subsequently design/down-select a preform architecture. Six different configurations were designed and prepared for this study; each consisting of a different combination of warp/fill fiber volume ratio and preform interlocking method (Z-fiber, fully interlocked, or hybrid). Tensile testing was performed for this study with the enhancement of a dual camera Digital Image Correlation (DIC) system which provides the capability to measure full-field strains and three dimensional displacements of objects under load. As expected, the ratio of warp/fill fiber has a direct influence on strength and modulus, with higher values measured in the direction of higher fiber volume bias. When comparing the Z-fiber weave to a fully interlocked weave with comparable fiber bias, the Z-fiber weave demonstrated the best performance in two different comparisons. We report the measured tensile strengths and moduli for test coupons from the 6 different weave configurations under study.

  18. Amination of nitroazoles--a comparative study of structural and energetic properties.

    PubMed

    Zhao, Xiuxiu; Qi, Cai; Zhang, Lubo; Wang, Yuan; Li, Shenghua; Zhao, Fengqi; Pang, Siping

    2014-01-01

    In this work, 3-nitro-1H-1,2,4-triazole (1) and 3,5-dinitro-1H-pyrazole (2) were C-aminated and N-aminated using different amination agents, yielding their respective C-amino and N-amino products. All compounds were fully characterized by NMR (1H, 13C, 15N), IR spectroscopy, and differential scanning calorimetry (DSC). X-ray crystallographic measurements were performed and delivered insight into structural characteristics as well as inter- and intramolecular interactions of the products. Their impact sensitivities were measured by using standard BAM fallhammer techniques and their explosive performances were computed using the EXPLO 5.05 program. A comparative study on the influence of those different amino substituents on the structural and energetic properties (such as density, stability, heat of formation, detonation performance) is presented. The results showed that the incorporation of an N-amino group into a nitroazole ring can improve nitrogen content, heat of formation and impact sensitivity, while the introduction of a C-amino group can enhance density, detonation velocity and pressure. The potential of N-amino and C-amino moieties for the design of next generation energetic materials is explored.

  19. Comparing Different Model Structures for Carbon Allocation in the Community Land Model (CLM)

    NASA Astrophysics Data System (ADS)

    Montane, F.; Fox, A. M.; Arellano, A. F.; Scaven, V. L.; Alexander, M. R.; Moore, D. J.

    2015-12-01

    Quantifying the intensity of feedback mechanisms between terrestrial ecosystems and climate is a central challenge for understanding the global carbon cycle. Part of this challenge includes understanding how climate affects not only NPP, but also C allocation in different plant tissues (leaves, stem and roots) which determines the C residence time. For instance, C could be sequestered over longer time periods if changes in climate increase allocation to long-lived plant tissue (e.g. woody components) with respect to short-lived tissues (e.g. leaves). Networks of eddy covariance towers like AmeriFlux provide the infrastructure necessary to study relationships between ecosystem processes and climate forcing. We ran the Community Land Model (CLM) for six temperate forests in North America (AmeriFlux sites) using different model structures for the C allocation module: i) standard carbon allocation module in CLM, which allocates C to the stem and leaves as a dynamic function of NPP and with fixed coefficients for the rest of parameters; ii) alternative C allocation module, which allocates C to the root and stem as a dynamic function of NPP and with fixed coefficients for the rest of parameters; and iii) alternative C allocation module with fixed coefficients for all the parameters. We compare C allocation patterns and climate sensitivities between the different model structures and available observations for the sites. We suggest some future approaches to reduce model uncertainty in the current scheme for C allocation in CLM and its climate sensitivity.

  20. Unexpected structural complexity of supernumerary marker chromosomes characterized by microarray comparative genomic hybridization

    PubMed Central

    Tsuchiya, Karen D; Opheim, Kent E; Hannibal, Mark C; Hing, Anne V; Glass, Ian A; Raff, Michael L; Norwood, Thomas; Torchia, Beth A

    2008-01-01

    Background Supernumerary marker chromosomes (SMCs) are structurally abnormal extra chromosomes that cannot be unambiguously identified by conventional banding techniques. In the past, SMCs have been characterized using a variety of different molecular cytogenetic techniques. Although these techniques can sometimes identify the chromosome of origin of SMCs, they are cumbersome to perform and are not available in many clinical cytogenetic laboratories. Furthermore, they cannot precisely determine the region or breakpoints of the chromosome(s) involved. In this study, we describe four patients who possess one or more SMCs (a total of eight SMCs in all four patients) that were characterized by microarray comparative genomic hybridization (array CGH). Results In at least one SMC from all four patients, array CGH uncovered unexpected complexity, in the form of complex rearrangements, that could have gone undetected using other molecular cytogenetic techniques. Although array CGH accurately defined the chromosome content of all but two minute SMCs, fluorescence in situ hybridization was necessary to determine the structure of the markers. Conclusion The increasing use of array CGH in clinical cytogenetic laboratories will provide an efficient method for more comprehensive characterization of SMCs. Improved SMC characterization, facilitated by array CGH, will allow for more accurate SMC/phenotype correlation. PMID:18471320

  1. Comparative analysis of the structure of carbon materials relevant in combustion.

    PubMed

    Apicella, B; Barbella, R; Ciajolo, A; Tregrossi, A

    2003-06-01

    The determination of the structure of carbon materials is an analytical problem that unites the scientific research communities involved in the chemical characterization of heavy fuel-derived products (heavy fuel oils, coal-derived fuels, shale oil, etc.) and of carbon materials (polycyclic aromatic compounds, tar, soot) produced in many combustion processes. Knowledge of the structure of these "difficult" fuels and of the carbon materials produced by incomplete combustion is relevant to research into the best low-environmental-impact operation of combustion systems, but an array of many analytical and spectroscopic tools is necessary, and often not sufficient, to attempt the characterization of such complex products and in particular to determine the distribution of molecular masses. In this paper, size exclusion chromatography using N-methyl-pyrrolidinone as eluent was applied to the characterization of different carbon materials, ranging from typical commercially available carbon species such as polyacenaphthylene, carbon black, and naphthalene pitch to combustion products such as soot and soot extract collected in fuel-rich combustion systems. Two main fractions were detected and separated, and their molecular weights (MWs) were determined by comparison with polystyrene standards: a first fraction consisted of particles with very large molecular masses (>100000 u); a second fraction consisted of species in a relatively small MW range (200-600 u). The distribution of these fractions depends on the characteristics of the carbon sample. Fluorescence spectroscopy was applied to the fractions separated by size-exclusion chromatography and interpreted comparatively, giving indications of the differences and similarities in the chemical structure of such different materials.

  2. A Comparative Study of Vertebrate Corneal Structure: The Evolution of a Refractive Lens

    PubMed Central

    Winkler, Moritz; Shoa, Golroxan; Tran, Stephanie T.; Xie, Yilu; Thomasy, Sarah; Raghunathan, Vijay K.; Murphy, Christopher; Brown, Donald J.; Jester, James V.

    2015-01-01

    Purpose. Although corneal curvature plays an important role in determining the refractive power of the vertebrate eye, the mechanisms controlling corneal shape remain largely unknown. To address this question, we performed a comparative study of vertebrate corneal structure to identify potential evolutionarily based changes that correlate with the development of a corneal refractive lens. Methods. Nonlinear optical (NLO) imaging of second-harmonic–generated (SHG) signals was used to image collagen and three-dimensionally reconstruct the lamellar organization in corneas from different vertebrate clades. Results. Second-harmonic–generated images taken normal to the corneal surface showed that corneal collagen in all nonmammalian vertebrates was organized into sheets (fish and amphibians) or ribbons (reptiles and birds) extending from limbus to limbus that were oriented nearly orthogonal (ranging from 77.7°–88.2°) to their neighbors. The slight angular offset (2°–13°) created a rotational pattern that continued throughout the full thickness in fish and amphibians and to the very posterior layers in reptiles and birds. Interactions between lamellae were limited to “sutural” fibers in cartilaginous fish, and occasional lamellar branching in fish and amphibians. There was a marked increase in lamellar branching in higher vertebrates, such that birds ≫ reptiles > amphibians > fish. By contrast, mammalian corneas showed a nearly random collagen fiber organization with no orthogonal, chiral pattern. Conclusions. Our data indicate that nonmammalian vertebrate corneas share a common orthogonal collagen structural organization that shows increased lamellar branching in higher vertebrate species. Importantly, mammalian corneas showed a different structural organization, suggesting a divergent evolutionary background. PMID:26066606

  3. Mathematical Approach to Bio-Informatics

    NASA Astrophysics Data System (ADS)

    Sato, Keiko; Ohya, Masanori

    2008-03-01

    In life science research, we first need to align sequences in order to compare several different genes or amino acid sequences. When the number of sequences being compared becomes too large, such alignment takes a very long time. Therefore, we have made an attempt to perform this alignment using quantum algorithms (e.g., [5,6]). We discuss one such algorithm here. In the future, we plan to use our findings in research on classification and change in living organisms such as HIV, and to link them to the introduction of markers for observing changes in disease progression (see [16-22] for trials along this line). In this paper, we explain some of our trials by means of coding theory and entropic chaos degree.
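
    To illustrate why alignment becomes expensive as the number of sequences grows, the sketch below scores one pair of sequences with the classical Needleman-Wunsch dynamic programme. This is the standard classical technique, not the quantum algorithm the abstract refers to; the function name `nw_score` and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of sequences a and b (classical Needleman-Wunsch).

    Scoring values are illustrative assumptions, not taken from the abstract.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j, cb in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(nw_score("GATTACA", "GATTACA"))
    print(nw_score("ACGT", "AGT"))
```

    Each pair costs time proportional to the product of the two sequence lengths, and comparing every sequence with every other multiplies that by the number of pairs, which is the growth that motivates looking for faster approaches.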

  4. The Roots of Bioinformatics in Theoretical Biology

    PubMed Central

    Hogeweg, Paulien

    2011-01-01

    From the late 1980s onward, the term “bioinformatics” mostly has been used to refer to computational methods for comparative analysis of genome data. However, the term was originally more widely defined as the study of informatic processes in biotic systems. In this essay, I will trace this early history (from a personal point of view) and I will argue that the original meaning of the term is re-emerging. PMID:21483479

  5. A new structure for comparing surface passivation materials of GaAs solar cells

    NASA Technical Reports Server (NTRS)

    Desalvo, Gregory C.; Barnett, Allen M.

    1989-01-01

    The surface recombination velocity (S_rec) for bare GaAs is typically as high as 10^6 to 10^7 cm/sec, which dramatically lowers the efficiency of GaAs solar cells. Early attempts to circumvent this problem by making an ultra thin junction (x_j < 0.1 micron) proved unsuccessful when compared to lowering S_rec by surface passivation. Present day GaAs solar cells use a GaAlAs window layer to passivate the top surface. The advantages of GaAlAs in surface passivation are its high bandgap energy and lattice matching to GaAs. Although GaAlAs is successful in reducing the surface recombination velocity, it has other inherent problems of chemical instability (Al readily oxidizes) and ohmic contact formation. The search for new, more stable window layer materials requires a means to compare their surface passivation ability. Therefore, a device structure is needed to easily test the performance of different passivating candidates. Such a test device is described.

  6. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation.

    PubMed

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-06-20

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733
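
    The notion of a frameshift in an aligned exon can be made concrete with a small check. The sketch below is not CESAR and does not reproduce its realignment method; it only flags gap runs in a pairwise exon alignment whose length is not a multiple of three, which is what shifts the reading frame. The function name `frameshifting_gaps` and the use of '-' as the gap character are assumptions for illustration.

```python
import re


def frameshifting_gaps(ref_aln, qry_aln):
    """Return (start_column, length, kind) for every gap run that shifts the frame.

    ref_aln and qry_aln are two rows of a pairwise alignment of equal length,
    with '-' marking gaps (an assumed convention). A gap run in the reference
    row is an insertion in the query; a gap run in the query row is a deletion.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned rows must have the same length")
    shifts = []
    for kind, row in (("insertion", ref_aln), ("deletion", qry_aln)):
        for run in re.finditer(r"-+", row):
            length = run.end() - run.start()
            if length % 3 != 0:  # whole-codon indels keep the reading frame
                shifts.append((run.start(), length, kind))
    return sorted(shifts)


if __name__ == "__main__":
    print(frameshifting_gaps("ATGAAACCC", "ATGA-ACCC"))
    print(frameshifting_gaps("ATG---CCC", "ATGAAACCC"))
```

    A check of this kind only reports what the alignment shows. Deciding whether a reported frameshift is real or an alignment artefact requires realigning with the reading frame and splice sites taken into account, which is the problem the abstract describes.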

  7. Comparative assessment of students’ performance and perceptions on objective structured practical models in undergraduate pathology teaching

    PubMed Central

    Htwe, Than Than; Ismail, Sabaridah Binti; Low, Gary Kim Kuan

    2014-01-01

    INTRODUCTION Assessment is an important factor that drives student learning, as students tend to mainly focus on the material to be assessed. The current practice in teaching pathology extensively applies objective-structured practical examination for the assessment of students. As students will have to deal with real patients during clinical years, it is preferred that students learn and practise via potted specimens and slides instead of picture plates. This study aimed to assess the preferred assessment method of pathology practical exercises. METHODS This was a cross-sectional survey carried out in two consecutive batches of Phase 2 medical students. Student competency was assessed using both the traditional (TD) (i.e. use of potted specimens and slides) and picture plate (PP) methods. To compare the two assessment methods, we compared the mean scores obtained by the students and examined student perception of the two methods. RESULTS The mean scores obtained via the PP method were significantly higher than those obtained via the TD method for almost all the components tested. CONCLUSION We found that students performed significantly better (p < 0.05) when assessed using the PP method instead of the TD method. PP preparations might provide better visuals, thus aiding understanding, than the TD method. The findings of this study are valuable in identifying and improving our current teaching and assessment methods of medical students, in line with advancements in information technology. PMID:25273936

  8. [Comparative Study on the Molecular Structures and Spectral Properties of Ponceau 4R and Amaranth].

    PubMed

    Zhang, Yong; Chen, Guo-qing; Zhu, Chun; Hu, Yang-jun

    2015-11-01

    The Edinburgh FLS920P steady-instantaneous fluorescence spectrometer was applied to the detection of the absorption and emission spectra of ponceau 4R and amaranth, which are isomers of each other. The spectral parameters of the two were then compared. Next, density functional theory (DFT) and time-dependent density functional theory (TD-DFT) were used to optimize ponceau 4R and amaranth in the ground and excited states, respectively, in order to compare the differences in their configurations in the different states. On the basis of these results, the absorption and emission spectra of the two isomers were calculated with TD-DFT, and the polarized continuum model (PCM) was applied with the 6-311++G(d,p) basis set. The fluorescence mechanism and the relationships between the properties of the fluorescence spectra and the molecular geometry were analyzed. The results show that the structures of the two molecules are non-planar, the two naphthalene rings in each are not co-planar, and there is a hydrogen bond in amaranth. When the two isomers are in the ground state, the planarity of the naphthalene ring involved in the hydrogen bond mentioned above in amaranth is better than that of the corresponding part of ponceau 4R. The two isomers are nearly co-planar when they are in the excited state. The optimized molecular structures of ponceau 4R and amaranth are basically reasonable, since the spectra from the quantum chemistry calculations agree with the experiments. The planarity of the naphthalene rings on the right side in ponceau 4R is worse than that in amaranth, so the ponceau 4R molecule experiences more vibration and rotation from the excited to the ground state and loses more energy, which leads to a reduction of the energy available for emitting fluorescent photons. So ponceau 4R has a longer fluorescence emission wavelength than amaranth. 
In this paper, the molecular structure information of ponceau 4R and amaranth were obtained, and the differences

  10. STRUCTURES OF LOCAL GALAXIES COMPARED TO HIGH-REDSHIFT STAR-FORMING GALAXIES

    SciTech Connect

    Petty, Sara M.; De Mello, DuIlia F.; Gallagher, John S.; Gardner, Jonathan P.; Lotz, Jennifer M.; Matt Mountain, C.; Smith, Linda J.

    2009-08-15

    The rest-frame far-ultraviolet morphologies of eight nearby interacting and starburst galaxies (Arp 269, M 82, Mrk 8, NGC 520, NGC 1068, NGC 3079, NGC 3310, and NGC 7673) are compared with 54 galaxies at z ≈ 1.5 and 46 galaxies at z ≈ 4 observed in the Great Observatories Origins Deep Survey (GOODS) taken with the Advanced Camera for Surveys onboard the Hubble Space Telescope. The nearby sample is artificially redshifted to z ≈ 1.5 and 4 by applying luminosity and size scaling. We compare the simulated galaxy morphologies to real z ≈ 1.5 and 4 UV-bright galaxy morphologies. We calculate the Gini coefficient (G), the second-order moment of the brightest 20% of the galaxy's flux (M_20), and the Sersic index (n). We explore the use of nonparametric methods with two-dimensional profile fitting and find the combination of M_20 with n an efficient method to classify galaxies as having merger, exponential disk, or bulge-like morphologies. When classified according to G and M_20, 20/30% of real/simulated galaxies at z ≈ 1.5 and 37/12% at z ≈ 4 have bulge-like morphologies. The rest have merger-like or intermediate distributions. Alternatively, when classified according to the Sersic index, 70% of the z ≈ 1.5 and z ≈ 4 real galaxies are exponential disks or bulge-like with n > 0.8, and ≈ 30% of the real galaxies are classified as mergers. The artificially redshifted galaxies have n values with ≈ 35% bulge or exponential at z ≈ 1.5 and 4. Therefore, ≈ 20%-30% of Lyman-break galaxies have structures similar to local starburst mergers, and may be driven by similar processes. We assume merger-like or clumpy star-forming galaxies in the GOODS field have morphological structure with values n < 0.8 and M_20 > -1.7. We conclude that Mrk 8, NGC 3079, and NGC 7673 have structures similar to those of merger-like and clumpy star-forming galaxies observed at z ≈ 1.5 and 4.
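
    The abstract does not state how G is computed. As a hedged illustration, the sketch below uses a commonly used pixel-based form of the Gini coefficient, in which the absolute pixel fluxes are sorted and weighted by rank; whether the authors used exactly this estimator is an assumption, and the function name `gini` is hypothetical.

```python
def gini(fluxes):
    """Gini coefficient of a list of pixel fluxes (assumed rank-weighted form).

    G = sum_i (2i - n - 1) * |X_i| / (mean(|X|) * n * (n - 1)),
    with the |X_i| sorted in increasing order and i running from 1 to n.
    G is 0 when every pixel has the same flux and 1 when one pixel holds it all.
    """
    values = sorted(abs(f) for f in fluxes)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("total flux is zero")
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (mean * n * (n - 1))


if __name__ == "__main__":
    print(gini([1.0, 1.0, 1.0, 1.0]))  # evenly spread flux
    print(gini([0.0, 0.0, 0.0, 1.0]))  # all flux in a single pixel
```

    Because the statistic depends only on the sorted flux values and not on where the pixels lie, it needs no assumed galaxy centre or profile shape, which is what makes it a nonparametric measure.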

  11. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    PubMed

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M; Shan, Xueyan; Peterson, Daniel G; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M; Isberg, Sally R; Higgins, Damien P; Chong, Amanda Y; John, John St; Glenn, Travis C; Ray, David A; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs.

  12. Comparative Genome Analyses Reveal Distinct Structure in the Saltwater Crocodile MHC

    PubMed Central

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M.; Shan, Xueyan; Peterson, Daniel G.; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M.; Isberg, Sally R.; Higgins, Damien P.; Chong, Amanda Y.; John, John St; Glenn, Travis C.; Ray, David A.; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2–6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  13. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics.

    PubMed

    de Matos, Paula; Cham, Jennifer A; Cao, Hong; Alcántara, Rafael; Rowland, Francis; Lopez, Rodrigo; Steinbeck, Christoph

    2013-01-01

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users' requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, 'canvas sort' card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience. PMID:23514033

  14. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics

    PubMed Central

    2013-01-01

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users’ requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, ‘canvas sort’ card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience. PMID:23514033

  15. [Bioinformatics in Cancer Clinical Sequencing -- An Emerging Field of Cancer Personalized Medicine].

    PubMed

    Kato, Mamoru

    2016-04-01

    Thus far, bioinformatics has mostly been applied in basic science research. It was initially used to analyze protein sequences in unicellular organisms, aiding discoveries in basic biology. Following the completion of human genome sequencing, it has also facilitated numerous discoveries in basic medicine. Recently, several clinical applications of bioinformatics have been reported. Most relevantly, bioinformatics has been applied to clinical sequencing - an emerging field of personalized medicine, or precision medicine. In this review, I will introduce basic techniques of bioinformatics used in clinical sequencing, avoiding excessive technical details. I will also discuss future directions for data analysis using bioinformatics in the field of personalized medicine.

  16. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    NASA Astrophysics Data System (ADS)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  17. Bioinformatics for precision medicine in oncology: principles and application to the SHIVA clinical trial

    PubMed Central

    Servant, Nicolas; Roméjon, Julien; Gestraud, Pierre; La Rosa, Philippe; Lucotte, Georges; Lair, Séverine; Bernard, Virginie; Zeitouni, Bruno; Coffin, Fanny; Jules-Clément, Gérôme; Yvon, Florent; Lermine, Alban; Poullet, Patrick; Liva, Stéphane; Pook, Stuart; Popova, Tatiana; Barette, Camille; Prud’homme, François; Dick, Jean-Gabriel; Kamal, Maud; Le Tourneau, Christophe; Barillot, Emmanuel; Hupé, Philippe

    2014-01-01

    Precision medicine (PM) requires the delivery of individually adapted medical care based on the genetic characteristics of each patient and his/her tumor. The last decade witnessed the development of high-throughput technologies such as microarrays and next-generation sequencing which paved the way to PM in the field of oncology. While the cost of these technologies decreases, we are facing an exponential increase in the amount of data produced. Our ability to use this information in daily practice relies strongly on the availability of an efficient bioinformatics system that assists in the translation of knowledge from the bench towards molecular targeting and diagnosis. Clinical trials and routine diagnoses constitute different approaches, both requiring a strong bioinformatics environment capable of (i) warranting the integration and the traceability of data, (ii) ensuring the correct processing and analyses of genomic data, and (iii) applying well-defined and reproducible procedures for workflow management and decision-making. To address the issues, a seamless information system was developed at Institut Curie which facilitates the data integration and tracks in real-time the processing of individual samples. Moreover, computational pipelines were developed to identify reliably genomic alterations and mutations from the molecular profiles of each patient. After a rigorous quality control, a meaningful report is delivered to the clinicians and biologists for the therapeutic decision. The complete bioinformatics environment and the key points of its implementation are presented in the context of the SHIVA clinical trial, a multicentric randomized phase II trial comparing targeted therapy based on tumor molecular profiling versus conventional therapy in patients with refractory cancer. The numerous challenges faced in practice during the setting up and the conduct of this trial are discussed as an illustration of PM application. PMID:24910641

  18. Rise and Demise of Bioinformatics? Promise and Progress

    PubMed Central

    Ouzounis, Christos A.

    2012-01-01

    The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself. PMID:22570600

  19. A survey on evolutionary algorithm based hybrid intelligence in bioinformatics.

    PubMed

    Li, Shan; Kang, Liying; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for bioinformaticians to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, hybrid intelligent methods, which integrate several standard intelligent approaches, have become more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we introduce the applications of hybrid intelligent methods in bioinformatics, in particular those based on evolutionary algorithms. We focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks.

  20. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, and on-demand anytime and anywhere access to resources and applications, and thus it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data. PMID:25863787

  1. A review of estimation of distribution algorithms in bioinformatics

    PubMed Central

    Armañanzas, Rubén; Inza, Iñaki; Santana, Roberto; Saeys, Yvan; Flores, Jose Luis; Lozano, Jose Antonio; Peer, Yves Van de; Blanco, Rosa; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro

    2008-01-01

    Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. PMID:18822112
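
    The loop described in this abstract (sample candidate solutions from a probabilistic model, keep the promising ones, re-estimate the model from them) can be illustrated with a minimal sketch. The code below is not taken from the reviewed paper: it assumes the simplest EDA variant, a univariate model with one independent probability per bit, and a toy objective (counting ones, often called OneMax). The function name, parameters and defaults are all hypothetical.

```python
import random


def umda_onemax(n_bits=20, pop_size=60, n_select=20, generations=40, seed=0):
    """Toy univariate EDA on a bit-counting objective (illustrative only)."""
    rng = random.Random(seed)
    # Probabilistic model: one independent Bernoulli parameter per bit.
    probs = [0.5] * n_bits
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # 2. Keep the most promising solutions (largest number of ones).
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # 3. Re-estimate the model from the selected solutions only.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            # Clamp so no bit value becomes impossible to sample.
            probs[i] = min(max(freq, 0.05), 0.95)
    return best, probs


if __name__ == "__main__":
    best, probs = umda_onemax()
    print("best fitness:", sum(best))
    print("model:", [round(p, 2) for p in probs])
```

    Unlike a genetic algorithm, there is no crossover or mutation operator here; all variation comes from sampling the model. The EDA variants surveyed in the paper differ mainly in how rich that model is, and the independent-bits assumption used above is the simplest possible choice.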

  2. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, and on-demand anytime and anywhere access to resources and applications, and thus it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data.

  3. Meeting Review: 2002 O'Reilly Bioinformatics Technology Conference

    PubMed Central

    2002-01-01

    At the end of January I travelled to the States to speak at and attend the first O’Reilly Bioinformatics Technology Conference [14]. It was a large, well-organized and diverse meeting with an interesting history. Although the meeting was not a typical academic conference, its style will, I am sure, become more typical of meetings in both biological and computational sciences. Speakers at the event included prominent bioinformatics researchers such as Ewan Birney, Terry Gaasterland and Lincoln Stein; authors and leaders in the open source programming community like Damian Conway and Nat Torkington; and representatives from several publishing companies including the Nature Publishing Group, Current Science Group and the President of O’Reilly himself, Tim O’Reilly. There were presentations, tutorials, debates, quizzes and even a ‘jam session’ for musical bioinformaticists. PMID:18628852

  4. Bioinformatics tools for small genomes, such as hepatitis B virus.

    PubMed

    Bell, Trevor G; Kramvis, Anna

    2015-02-01

    DNA sequence analysis is undertaken in many biological research laboratories. The workflow consists of several steps involving the bioinformatic processing of biological data. We have developed a suite of web-based online bioinformatic tools to assist with processing, analysis and curation of DNA sequence data. Most of these tools are genome-agnostic, with two tools specifically designed for hepatitis B virus sequence data. Tools in the suite are able to process sequence data from Sanger sequencing, ultra-deep amplicon resequencing (pyrosequencing) and chromatograph (trace files), as appropriate. The tools are available online at no cost and are aimed at researchers without specialist technical computer knowledge. The tools can be accessed at http://hvdr.bioinf.wits.ac.za/SmallGenomeTools, and the source code is available online at https://github.com/DrTrevorBell/SmallGenomeTools. PMID:25690798

  5. Personalized medicine: challenges and opportunities for translational bioinformatics

    PubMed Central

    Overby, Casey Lynnette; Tarczy-Hornoch, Peter

    2013-01-01

    Personalized medicine can be defined broadly as a model of healthcare that is predictive, personalized, preventive and participatory. Two US President’s Council of Advisors on Science and Technology reports illustrate challenges in personalized medicine (in a 2008 report) and in the use of health information technology (in a 2010 report). Translational bioinformatics is a field that can help address these challenges and is defined by the American Medical Informatics Association as “the development of storage, analytic and interpretive methods to optimize the transformation of increasingly voluminous biomedical data into proactive, predictive, preventative and participatory health.” This article discusses barriers to implementing genomics applications and current progress toward overcoming barriers, describes lessons learned from early experiences of institutions engaged in personalized medicine and provides example areas for translational bioinformatics research inquiry. PMID:24039624

  6. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    PubMed Central

    Li, Shan; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for bioinformaticians to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, hybrid intelligent methods, which integrate several standard intelligent approaches, have become more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we introduce the applications of hybrid intelligent methods in bioinformatics, in particular those based on evolutionary algorithms. We focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks. PMID:24729969

  7. Comparative study of normal and branched alkane monolayer films adsorbed on a solid surface. I. Structure

    NASA Astrophysics Data System (ADS)

    Enevoldsen, A. D.; Hansen, F. Y.; Diama, A.; Criswell, L.; Taub, H.

    2007-03-01

    The structure of a monolayer film of the branched alkane squalane (C30H62) adsorbed on graphite has been studied by neutron diffraction and molecular dynamics (MD) simulations and compared with a similar study of the n-alkane tetracosane (n-C24H52). Both molecules have 24 carbon atoms along their backbone and squalane has, in addition, six methyl side groups. Upon adsorption, there are significant differences as well as similarities in the behavior of these molecular films. Both molecules form ordered structures at low temperatures; however, while the melting point of the two-dimensional (2D) tetracosane film is roughly the same as the bulk melting point, the surface strongly stabilizes the 2D squalane film such that its melting point is 91 K above its value in bulk. Therefore, squalane, like tetracosane, will be a poor lubricant in those nanoscale devices that require a fluid lubricant at room temperature. The neutron diffraction data show that the translational order in the squalane monolayer is significantly less than in the tetracosane monolayer. The authors' MD simulations suggest that this is caused by a distortion of the squalane molecules upon adsorption on the graphite surface. When the molecules are allowed to relax on the surface, they distort such that all six methyl groups point away from the surface. This results in a reduction in the monolayer's translational order characterized by a decrease in its coherence length and hence a broadening of the diffraction peaks. The MD simulations also show that the melting mechanism in the squalane monolayer is the same footprint reduction mechanism found in the tetracosane monolayer, where a chain melting drives the lattice melting.

  8. Comparative study of normal and branched alkane monolayer films adsorbed on a solid surface. I. Structure.

    PubMed

    Enevoldsen, A D; Hansen, F Y; Diama, A; Criswell, L; Taub, H

    2007-03-14

    The structure of a monolayer film of the branched alkane squalane (C30H62) adsorbed on graphite has been studied by neutron diffraction and molecular dynamics (MD) simulations and compared with a similar study of the n-alkane tetracosane (n-C24H52). Both molecules have 24 carbon atoms along their backbone and squalane has, in addition, six methyl side groups. Upon adsorption, there are significant differences as well as similarities in the behavior of these molecular films. Both molecules form ordered structures at low temperatures; however, while the melting point of the two-dimensional (2D) tetracosane film is roughly the same as the bulk melting point, the surface strongly stabilizes the 2D squalane film such that its melting point is 91 K above its value in bulk. Therefore, squalane, like tetracosane, will be a poor lubricant in those nanoscale devices that require a fluid lubricant at room temperature. The neutron diffraction data show that the translational order in the squalane monolayer is significantly less than in the tetracosane monolayer. The authors' MD simulations suggest that this is caused by a distortion of the squalane molecules upon adsorption on the graphite surface. When the molecules are allowed to relax on the surface, they distort such that all six methyl groups point away from the surface. This results in a reduction in the monolayer's translational order characterized by a decrease in its coherence length and hence a broadening of the diffraction peaks. The MD simulations also show that the melting mechanism in the squalane monolayer is the same footprint reduction mechanism found in the tetracosane monolayer, where a chain melting drives the lattice melting.

  9. Unconventional Thin-Film Thermoelectric Converters: Structure, Simulation, and Comparative Study

    NASA Astrophysics Data System (ADS)

    Haras, Maciej; Lacatena, Valeria; Monfray, Stéphane; Robillard, Jean-François; Skotnicki, Thomas; Dubois, Emmanuel

    2014-06-01

    Bi2Te3 and Sb2Te3 are the materials most widely used in thermoelectric generators (TEGs) operating near room temperature. These materials are, however, environmentally harmful, expensive, and incompatible with complementary metal-oxide semiconductor technology, in contrast to silicon (Si), germanium (Ge), or silicon-germanium (SiGe). Although the thermopower (S) and electrical conductivity (σ) of Si and Ge are high, their use in thermoelectricity is severely hindered by their high thermal conductivity (κ). By altering the phonon band structure of thin Si films by use of an artificial phononic pattern, a spectacular reduction of κ by two orders of magnitude has been demonstrated. To take full advantage of phonon band modification and scattering in thin films, a converter structure based on thin-film membranes is proposed for κ reduction. To consolidate the position of Si-based materials, coupled charge and heat-transport simulations have been conducted to demonstrate the potential of the materials for thermoelectric conversion compared with such widespread materials as Bi2Te3. The effect of contact resistance on generator performance has been carefully taken into consideration to reflect integration constraints at the TEG level. For a temperature difference ΔT = 30 K, the maximum electrical power density reaches approximately 6 W/cm2 for Si and Ge, and approximately 3 W/cm2 for Si0.7Ge0.3, values which are similar to those for Bi2Te3. Finally, it is emphasized that the proposed approach is compatible with conventional Si technology and naturally provides augmented mechanical flexibility that substantially widens the field of application of thermal harvesting.
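
    The trade-off among S, σ and κ described in this abstract is conventionally summarized by the dimensionless thermoelectric figure of merit. The abstract itself does not state it; the expression below is the standard textbook definition, with T (absolute temperature) introduced here as a symbol the abstract does not use.

```latex
ZT = \frac{S^{2}\,\sigma}{\kappa}\,T
```

    Read this way, a hundredfold reduction of κ at unchanged S and σ would raise ZT by the same factor, which is why lowering the thermal conductivity of Si is the lever the abstract focuses on.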

  10. A comparative study of MOEM pressure sensors using MZI, DC, and racetrack resonator IO structures

    NASA Astrophysics Data System (ADS)

    Selvarajan, A.; Pattnaik, Prasant Kumar; Badrinarayana, T.; Srinivas, T.

    2006-03-01

    In recent years micro-electro-mechanical system (MEMS) sensors have drawn considerable attention due to their attraction in terms of miniaturization, batch fabrication and ease of integration with the required electronics circuitry. Micro-opto-electro-mechanical (MOEM) devices and systems, based on the principles of integrated optics and micromachining technology on silicon, have immense potential for sensor applications. Employing optical techniques has important advantages such as functionality, large bandwidth and higher sensitivity. Pressure sensing is currently the most lucrative market for solid-state micro sensors. Pressure sensing using micromachined structures utilizes the changes induced in either the resistive or capacitive properties of the electro-mechanical structure by the impressed pressure. Integrated optical pressure sensors can utilize the changes to the amplitude, phase, refractive index profile, optical path length, or polarization of the lightwave by the external pressure. In this paper we compare the performance characteristics of three types of MOEM pressure sensors based on Mach-Zehnder Interferometer (MZI), Directional Coupler (DC) and racetrack resonator (RR) integrated optical geometries. The first two configurations measure the pressure changes through a change in optical intensity while the third one measures the same in terms of frequency or wavelength change. The analysis of each sensor has been carried out in terms of mechanical and optical models and their interrelationship through optomechanical coupling. For a typical diaphragm of size 2 mm × 1 mm × 20 μm, normalized pressure sensitivities of 18.35 μW/mW/kPa, 29.37 μW/mW/kPa and 2.26 pm/kPa have been obtained for the MZI, DC and RR devices, respectively. The noise performance of these devices is also presented.

  11. Integration of bioinformatics into an undergraduate biology curriculum and the impact on development of mathematical skills.

    PubMed

    Wightman, Bruce; Hark, Amy T

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative.

  12. Comparative electronic structure of a lanthanide and actinide diatomic oxide: Nd versus U

    NASA Astrophysics Data System (ADS)

    Krauss, M.; Stevens, W. J.

    2003-01-01

    Using a modified version of the Alchemy electronic structure code and relativistic pseudopotentials, the electronic structure of the ground and low-lying excited states of UO, NdO, and NdO⁺ has been calculated at the Hartree-Fock (HF) and multiconfiguration self-consistent field (MCSCF) levels of theory. Including results from an earlier study of UO⁺, this provides the information for a comparative analysis of a lanthanide and an actinide diatomic oxide. UO and NdO are both described formally as M⁺²O⁻² and the cations as M⁺³O⁻², but the HF and MCSCF calculations show that these systems are considerably less ionic due to large charge back-transfer in the π orbitals. The electronic states putatively arise from the ligand field (oxygen anion) perturbed f⁴, sf³, df³, sdf², or s²f² states of M⁺² and f³, sf², or df² states of M⁺³. Molecular orbital results show a substantial stabilization of the sf³ or s²f² configurations relative to the f⁴ or df³ configurations that are the even or odd parity ground states in the M⁺² free ion. The compact f and d orbitals are more destabilized by the anion field than the diffuse s orbital. The ground states of the neutral species are dominated by orbitals arising from the M⁺² sf³ term, and all the potential energy curves arising from this configuration are similar, which allows an estimate of the vibrational frequencies for UO and NdO of 862 cm⁻¹ and 836 cm⁻¹, respectively. For NdO⁺ and UO⁺ the excitation energies for the Ω states were calculated with a valence configuration interaction method using ab initio effective spin-orbit operators to couple the molecular orbital configurations. The results for NdO⁺ are very comparable with the results for UO⁺, and show the vibrational and electronic states to be interleaved.

  13. The natural product domain seeker NaPDoS: a phylogeny based bioinformatic tool to classify secondary metabolite gene diversity.

    PubMed

    Ziemert, Nadine; Podell, Sheila; Penn, Kevin; Badger, Jonathan H; Allen, Eric; Jensen, Paul R

    2012-01-01

    New bioinformatic tools are needed to analyze the growing volume of DNA sequence data. This is especially true in the case of secondary metabolite biosynthesis, where the highly repetitive nature of the associated genes creates major challenges for accurate sequence assembly and analysis. Here we introduce the web tool Natural Product Domain Seeker (NaPDoS), which provides an automated method to assess the secondary metabolite biosynthetic gene diversity and novelty of strains or environments. NaPDoS analyses are based on the phylogenetic relationships of sequence tags derived from polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) genes, respectively. The sequence tags correspond to PKS-derived ketosynthase domains and NRPS-derived condensation domains and are compared to an internal database of experimentally characterized biosynthetic genes. NaPDoS provides a rapid mechanism to extract and classify ketosynthase and condensation domains from PCR products, genomes, and metagenomic datasets. Close database matches provide a mechanism to infer the generalized structures of secondary metabolites while new phylogenetic lineages provide targets for the discovery of new enzyme architectures or mechanisms of secondary metabolite assembly. Here we outline the main features of NaPDoS and test it on four draft genome sequences and two metagenomic datasets. The results provide a rapid method to assess secondary metabolite biosynthetic gene diversity and richness in organisms or environments and a mechanism to identify genes that may be associated with uncharacterized biochemistry.

  14. The Natural Product Domain Seeker NaPDoS: A Phylogeny Based Bioinformatic Tool to Classify Secondary Metabolite Gene Diversity

    PubMed Central

    Ziemert, Nadine; Podell, Sheila; Penn, Kevin; Badger, Jonathan H.; Allen, Eric; Jensen, Paul R.

    2012-01-01

    New bioinformatic tools are needed to analyze the growing volume of DNA sequence data. This is especially true in the case of secondary metabolite biosynthesis, where the highly repetitive nature of the associated genes creates major challenges for accurate sequence assembly and analysis. Here we introduce the web tool Natural Product Domain Seeker (NaPDoS), which provides an automated method to assess the secondary metabolite biosynthetic gene diversity and novelty of strains or environments. NaPDoS analyses are based on the phylogenetic relationships of sequence tags derived from polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) genes, respectively. The sequence tags correspond to PKS-derived ketosynthase domains and NRPS-derived condensation domains and are compared to an internal database of experimentally characterized biosynthetic genes. NaPDoS provides a rapid mechanism to extract and classify ketosynthase and condensation domains from PCR products, genomes, and metagenomic datasets. Close database matches provide a mechanism to infer the generalized structures of secondary metabolites while new phylogenetic lineages provide targets for the discovery of new enzyme architectures or mechanisms of secondary metabolite assembly. Here we outline the main features of NaPDoS and test it on four draft genome sequences and two metagenomic datasets. The results provide a rapid method to assess secondary metabolite biosynthetic gene diversity and richness in organisms or environments and a mechanism to identify genes that may be associated with uncharacterized biochemistry. PMID:22479523

  15. A Quick Guide for Building a Successful Bioinformatics Community

    PubMed Central

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-01-01

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  16. A quick guide for building a successful bioinformatics community.

    PubMed

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-02-01

    "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  17. Comparative analysis of microbiome measurement platforms using latent variable structural equation modeling

    PubMed Central

    2013-01-01

    Background: Culture-independent phylogenetic analysis of 16S ribosomal RNA (rRNA) gene sequences has emerged as an incisive method of profiling bacteria present in a specimen. Currently, multiple techniques are available to enumerate the abundance of bacterial taxa in specimens, including Sanger sequencing, ‘next generation’ pyrosequencing, microarrays, quantitative PCR, and the rapidly emerging third- and fourth-generation sequencing methods. An efficient statistical tool is urgently needed for the following tasks: (1) to compare the agreement between these measurement platforms, (2) to select the most reliable platform(s), and (3) to combine different platforms of complementary strengths for a unified analysis. Results: We present latent variable structural equation modeling (SEM) as a novel statistical application for the comparative analysis of measurement platforms. The latent variable SEM model treats the true (unknown) relative frequency of a given bacterial taxon in a specimen as the latent (unobserved) variable, estimates the reliabilities of, and similarities between, different measurement platforms, and subsequently weights those measurements optimally for a unified analysis of the microbiome composition. The latent variable SEM contains the repeated measures ANOVA (both the univariate and the multivariate models) as special cases and, as a more general and realistic modeling approach, yields superior goodness-of-fit and more reliable analysis results, as demonstrated by a microbiome study of the human inflammatory bowel diseases. Conclusions: Given the rapid evolution of modern biotechnologies, the measurement platform comparison, selection and combination tasks are here to stay and to grow, and the latent variable SEM method is readily applicable to other biological settings beyond the microbiome study presented here. PMID:23497007
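
    The idea of treating the true relative frequency as a latent variable measured by several platforms can be written down as a generic one-factor measurement model. The equations below are a textbook sketch under stated assumptions, not the authors' specification: the symbols are introduced here, intercepts are omitted, and measurement errors are assumed uncorrelated across the J platforms.

```latex
% Platform j measures the latent true frequency \xi with loading \lambda_j and error \varepsilon_j
x_j = \lambda_j\,\xi + \varepsilon_j, \qquad
\operatorname{Var}(\xi) = \phi, \qquad
\operatorname{Var}(\varepsilon_j) = \theta_j, \qquad j = 1,\dots,J

% Reliability of platform j: the share of its variance explained by the latent variable
\rho_j = \frac{\lambda_j^{2}\,\phi}{\lambda_j^{2}\,\phi + \theta_j}

% Precision-weighted (generalized least squares) combination of the platforms
\hat{\xi} = \frac{\sum_{j=1}^{J} \lambda_j\,x_j/\theta_j}{\sum_{j=1}^{J} \lambda_j^{2}/\theta_j}
```

    Under these assumptions a platform with a small error variance θ_j receives a large weight, which is one simple reading of "weighting measurements optimally"; the model in the paper may be richer than this.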

  18. [A review on the bioinformatics pipelines for metagenomic research].

    PubMed

    Ye, Dan-Dan; Fan, Meng-Meng; Guan, Qiong; Chen, Hong-Ju; Ma, Zhan-Shan

    2012-12-01

    Metagenome, a term first coined by Handelsman in 1998 as "the genomes of the total microbiota found in nature", refers to sequence data directly sampled from the environment (which may be any habitat in which microbes live, such as the guts of humans and animals, milk, soil, lakes, glaciers, and oceans). Metagenomic technologies originated from environmental microbiology studies and their wide application has been greatly facilitated by next-generation high throughput sequencing technologies. As in genomics studies, the bottleneck of metagenomic research is how to effectively and efficiently analyze the gigantic amount of metagenomic sequence data using bioinformatics pipelines to obtain meaningful biological insights. In this article, we briefly review the state-of-the-art bioinformatics software tools in metagenomic research. Due to the differences between the metagenomic data obtained from whole genome sequencing (i.e., shotgun metagenomics) and amplicon sequencing (i.e., 16S-rRNA and gene-targeted metagenomics) methods, there are significant differences between the corresponding bioinformatics tools for these data; accordingly, we review the computational pipelines separately for these two types of data. PMID:23266976

  19. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    PubMed Central

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  20. Best practices in bioinformatics training for life scientists.

    PubMed

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301