Science.gov

Sample records for comparative structural bioinformatics

  1. A Comparative Structural Bioinformatics Analysis of the Insulin Receptor Family Ectodomain Based on Phylogenetic Information

    PubMed Central

    Rentería, Miguel E.; Gandhi, Neha S.; Vinuesa, Pablo; Helmerhorst, Erik; Mancera, Ricardo L.

    2008-01-01

    The insulin receptor (IR), the insulin-like growth factor 1 receptor (IGF1R) and the insulin receptor-related receptor (IRR) are covalently-linked homodimers made up of several structural domains. The molecular mechanism of ligand binding to the ectodomain of these receptors and the resulting activation of their tyrosine kinase domain is still not well understood. We have carried out an amino acid residue conservation analysis in order to reconstruct the phylogeny of the IR Family. We have confirmed the location of ligand binding site 1 of the IGF1R and IR. Importantly, we have also predicted the likely location of the insulin binding site 2 on the surface of the fibronectin type III domains of the IR. An evolutionarily conserved surface on the second leucine-rich domain that may interact with the ligand could not be detected. We suggest a possible mechanical trigger of the activation of the IR that involves a slight ‘twist’ rotation of the last two fibronectin type III domains in order to face the likely location of insulin. Finally, a strong selective pressure was found amongst the IRR orthologous sequences, suggesting that this orphan receptor has a yet unknown physiological role which may be conserved from amphibians to mammals. PMID:18989367
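
    As an illustration of what a residue conservation analysis can look like, the sketch below scores each column of a multiple sequence alignment by one minus its normalised Shannon entropy. This is an assumption made for illustration, not the scoring scheme used by the authors; the function name `column_conservation` and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_conservation(alignment):
    """Score each alignment column as 1 - normalised Shannon entropy.

    1.0 means the column is invariant; lower values mean more variation.
    Gaps ('-') are ignored. Assumed scoring scheme, for illustration only.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = math.log2(20)  # upper bound for the 20 standard amino acids
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column: no evidence of conservation
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores

# Hypothetical toy alignment, not real receptor sequences.
toy_alignment = ["ACDE", "ACDF", "ACEG", "AC-H"]
scores = column_conservation(toy_alignment)
```

    Invariant columns score 1.0, and columns where every sequence differs score lowest; mapping such scores onto a structure is one common way to look for conserved surface patches.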

  2. Structural bioinformatics of the human spliceosomal proteome

    PubMed Central

    Korneta, Iga; Magnus, Marcin; Bujnicki, Janusz M.

    2012-01-01

    In this work, we describe the results of a comprehensive structural bioinformatics analysis of the spliceosomal proteome. We used fold recognition analysis to complement prior data on the ordered domains of 252 human splicing proteins. Examples of newly identified domains include a PWI domain in the U5 snRNP protein 200K (hBrr2, residues 258–338), while examples of previously known domains with a newly determined fold include the DUF1115 domain of the U4/U6 di-snRNP protein 90K (hPrp3, residues 540–683). We also established a non-redundant set of experimental models of spliceosomal proteins, as well as constructed in silico models for regions without an experimental structure. The combined set of structural models is available for download. Altogether, over 90% of the ordered regions of the spliceosomal proteome can be represented structurally with a high degree of confidence. We analyzed the reduced spliceosomal proteome of the intron-poor organism Giardia lamblia, and as a result, we proposed a candidate set of ordered structural regions necessary for a functional spliceosome. The results of this work will aid experimental and structural analyses of the spliceosomal proteins and complexes, and can serve as a starting point for multiscale modeling of the structure of the entire spliceosome. PMID:22573172

  3. NMR structure improvement: A structural bioinformatics & visualization approach

    NASA Astrophysics Data System (ADS)

    Block, Jeremy N.

    The overall goal of this project is to enhance the physical accuracy of individual models in macromolecular NMR (Nuclear Magnetic Resonance) structures and the realism of variation within NMR ensembles of models, while improving agreement with the experimental data. A secondary overall goal is to combine synergistically the best aspects of NMR and crystallographic methodologies to better illuminate the underlying joint molecular reality. This is accomplished by using the powerful method of all-atom contact analysis (describing detailed sterics between atoms, including hydrogens); new graphical representations and interactive tools in 3D and virtual reality; and structural bioinformatics approaches to the expanded and enhanced data now available. The resulting better descriptions of macromolecular structure and its dynamic variation enhance the effectiveness of the many biomedical applications that depend on detailed molecular structure, such as mutational analysis, homology modeling, molecular simulations, protein design, and drug design.

  4. Bioinformatic Analysis of Toll-Like Receptor Sequences and Structures.

    PubMed

    Monie, Tom P; Gay, Nicholas J; Gangloff, Monique

    2016-01-01

    Continual advancements in computing power and sophistication, coupled with rapid increases in protein sequence and structural information, have made bioinformatic tools an invaluable resource for the molecular and structural biologist. With the degree of sequence information continuing to expand at an almost exponential rate, it is essential that scientists today have a basic understanding of how to utilise, manipulate and analyse this information for the benefit of their own experiments. In the context of Toll/interleukin-1 receptor (TIR) domain-containing proteins, we describe here a series of the more common and user-friendly bioinformatic tools available as Internet-based resources. These will enable the identification and alignment of protein sequences; the identification of functional motifs; the characterisation of protein secondary structure; the identification of protein structural folds and distantly homologous proteins; and the validation of the structural geometry of modelled protein structures. PMID:26803620

  5. Teaching Structural Bioinformatics at the Undergraduate Level

    ERIC Educational Resources Information Center

    Centeno, Nuria B.; Villa-Freixa, Jordi; Oliva, Baldomero

    2003-01-01

    Understanding the basic principles of structural biology is becoming a major subject of study in most undergraduate level programs in biology. In the genomic and proteomic age, it is becoming indispensable for biology students to master concepts related to the sequence and structure of proteins in order to develop skills that may be useful in a…

  6. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation of the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to a reference annotation.

  7. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    ERIC Educational Resources Information Center

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  8. Comparative modeling of proteins: a method for engaging students' interest in bioinformatics tools.

    PubMed

    Badotti, Fernanda; Barbosa, Alan Sales; Reis, André Luiz Martins; do Valle, Italo Faria; Ambrósio, Lara; Bitar, Mainá

    2014-01-01

    The huge increase in data being produced in the genomic era has produced a need to incorporate computers into the research process. Sequence generation, its subsequent storage, interpretation, and analysis are now entirely computer-dependent tasks. Universities from all over the world have been challenged to seek a way of encouraging students to acquire computational and bioinformatics skills from the undergraduate level onward in order to understand biological processes. The aim of this article is to report the experience of awakening students' interest in bioinformatics tools during a course focused on comparative modeling of proteins. The authors start by giving a full description of the course environmental context and students' backgrounds. Then they detail each class and present a general overview of the protein modeling protocol. The positive and negative aspects of the course are also reported, and some of the results generated in class and in projects outside the classroom are discussed. In the last section of the article, general perspectives about the course from students' point of view are given. This work can serve as a guide for professors who teach subjects for which bioinformatics tools are useful and for universities that plan to incorporate bioinformatics into the curriculum. PMID:24167006

  9. Abstractions, algorithms and data structures for structural bioinformatics in PyCogent

    PubMed Central

    Cieślik, Marcin; Derewenda, Zygmunt S.; Mura, Cameron

    2011-01-01

    To facilitate flexible and efficient structural bioinformatics analyses, new functionality for three-dimensional structure processing and analysis has been introduced into PyCogent – a popular feature-rich framework for sequence-based bioinformatics, but one which has lacked equally powerful tools for handling structural/coordinate-based data. Extensible Python modules have been developed, which provide object-oriented abstractions (based on a hierarchical representation of macromolecules), efficient data structures (e.g. kD-trees), fast implementations of common algorithms (e.g. surface-area calculations), read/write support for Protein Data Bank-related file formats and wrappers for external command-line applications (e.g. Stride). Integration of this code into PyCogent is symbiotic, allowing sequence-based work to benefit from structure-derived data and, reciprocally, enabling structural studies to leverage PyCogent’s versatile tools for phylogenetic and evolutionary analyses. PMID:22479120

  10. Human, vector and parasite Hsp90 proteins: A comparative bioinformatics analysis

    PubMed Central

    Faya, Ngonidzashe; Penkler, David L.; Tastan Bishop, Özlem

    2015-01-01

    The treatment of protozoan parasitic diseases is challenging, and thus identification and analysis of new drug targets is important. Parasites survive within host organisms, and some need intermediate hosts to complete their life cycle. Changing host environment puts stress on parasites, and often adaptation is accompanied by the expression of large amounts of heat shock proteins (Hsps). Among Hsps, Hsp90 proteins play an important role in stress environments. Yet, there has been little computational research on Hsp90 proteins to analyze them comparatively as potential parasitic drug targets. Here, an attempt was made to gain detailed insights into the differences between host, vector and parasitic Hsp90 proteins by large-scale bioinformatics analysis. A total of 104 Hsp90 sequences were divided into three groups based on their cellular localizations; namely cytosolic, mitochondrial and endoplasmic reticulum (ER). Further, the parasitic proteins were divided according to the type of parasite (protozoa, helminth and ectoparasite). Primary sequence analysis, phylogenetic tree calculations, motif analysis and physicochemical properties of Hsp90 proteins suggested that despite the overall structural conservation of these proteins, parasitic Hsp90 proteins have unique features which differentiate them from human ones, thus encouraging the idea that protozoan Hsp90 proteins should be further analyzed as potential drug targets. PMID:26793431

  11. Computer programming and biomolecular structure studies: A step beyond internet bioinformatics.

    PubMed

    Likić, Vladimir A

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled Biomolecular Structure and Bioinformatics. Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics approach that relies on access to the Internet and biological databases. This was an ambitious approach considering that the students mostly had a biological background. There were also time constraints of eight lectures in total and two accompanying practical sessions. The main challenge was that students had to be introduced to computer programming from a beginner level and in a short time provided with enough knowledge to independently solve a simple bioinformatics problem. This was accomplished with a problem directly relevant to the rest of the subject, concerned with the structure-function relationships and experimental techniques for the determination of macromolecular structure. PMID:21638623

  12. NETTAB 2014: From high-throughput structural bioinformatics to integrative systems biology.

    PubMed

    Romano, Paolo; Cordero, Francesca

    2016-01-01

    The fourteenth NETTAB workshop, NETTAB 2014, was devoted to a range of disciplines going from structural bioinformatics, to proteomics and to integrative systems biology. The topics of the workshop were centred around bioinformatics methods, tools, applications, and perspectives for models, standards and management of high-throughput biological data, structural bioinformatics, functional proteomics, mass spectrometry, drug discovery, and systems biology. 43 scientific contributions were presented at NETTAB 2014, including keynote, special guest and tutorial talks, oral communications, and posters. Full papers from some of the best contributions presented at the workshop were later submitted to a special Call for this Supplement. Here, we provide an overview of the workshop and introduce manuscripts that have been accepted for publication in this Supplement. PMID:26960985

  13. Meet me halfway: when genomics meets structural bioinformatics.

    PubMed

    Gong, Sungsam; Worth, Catherine L; Cheng, Tammy M K; Blundell, Tom L

    2011-06-01

    The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platforms has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases, and it is even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype-phenotype relationships can be better understood. In this review, we bring this 'bottom-up' approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function. PMID:21350909

  14. Comparing bioinformatic gene expression profiling methods: microarray and RNA-Seq.

    PubMed

    Mantione, Kirk J; Kream, Richard M; Kuzelova, Hana; Ptacek, Radek; Raboch, Jiri; Samuel, Joshua M; Stefano, George B

    2014-01-01

    Understanding the control of gene expression is critical for our understanding of the relationship between genotype and phenotype. The need for reliable assessment of transcript abundance in biological samples has driven scientists to develop novel technologies such as DNA microarray and RNA-Seq to meet this demand. This review focuses on comparing the two most useful methods for whole transcriptome gene expression profiling. Microarrays are reliable and more cost effective than RNA-Seq for gene expression profiling in model organisms. RNA-Seq will eventually be used more routinely than microarray, but right now the techniques can be complementary to each other. Microarrays will not become obsolete but might be relegated to only a few uses. RNA-Seq clearly has a bright future in bioinformatic data collection. PMID:25149683

  15. AWSEM-MD: Protein Structure Prediction Using Coarse-grained Physical Potentials and Bioinformatically Based Local Structure Biasing

    PubMed Central

    Davtyan, Aram; Schafer, Nicholas P.; Zheng, Weihua; Clementi, Cecilia; Wolynes, Peter G.; Papoian, Garegin A.

    2012-01-01

    The Associative memory, Water mediated, Structure and Energy Model (AWSEM) is a coarse-grained protein force field. AWSEM contains physically motivated terms, such as hydrogen bonding, as well as a bioinformatically based local structure biasing term, which efficiently takes into account many-body effects that are modulated by the local sequence. When combined with appropriate local or global alignments to choose memories, AWSEM can be used to perform de novo protein structure prediction. Herein we present structure prediction results for a particular choice of local sequence alignment method based on short residue sequences called fragments. We demonstrate the model’s structure prediction capabilities for three levels of global homology between the target sequence and those proteins used for local structure biasing, all of which assume that the structure of the target sequence is not known. When there are no homologs in the database of structures used for local structure biasing, AWSEM calculations produce structural predictions that are somewhat improved compared with prior works using related approaches. The inclusion of a small number of structures from homologous sequences improves structure prediction only marginally but when the fragment search is restricted to only homologous sequences, AWSEM can perform high resolution structure prediction and can be used for kinetics and dynamics studies. PMID:22545654

  16. Bioinformatics and variability in drug response: a protein structural perspective

    PubMed Central

    Lahti, Jennifer L.; Tang, Grace W.; Capriotti, Emidio; Liu, Tianyun; Altman, Russ B.

    2012-01-01

    Marketed drugs frequently perform worse in clinical practice than in the clinical trials on which their approval is based. Many therapeutic compounds are ineffective for a large subpopulation of patients to whom they are prescribed; worse, a significant fraction of patients experience adverse effects more severe than anticipated. The unacceptable risk–benefit profile for many drugs mandates a paradigm shift towards personalized medicine. However, prior to adoption of patient-specific approaches, it is useful to understand the molecular details underlying variable drug response among diverse patient populations. Over the past decade, progress in structural genomics led to an explosion of available three-dimensional structures of drug target proteins while efforts in pharmacogenetics offered insights into polymorphisms correlated with differential therapeutic outcomes. Together these advances provide the opportunity to examine how altered protein structures arising from genetic differences affect protein–drug interactions and, ultimately, drug response. In this review, we first summarize structural characteristics of protein targets and common mechanisms of drug interactions. Next, we describe the impact of coding mutations on protein structures and drug response. Finally, we highlight tools for analysing protein structures and protein–drug interactions and discuss their application for understanding altered drug responses associated with protein structural variants. PMID:22552919

  17. Comparative Bioinformatics Analyses and Profiling of Lysosome-Related Organelle Proteomes

    PubMed Central

    Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy

    2007-01-01

    Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for 7 lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles. PMID:17375895

  18. Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes

    NASA Astrophysics Data System (ADS)

    Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy

    2007-01-01

    Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.

  19. Structural bioinformatics study of PNP from Schistosoma mansoni.

    PubMed

    da Silveira, Nelson José Freitas; Uchôa, Hugo Brandão; Canduri, Fernanda; Pereira, José Henrique; Camera, João Carlos; Basso, Luiz Augusto; Palma, Mário Sergio; Santos, Diógenes Santiago; de Azevedo, Walter Filgueira

    2004-09-10

    The parasite Schistosoma mansoni lacks the de novo pathway for purine biosynthesis and depends on salvage pathways for its purine requirements. Schistosomiasis is endemic in 76 countries and territories and, amongst the parasitic diseases, ranks second after malaria in terms of social and economic impact and public health importance. Purine nucleoside phosphorylase (PNP) is an attractive target for drug design and has been subjected to extensive structure-based design. The atomic coordinates of the complex of human PNP with inosine were used as a template for starting the modeling of PNP from S. mansoni complexed with inosine. Here we describe the model for the complex SmPNP-inosine and correlate the structure with differences in the affinity for inosine presented by human and S. mansoni PNPs. PMID:15313179

  20. Comparative metagenomic analysis of human gut microbiome composition using two different bioinformatic pipelines.

    PubMed

    D'Argenio, Valeria; Casaburi, Giorgio; Precone, Vincenza; Salvatore, Francesco

    2014-01-01

    Technological advances in next-generation sequencing-based approaches have greatly impacted the analysis of microbial community composition. In particular, 16S rRNA-based methods have been widely used to analyze the whole set of bacteria present in a target environment. As a consequence, several specific bioinformatic pipelines have been developed to manage these data. MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) and Quantitative Insights Into Microbial Ecology (QIIME) are two freely available tools for metagenomic analyses that have been used in a wide range of studies. Here, we report the comparative analysis of the same dataset with both QIIME and MG-RAST in order to evaluate their accuracy in taxonomic assignment and in diversity analysis. We found that taxonomic assignment was more accurate with QIIME which, at family level, assigned a significantly higher number of reads. Thus, QIIME generated a more accurate BIOM file, which in turn improved the diversity analysis output. Finally, although informatics skills are needed to install QIIME, it offers a wide range of metrics that are useful for downstream applications and, not less important, it is not dependent on server times. PMID:24719854

  1. Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery

    PubMed Central

    Blundell, Tom L; Sibanda, Bancinyane L; Montalvão, Rinaldo Wander; Brewerton, Suzanne; Chelliah, Vijayalakshmi; Worth, Catherine L; Harmer, Nicholas J; Davies, Owen; Burke, David

    2006-01-01

    Impressive progress in genome sequencing, protein expression and high-throughput crystallography and NMR has radically transformed the opportunities to use protein three-dimensional structures to accelerate drug discovery, but the quantity and complexity of the data have ensured a central place for informatics. Structural biology and bioinformatics have assisted in lead optimization and target identification where they have well established roles; they can now contribute to lead discovery, exploiting high-throughput methods of structure determination that provide powerful approaches to screening of fragment binding. PMID:16524830

  2. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    ERIC Educational Resources Information Center

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  3. Sequential and Structural Aspects of Antifungal Peptides from Animals, Bacteria and Fungi Based on Bioinformatics Tools.

    PubMed

    Neelabh; Singh, Karuna; Rani, Jyoti

    2016-06-01

    Emerging drug-resistant varieties and hyper-virulent strains of microorganisms have compelled the scientific fraternity to develop more potent and less harmful therapeutics. Antimicrobial peptides could be one such class of therapeutics. This review is an attempt to explore antifungal peptides naturally produced by prokaryotes as well as eukaryotes. They are components of the innate immune system, providing the first line of defence against microbial attacks, especially in eukaryotes. The present article concentrates on types, structures, sources and mode of action of gene-encoded antifungal peptides such as mammalian defensins, protegrins, tritrpticins, histatins, lactoferricins, antifungal peptides derived from birds, amphibians, insects, fungi, bacteria and their synthetic analogues such as pexiganan, omiganan, echinocandins and Novexatin. In silico drug designing, a major revolution in the area of therapeutics, facilitates drug development by exploiting different bioinformatics tools. With this view, bioinformatics tools were used to visualize the structural details of antifungal peptides and to predict their level of similarity. Current practices and recent developments in this area have also been discussed briefly. PMID:27060002

  4. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    PubMed Central

    Ragothaman, Anjani; Feinstein, Wei; Jha, Shantenu; Kim, Joohyun

    2014-01-01

    While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the structural information they predict could uncover the underlying function. However, threading tools are generally compute-intensive, and even small genomes such as those of prokaryotes typically contain many thousands of protein sequences, which prohibits the application of threading as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation; it is particularly amenable to small genomes such as those of prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure. PMID:24995285

  5. Comparative bioinformatics, temporal and spatial expression analyses of Ixodes scapularis organic anion transporting polypeptides

    PubMed Central

    Radulović, Željko; Porter, Lindsay M.; Kim, Tae K.; Mulenga, Albert

    2015-01-01

    Organic anion-transporting polypeptides (Oatps) are an integral part of the detoxification mechanism in vertebrates and invertebrates. These cell surface proteins are involved in mediating the sodium-independent uptake and/or distribution of a broad array of organic amphipathic compounds and xenobiotic drugs. This study describes bioinformatics and biological characterization of 9 Oatp sequences in the Ixodes scapularis genome. These sequences have been annotated on the basis of 12 transmembrane domains, consensus motif D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, and 11 conserved cysteine amino acid residues in the large extracellular loop 5 that characterize the Oatp superfamily. Ixodes scapularis Oatps may regulate non-redundant cross-tick species conserved functions in that they did not cluster as a monolithic group on the phylogeny tree and that they have orthologs in other ticks. Phylogeny clustering patterns also suggest that some tick Oatp sequences transport substrates that are similar to those of body louse, mosquito, eye worm, and filarial worm Oatps. Semi-quantitative RT-PCR analysis demonstrated that all 9 I. scapularis Oatp sequences were expressed during tick feeding. Ixodes scapularis Oatp genes potentially regulate functions during early and/or late-stage tick feeding as revealed by normalized mRNA profiles. Normalized transcript abundance indicates that I. scapularis Oatp genes are strongly expressed in unfed ticks during the first 24 h of feeding and/or at the end of the tick feeding process. Except for 2 I. scapularis Oatps, which were expressed in the salivary glands and ovaries, all other genes were expressed in all tested organs, suggesting the significance of I. scapularis Oatps in maintaining tick homeostasis. Different I. scapularis Oatp mRNA expression patterns were detected and discussed with reference to different physiological states of unfed and feeding ticks. PMID:24582512
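
    The consensus motif quoted in this abstract, D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, can be read as a pattern in which X is any residue and a parenthesised pair means either residue is accepted. A minimal sketch of scanning a protein sequence for it follows; the regular-expression translation, the function name `find_oatp_motif` and the toy sequence are assumptions for illustration, not the annotation procedure used in the study.

```python
import re

# Consensus motif from the abstract: D-X-RW-(I,V)-GAWW-X-G-(F,L)-L
# X = any residue letter; (I,V) and (F,L) = either residue at that position.
OATP_MOTIF = re.compile(r"D[A-Z]RW[IV]GAWW[A-Z]G[FL]L")

def find_oatp_motif(sequence):
    """Return (start_index, matched_text) for each motif hit in a sequence."""
    return [(m.start(), m.group()) for m in OATP_MOTIF.finditer(sequence.upper())]

# Hypothetical toy sequence made up for illustration (not a real Oatp).
toy_sequence = "MKTLDARWIGAWWLGFLAAG"
hits = find_oatp_motif(toy_sequence)
```

    Start indices are zero-based. A strict pattern like this only flags exact matches to the consensus, so divergent family members would need a more tolerant method such as a profile search.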

  6. An Introductory Bioinformatics Exercise to Reinforce Gene Structure and Expression and Analyze the Relationship between Gene and Protein Sequences

    ERIC Educational Resources Information Center

    Almeida, Craig A.; Tardiff, Daniel F.; De Luca, Jane P.

    2004-01-01

    We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces the understanding of the structure of a gene and the principles and events involved in its expression. In addition, the activity illustrates the severe effect mutations in a gene sequence can have on the protein product.…

  7. Determination of Lipid-Protein Interactions in Lung Surfactants Using Computer Simulations and Structural Bioinformatics.

    NASA Astrophysics Data System (ADS)

    Kaznessis, Yiannis

    2001-06-01

    Proteins are the primary components of the networks that conduct the flows of mass, energy and information in living organisms. The discovery of the principles of protein structure and function allows the development of design rules for biological activities. The microscopic nature of the operating mechanisms of protein activity, and the vast complexity of the networks of interaction call for the employment of powerful computational methodologies that can decipher the physicochemical and evolutionary principles underlying protein structure and function. An example will be presented that reflects the strength of computational approaches. Atomistic molecular dynamics simulations and structural bioinformatics tools are employed to investigate the interactions between the first 25 N-terminal residues of surfactant protein B (SP-B 1-25) and the lipid components of the lung surfactant (LS). An understanding of the molecular level interactions between the LS components is essential for the establishment of design rules for the development of synthetic LS and the treatment of the neonatal respiratory distress syndrome, which results from deficiency or inactivation of LS.

  8. Edge Bioinformatics

    Energy Science and Technology Software Center (ESTSC)

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  9. Edge Bioinformatics

    SciTech Connect

    Lo, Chien-Chi

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  10. Structural, Bioinformatic, and In Vivo Analyses of Two Treponema pallidum Lipoproteins Reveal a Unique TRAP Transporter

    SciTech Connect

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-05-25

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP), and tp0958 (the symporter), are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of 'tetratricopeptide repeat' (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPATs (TPR-protein-associated TRAP-Ts) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s).

  11. Structural, bioinformatic, and in vivo analyses of two Treponema pallidum lipoproteins reveal a unique TRAP transporter

    PubMed Central

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-01-01

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP) and tp0958 (the symporter) are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of “tetratricopeptide repeat” (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPR-protein associated TRAP transporters (TPATs) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s). PMID:22306465

  12. Structural and Phylogenetic Analysis of Laccases from Trichoderma: A Bioinformatic Approach

    PubMed Central

    Cázares-García, Saila Viridiana; Vázquez-Garcidueñas, Ma. Soledad; Vázquez-Marrufo, Gerardo

    2013-01-01

    The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species, the genomes of which are publicly available, was analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the sequence of amino acids of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential. PMID:23383142

  13. A Bioinformatics Approach to the Structure, Function, and Evolution of the Nucleoprotein of the Order Mononegavirales

    PubMed Central

    Cleveland, Sean B.; Davies, John; McClure, Marcella A.

    2011-01-01

    The goal of this bioinformatic study is to investigate sequence conservation in relation to evolutionary function/structure of the nucleoprotein of the order Mononegavirales. In the combined analysis of 63 representative nucleoprotein (N) sequences from four viral families (Bornaviridae, Filoviridae, Rhabdoviridae, and Paramyxoviridae) we predict the regions of protein disorder, intra-residue contact and co-evolving residues. Correlations between location and conservation of predicted regions illustrate a strong division between families while highlighting conservation within individual families. These results suggest the conserved regions among the nucleoproteins, specifically within Rhabdoviridae and Paramyxoviridae, but also generally among all members of the order, reflect an evolutionary advantage in maintaining these sites for the viral nucleoprotein as part of the transcription/replication machinery. Results indicate conservation for disorder in the C-terminus region of the representative proteins that is important for interacting with the phosphoprotein and the large subunit polymerase during transcription and replication. Additionally, the C-terminus region of the protein preceding the disordered region is predicted to be important for interacting with the encapsidated genome. Portions of the N-terminus are responsible for N∶N stability and interactions identified by the presence or lack of co-evolving intra-protein contact predictions. The validation of these prediction results by current structural information illustrates the benefits of the Disorder, Intra-residue contact and Compensatory mutation Correlator (DisICC) pipeline as a method for quickly characterizing proteins and providing the most likely residues and regions necessary to target for disruption in viruses that have little structural information available. PMID:21559282

  14. Structural Bioinformatics-Based Prediction of Exceptional Selectivity of p38 MAP Kinase Inhibitor PH-797804

    SciTech Connect

    Xing, Li; Shieh, Huey S.; Selness, Shaun R.; Devraj, Rajesh V.; Walker, John K.; Devadas, Balekudru; Hope, Heidi R.; Compton, Robert P.; Schindler, John F.; Hirsch, Jeffrey L.; Benson, Alan G.; Kurumbail, Ravi G.; Stegeman, Roderick A.; Williams, Jennifer M.; Broadus, Richard M.; Walden, Zara; Monahan, Joseph B.; Pfizer

    2009-07-24

    PH-797804 is a diarylpyridinone inhibitor of p38α mitogen-activated protein (MAP) kinase derived from a racemic mixture as the more potent atropisomer (aS), first proposed by molecular modeling and subsequently confirmed by experiments. On the basis of structural comparison with a different biaryl pyrazole template and supported by dozens of high-resolution crystal structures of p38α inhibitor complexes, PH-797804 is predicted to possess a high level of specificity across the broad human kinase genome. We used a structural bioinformatics approach to identify two selectivity elements encoded by the TXXXG sequence motif on the p38α kinase hinge: (i) Thr106 that serves as the gatekeeper to the buried hydrophobic pocket occupied by 2,4-difluorophenyl of PH-797804 and (ii) the bidentate hydrogen bonds formed by the pyridinone moiety with the kinase hinge requiring an induced 180° rotation of the Met109-Gly110 peptide bond. The peptide flip occurs in p38α kinase due to the critical glycine residue marked by its conformational flexibility. Kinome-wide sequence mining revealed rare presentation of the selectivity motif. Corroboratively, PH-797804 exhibited exceptionally high specificity against MAP kinases and the related kinases. No cross-reactivity was observed in large panels of kinase screens (selectivity ratio of >500-fold). In cellular assays, PH-797804 demonstrated superior potency and selectivity consistent with the biochemical measurements. PH-797804 has met safety criteria in human phase I studies and is under clinical development for several inflammatory conditions. Understanding the rationale for selectivity at the molecular level helps elucidate the biological function and design of specific p38α kinase inhibitors.

  15. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics.

    PubMed

    Veneman, Wouter J; de Sonneville, Jan; van der Kolk, Kees-Jan; Ordas, Anita; Al-Ars, Zaid; Meijer, Annemarie H; Spaink, Herman P

    2015-03-01

    We present an RNA deep sequencing (RNAseq) analysis of a comparison of the transcriptome responses to infection of zebrafish larvae with Staphylococcus epidermidis and Mycobacterium marinum bacteria. We show how our developed GeneTiles software can improve RNAseq analysis approaches by more confidently identifying a large set of markers upon infection with these bacteria. For analysis of RNAseq data currently, software programs such as Bowtie2 and Samtools are indispensable. However, these programs that are designed for a LINUX environment require some dedicated programming skills and have no options for visualisation of the resulting mapped sequence reads. Especially with large data sets, this makes the analysis time consuming and difficult for non-expert users. We have applied the GeneTiles software to the analysis of previously published and newly obtained RNAseq datasets of our zebrafish infection model, and we have shown the applicability of this approach also to published RNAseq datasets of other organisms by comparing our data with a published mammalian infection study. In addition, we have implemented the DEXSeq module in the GeneTiles software to identify genes, such as glucagon A, that are differentially spliced under infection conditions. In the analysis of our RNAseq data, this has led to the possibility to improve the size of data sets that could be efficiently compared without using problem-dedicated programs, leading to a quick identification of marker sets. Therefore, this approach will also be highly useful for transcriptome analyses of other organisms for which well-characterised genomes are available. PMID:25503064

  16. Structural templates for comparative protein docking

    PubMed Central

    Anishchenko, Ivan; Kundrotas, Petras J.; Tuzikov, Alexander V.; Vakser, Ilya A.

    2014-01-01

    Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, non-redundant library of templates containing 4,950 full structures of binary complexes and 5,936 protein-protein interfaces extracted from the full structures at 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu. PMID:25488330
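
    Extracting an interface from a binary complex at a distance cut-off can be sketched in a few lines. This is only an illustration under stated assumptions, not the Dockground procedure: the abstract gives the 12 Å cut-off but not the exact criterion, so the sketch assumes that a residue belongs to the interface when any of its atoms lies within the cut-off of any atom of the partner chain. The function name, the dictionary layout and the toy coordinates are hypothetical.

```python
import math

CUTOFF = 12.0  # Å; the distance cut-off quoted in the abstract


def interface_residues(chain_a, chain_b, cutoff=CUTOFF):
    """Return the interface residue ids of each chain.

    chain_a, chain_b: dict mapping a residue id to a list of (x, y, z)
    atom coordinates in Å (a hypothetical layout chosen for this sketch).
    Assumption: a residue is in the interface if any of its atoms is
    within `cutoff` of any atom of the partner chain.
    """
    iface_a, iface_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                iface_a.add(res_a)
                iface_b.add(res_b)
    return iface_a, iface_b


# Toy coordinates for illustration only; not taken from any PDB entry.
chain_a = {1: [(0.0, 0.0, 0.0)], 2: [(30.0, 0.0, 0.0)]}
chain_b = {101: [(10.0, 0.0, 0.0)], 102: [(50.0, 0.0, 0.0)]}
print(interface_residues(chain_a, chain_b))
```

    The all-against-all loop is quadratic in the number of atoms; for a library of thousands of complexes a spatial index such as a k-d tree would be the more practical choice.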

  17. Bioinformatics investigation of therapeutic mechanisms of Xuesaitong capsule treating ischemic cerebrovascular rat model with comparative transcriptome analysis

    PubMed Central

    Liao, Jiangquan; Wei, Benjun; Chen, Hengwen; Liu, Yongmei; Wang, Jie

    2016-01-01

    Background: Xuesaitong soft capsule (XST), which consists of Panax notoginseng saponin (PNS), has been used to treat ischemic cerebrovascular diseases in China. The therapeutic mechanism of XST has not yet been elucidated from the perspective of genomics and bioinformatics. Methods: A transcriptome analysis was performed to review series concerning the middle cerebral artery occlusion (MCAO) rat model and XST intervention after MCAO from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were compared between blank group and model group, and between model group and XST group. Functional enrichment and pathway analysis were performed. A protein-protein interaction (PPI) network was constructed. The overlapping genes from the two DEG sets were screened out and analyzed in depth. Results: Two series including 22 samples were obtained. 870 DEGs were identified between blank group and model group, and 1189 DEGs were identified between model group and XST group. GO terms and KEGG pathways of MCAO and XST intervention were significantly enriched. PPI networks were constructed to demonstrate the gene-gene interactions. The overlapping genes from the two DEG sets were highlighted. ANTXR2, FHL3, PRCP, TYROBP, TAF9B, FGFR2, BCL11B, RB1CC1 and MBNL2 were the pivotal genes and possible action sites of XST therapeutic mechanisms. Conclusion: MCAO is a pathological process with multiple. PMID:27347353
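
    The screening of overlapping genes between the two DEG sets amounts to a set intersection. The sketch below assumes each comparison has already been reduced to a list of gene identifiers; the function name and the placeholder identifiers are hypothetical and do not reproduce the study's data or results.

```python
def overlapping_genes(degs_model_vs_blank, degs_xst_vs_model):
    """Return the genes present in both DEG lists, sorted for stable output."""
    return sorted(set(degs_model_vs_blank) & set(degs_xst_vs_model))


# Hypothetical placeholder identifiers; not genes or results from the study.
model_vs_blank = ["geneA", "geneB", "geneC", "geneD"]
xst_vs_model = ["geneC", "geneE", "geneA"]

print(overlapping_genes(model_vs_blank, xst_vs_model))  # ['geneA', 'geneC']
```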

  18. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    PubMed

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  19. Using structural bioinformatics to investigate the impact of non synonymous SNPs and disease mutations: scope and limitations

    PubMed Central

    Reumers, Joke; Schymkowitz, Joost; Rousseau, Fréderic

    2009-01-01

    Background: Linking structural effects of mutations to functional outcomes is a major issue in structural bioinformatics, and many tools and studies have shown that specific structural properties such as stability and residue burial can be used to distinguish neutral variations and disease associated mutations. Results: We have investigated 39 structural properties on a set of SNPs and disease mutations from the Uniprot Knowledge Base that could be mapped on high quality crystal structures and show that none of these properties can be used as a sole classification criterion to separate the two data sets. Furthermore, we have reviewed the annotation process from mutation to result and identified the liabilities in each step. Conclusion: Although excellent annotation results of various research groups underline the great potential of using structural bioinformatics to investigate the mechanisms underlying disease, the interpretation of such annotations cannot always be extrapolated to proteome wide variation studies. Difficulties for large-scale studies can be found both on the technical level, i.e. the scarcity of data and the incompleteness of the structural tool suites, and on the conceptual level, i.e. the correct interpretation of the results in a cellular context. PMID:19758473

  20. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches.

    PubMed

    Ozyigit, Ibrahim I; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y; Koc, Ibrahim; Öztürk, Münir X; Anjum, Naser A

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can damage cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidases (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels and could inform future studies in this direction. PMID:27047498

  1. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches

    PubMed Central

    Ozyigit, Ibrahim I.; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y.; Koc, Ibrahim; Öztürk, Münir X.; Anjum, Naser A.

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can damage cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidases (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels and could inform future studies in this direction. PMID:27047498

  2. Bioinformatics based structural characterization of glucose dehydrogenase (gdh) gene and growth promoting activity of Leclercia sp. QAU-66

    PubMed Central

    Naveed, Muhammad; Ahmed, Iftikhar; Khalid, Nauman; Mumtaz, Abdul Samad

    2014-01-01

    Glucose dehydrogenase (GDH; EC 1.1.5.2) is a member of the quinoprotein group that uses the redox cofactor pyrroloquinoline quinone, calcium ions and glucose as substrate for its activity. In the present study, Leclercia sp. QAU-66, isolated from the rhizosphere of Vigna mungo, was characterized for phosphate solubilization and the role of GDH in plant growth promotion of Phaseolus vulgaris. The strain QAU-66 was able to solubilize phosphorus and significantly (p ≤ 0.05) promoted the shoot and root lengths of Phaseolus vulgaris. The structural determination of GDH protein was carried out using bioinformatics tools like Pfam, InterProScan, I-TASSER and COFACTOR. These tools predicted the structure-based functional homology of pyrroloquinoline quinone domains in GDH. GDH of Leclercia sp. QAU-66 is one of the main factors involved in plant growth promotion and provides a solid background for further research in plant growth promoting activities. PMID:25242947

  3. Bioinformatics based structural characterization of glucose dehydrogenase (gdh) gene and growth promoting activity of Leclercia sp. QAU-66.

    PubMed

    Naveed, Muhammad; Ahmed, Iftikhar; Khalid, Nauman; Mumtaz, Abdul Samad

    2014-01-01

    Glucose dehydrogenase (GDH; EC 1.1.5.2) is a member of the quinoprotein group that uses the redox cofactor pyrroloquinoline quinone, calcium ions and glucose as substrate for its activity. In the present study, Leclercia sp. QAU-66, isolated from the rhizosphere of Vigna mungo, was characterized for phosphate solubilization and the role of GDH in plant growth promotion of Phaseolus vulgaris. The strain QAU-66 was able to solubilize phosphorus and significantly (p ≤ 0.05) promoted the shoot and root lengths of Phaseolus vulgaris. The structural determination of GDH protein was carried out using bioinformatics tools like Pfam, InterProScan, I-TASSER and COFACTOR. These tools predicted the structure-based functional homology of pyrroloquinoline quinone domains in GDH. GDH of Leclercia sp. QAU-66 is one of the main factors involved in plant growth promotion and provides a solid background for further research in plant growth promoting activities. PMID:25242947

  4. Exploring the immunogenome with bioinformatics.

    PubMed

    de Bono, Bernard; Trowsdale, John

    2003-08-01

    A better description of the immune system can be afforded if the latest developments in bioinformatics are applied to integrate sequence with structure and function. Clear guidelines for the upgrade of the bioinformatic capability of the immunogenetics laboratory are discussed in the light of more powerful methods to detect homology, combined approaches to predict the three dimensional properties of a protein and a robust strategy to represent the biological role of a gene. PMID:14690048

  5. Structural Bioinformatics Inspection of neXtProt PE5 Proteins in the Human Proteome.

    PubMed

    Dong, Qiwen; Menon, Rajasree; Omenn, Gilbert S; Zhang, Yang

    2015-09-01

    One goal of the Human Proteome Project is to identify at least one protein product for each of the ∼20,000 human protein-coding genes. As of October 2014, however, there are 3564 genes (18%) that have no or insufficient evidence of protein existence (PE), as curated by neXtProt; these comprise 2647 PE2-4 missing proteins and 616 PE5 dubious protein entries. We conducted a systematic examination of the 616 PE5 protein entries using cutting-edge protein structure and function modeling methods. Compared to a random sample of high-confidence PE1 proteins, the putative PE5 proteins were found to be over-represented in the membrane and cell surface proteins and peptides fold families. Detailed functional analyses show that most PE5 proteins, if expressed, would belong to transporters and receptors localized in the plasma membrane compartment. The results suggest that experimental difficulty in identifying membrane-bound proteins and peptides could have precluded their detection in mass spectrometry and that special enrichment techniques with improved sensitivity for membrane proteins could be important for the characterization of the PE5 "dark matter" of the human proteome. Finally, we identify 66 high scoring PE5 protein entries and find that six of them were reported in recent mass spectrometry databases; an illustrative annotation of these six is provided. This work illustrates a new approach to examine the potential folding and function of the dubious proteins comprising PE5, which we will next apply to the far larger group of missing proteins comprising PE2-4. PMID:26193931

  6. In the Spotlight: Bioinformatics

    PubMed Central

    Wang, May Dongmei

    2016-01-01

    During 2012, next generation sequencing (NGS) has attracted great attention in the biomedical research community, especially for personalized medicine. Also, third generation sequencing has become available. Therefore, state-of-the-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. Next-generation sequencing (NGS) is high-throughput nucleic acid sequencing technology with wide dynamic range and single base resolution. The full promise of NGS depends on the optimization of NGS platforms, sequence alignment and assembly algorithms, data analytics, novel algorithms for integrating NGS data with existing genomic, proteomic, or metabolomic data, and quantitative assessment of NGS technology in comparison to more established technologies such as microarrays. NGS technology has been predicted to become a cornerstone of personalized medicine. It is argued that NGS is a promising field for motivated young researchers who are looking for opportunities in bioinformatics. PMID:23192635

  7. Elongation Factor-Tu (EF-Tu) proteins structural stability and bioinformatics in ancestral gene reconstruction

    NASA Astrophysics Data System (ADS)

    Dehipawala, Sunil; Nguyen, A.; Tremberger, G.; Cheung, E.; Schneider, P.; Lieberman, D.; Holden, T.; Cheung, T.

    2013-09-01

    A paleo-experimental evolution report on elongation factor EF-Tu structural stability results has provided an opportunity to rewind the tape of life using the ancestral protein sequence reconstruction modeling approach; consistent with the book of life dogma in current biology and being an important component in the astrobiology community. Fractal dimension via the Higuchi fractal method and Shannon entropy of the DNA sequence classification could be used in a diagram that serves as a simple summary. Results from biomedical gene research provide examples on the diagram methodology. Comparisons between biomedical genes such as EEF2 (elongation factor 2 human, mouse, etc), WDR85 in epigenetics, HAR1 in human specificity, DLG1 in cognitive skill, and HLA-C in mosquito bite immunology with EF Tu DNA sequences have accounted for the reported circular dichroism thermo-stability data systematically; the results also infer a relatively less volatility geologic time period from 2 to 3 Gyr from adaptation viewpoint. Comparison to Thermotoga maritima MSB8 and Psychrobacter shows that Thermus thermophilus HB8 EF-Tu calibration sequence could be an outlier, consistent with free energy calculation by NUPACK. Diagram methodology allows computer simulation studies and HAR1 shows about 0.5% probability from chimp to human in terms of diagram location, and SNP simulation results such as amoebic meningoencephalitis NAF1 suggest correlation. Extensions to the studies of the translation and transcription elongation factor sequences in Megavirus Chiliensis, Megavirus Lba and Pandoravirus show that the studied Pandoravirus sequence could be an outlier with the highest fractal dimension and lowest entropy, as compared to chicken as a deviant in the DNMT3A DNA methylation gene sequences from zebrafish to human and to the less than one percent probability in computer simulation using the HAR1 0.5% probability as reference. The diagram methodology would be useful in ancestral gene
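
    One of the two axes of the diagram described above is the Shannon entropy of a DNA sequence. The abstract does not state how the entropy was computed, so the following minimal sketch assumes the simplest reading: entropy in bits over single-nucleotide frequencies, ignoring any character other than A, C, G and T. The function name is hypothetical, and the Higuchi fractal dimension used for the other axis is not covered here.

```python
import math
from collections import Counter


def shannon_entropy(dna):
    """Shannon entropy in bits of the single-nucleotide composition.

    Assumption: only A, C, G and T are counted; other characters (N, gaps,
    whitespace) are ignored. An empty sequence is given an entropy of 0.
    """
    bases = [b for b in dna.upper() if b in "ACGT"]
    n = len(bases)
    if n == 0:
        return 0.0
    counts = Counter(bases)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Equal use of all four bases gives the maximum of 2 bits per nucleotide.
print(shannon_entropy("ACGT"))  # 2.0
```

    A sequence made of a single repeated base has an entropy of zero, so values between 0 and 2 bits summarize how evenly the four bases are used.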

  8. Crowdsourcing for bioinformatics

    PubMed Central

    Good, Benjamin M.; Su, Andrew I.

    2013-01-01

    Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614

  9. Minimal Functional Sites in Metalloproteins and Their Usage in Structural Bioinformatics

    PubMed Central

    Rosato, Antonio; Valasatava, Yana; Andreini, Claudia

    2016-01-01

    Metal ions play a functional role in numerous biochemical processes and cellular pathways. Indeed, about 40% of all enzymes of known 3D structure require a metal ion to be able to perform catalysis. The interactions of the metals with the macromolecular framework determine their chemical properties and reactivity. The relevant interactions involve both the coordination sphere of the metal ion and the more distant interactions of the so-called second sphere, i.e., the non-bonded interactions between the macromolecule and the residues coordinating the metal (metal ligands). The metal ligands and the residues in their close spatial proximity define what we call a minimal functional site (MFS). MFSs can be automatically extracted from the 3D structures of metal-binding biological macromolecules deposited in the Protein Data Bank (PDB). They are 3D templates that describe the local environment around a metal ion or metal cofactor and do not depend on the overall macromolecular structure. MFSs provide a different view on metal-binding proteins and nucleic acids, completely focused on the metal. Here we present different protocols and tools based upon the concept of MFS to obtain deeper insight into the structural and functional properties of metal-binding macromolecules. We also show that structure conservation of MFSs in metalloproteins relates to local sequence similarity more strongly than to overall protein similarity. PMID:27153067

  10. Bioinformatic Screening of Autoimmune Disease Genes and Protein Structure Prediction with FAMS for Drug Discovery

    PubMed Central

    Ishida, Shigeharu; Umeyama, Hideaki; Iwadate, Mitsuo; Taguchi, Y-h.

    2014-01-01

    Autoimmune diseases are often intractable because their causes are unknown. Identifying which genes contribute to these diseases may allow us to understand the pathogenesis, but it is difficult to determine which genes contribute to disease. Recently, epigenetic information has been considered to activate/deactivate disease-related genes. Thus, it may also be useful to study epigenetic information that differs between healthy controls and patients with autoimmune disease. Among several types of epigenetic information, promoter methylation is believed to be one of the most important factors. Here, we propose that principal component analysis is useful to identify specific gene promoters that are differently methylated between healthy controls and patients with autoimmune disease. Full Automatic Modeling System (FAMS) was used to predict the three-dimensional structures of selected proteins and successfully inferred relatively confident structures. Several possible applications of the obtained structures to drug discovery are discussed. PMID:23855671

  11. DOE EPSCoR Initiative in Structural and computational Biology/Bioinformatics

    SciTech Connect

    Wallace, Susan S.

    2008-02-21

    The overall goal of the DOE EPSCoR Initiative in Structural and Computational Biology was to enhance the competitiveness of Vermont research in these scientific areas. To develop self-sustaining infrastructure, we increased the critical mass of faculty, developed shared resources that made junior researchers more competitive for federal research grants, implemented programs to train graduate and undergraduate students who participated in these research areas and provided seed money for research projects. During the time period funded by this DOE initiative: (1) four new faculty were recruited to the University of Vermont using DOE resources, three in Computational Biology and one in Structural Biology; (2) technical support was provided for the Computational and Structural Biology facilities; (3) twenty-two graduate students were directly funded by fellowships; (4) fifteen undergraduate students were supported during the summer; and (5) twenty-eight pilot projects were supported. Taken together, these dollars resulted in a plethora of published papers, many in high-profile journals in the fields, and directly impacted competitive extramural funding based on structural or computational biology, resulting in 49 million dollars awarded in grants (Appendix I), a 600% return on investment by DOE, the State and University.

  12. Structural and bioinformatic characterization of an Acinetobacter baumannii type II carrier protein

    SciTech Connect

    Allen, C. Leigh; Gulick, Andrew M.

    2014-06-01

    Microorganisms produce a variety of natural products via secondary metabolic biosynthetic pathways. Two of these types of synthetic systems, the nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), use large modular enzymes containing multiple catalytic domains in a single protein. These multidomain enzymes use an integrated carrier protein domain to transport the growing, covalently bound natural product to the neighboring catalytic domains for each step in the synthesis. Interestingly, some PKS and NRPS clusters contain free-standing domains that interact intermolecularly with other proteins. Because these so-called type II proteins are expressed outside the architecture of a multi-domain protein, it is challenging to understand the precise role they play. Additional structures of individual and multi-domain components of the NRPS enzymes will therefore provide a better understanding of the features that govern the domain interactions in these interesting enzyme systems. The high-resolution crystal structure of a free-standing carrier protein from Acinetobacter baumannii that belongs to a larger NRPS-containing operon, encoded by the ABBFA-003406–ABBFA-003399 genes of A. baumannii strain AB307-0294, that has been implicated in A. baumannii motility, quorum sensing and biofilm formation, is presented here. Comparison with the closest structural homologs of other carrier proteins identifies the requirements for a conserved glycine residue and additional important sequence and structural requirements within the regions that interact with partner proteins.

  13. Classification of lung cancer tumors based on structural and physicochemical properties of proteins by bioinformatics models.

    PubMed

    Hosseinzadeh, Faezeh; Ebrahimi, Mansour; Goliaei, Bahram; Shamabadi, Narges

    2012-01-01

    Rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important in the diagnosis of this disease. Furthermore, sequence-derived structural and physicochemical descriptors are very useful for machine learning prediction of protein structural and functional classes, for classifying proteins and for prediction performance. In this study, the classification of lung tumors based on 1497 attributes derived from structural and physicochemical properties of protein sequences (based on genes defined by microarray analysis) was investigated through a combination of attribute weighting and supervised and unsupervised clustering algorithms. Eighty percent of the weighting methods selected features such as autocorrelation, dipeptide composition and distribution of hydrophobicity as the most important protein attributes in classification of SCLC, NSCLC and COMMON classes of lung tumors. The same results were observed by most tree induction algorithms, while descriptors of hydrophobicity distribution were high in protein sequences COMMON to both groups and distribution of charge in these proteins was very low, showing that COMMON proteins were very hydrophobic. Furthermore, compositions of polar dipeptides in SCLC proteins were higher than in NSCLC proteins. Some clustering models (alone or in combination with attribute weighting algorithms) were nearly able to classify SCLC and NSCLC proteins. The Random Forest tree induction algorithm (evaluated with leave-one-out and 10-fold cross validation) shows more than 86% accuracy in clustering and predicting three different lung cancer tumors. Here, for the first time, the application of data mining tools to effectively classify three classes of lung cancer tumors regarding the importance of dipeptide composition, autocorrelation and distribution descriptors has been reported. PMID:22829872
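
    Dipeptide composition, one of the sequence-derived descriptors highlighted in this abstract, can be sketched as the relative frequency of each of the 400 ordered amino acid pairs. This is a minimal sketch, assuming the 20 standard amino acids and overlapping pairs; the function name `dipeptide_composition` and the handling of non-standard residues are illustrative choices, not the authors' feature extraction pipeline.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def dipeptide_composition(seq: str) -> dict:
    """Return the fraction of each ordered residue pair among overlapping pairs.

    Pairs containing a non-standard symbol (e.g. X) are skipped, which is an
    assumption made for this sketch.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    counts = Counter(valid)
    total = len(valid)
    # Always emit all 400 features so vectors from different proteins align.
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features))  # 400
    print(features["AC"], features["CA"])
```

    Because every protein maps to a vector of the same length, such vectors can be passed directly to attribute weighting or tree induction methods.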

  14. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15

    PubMed Central

    Wang, Jinlan; Chang, Fen

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 completed TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein–protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function but require further experimental validation. PMID:27257554

  15. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15.

    PubMed

    Wang, Jinlan; Zhang, Zheng; Chang, Fen; Yin, Deling

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 completed TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein-protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function but require further experimental validation. PMID:27257554

  16. The CopC Family: Structural and Bioinformatic Insights into a Diverse Group of Periplasmic Copper Binding Proteins.

    PubMed

    Lawton, Thomas J; Kenney, Grace E; Hurley, Joseph D; Rosenzweig, Amy C

    2016-04-19

    The CopC proteins are periplasmic copper binding proteins believed to play a role in bacterial copper homeostasis. Previous studies have focused on CopCs that are part of seven-protein Cop or Pco systems involved in copper resistance. These canonical CopCs contain distinct Cu(I) and Cu(II) binding sites. Mounting evidence suggests that CopCs are more widely distributed, often present only with the CopD inner membrane protein, frequently as a fusion protein, and that the CopC and CopD proteins together function in the uptake of copper to the cytoplasm. In the methanotroph Methylosinus trichosporium OB3b, genes encoding a CopCD pair are located adjacent to the particulate methane monooxygenase (pMMO) operon. The CopC from this organism (Mst-CopC) was expressed, purified, and structurally characterized. The 1.46 Å resolution crystal structure of Mst-CopC reveals a single Cu(II) binding site with coordination somewhat different from that in canonical CopCs, and the absence of a Cu(I) binding site. Extensive bioinformatic analyses indicate that the majority of CopCs in fact contain only a Cu(II) site, with just 10% of sequences corresponding to the canonical two-site CopC. Accordingly, a new classification scheme for CopCs was developed, and detailed analyses of the sequences and their genomic neighborhoods reveal new proteins potentially involved in copper homeostasis, providing a framework for expanded models of CopCD function. PMID:27010565

  17. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  18. Bioinformatics of prokaryotic RNAs.

    PubMed

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  19. Bioinformatics-Aided Venomics

    PubMed Central

    Kaas, Quentin; Craik, David J.

    2015-01-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future. PMID:26110505

  20. A bioinformatics approach for integrated transcriptomic and proteomic comparative analyses of model and non-sequenced anopheline vectors of human malaria parasites.

    PubMed

    Ubaida Mohien, Ceereena; Colquhoun, David R; Mathias, Derrick K; Gibbons, John G; Armistead, Jennifer S; Rodriguez, Maria C; Rodriguez, Mario Henry; Edwards, Nathan J; Hartler, Jürgen; Thallinger, Gerhard G; Graham, David R; Martinez-Barnetche, Jesus; Rokas, Antonis; Dinglasan, Rhoel R

    2013-01-01

    Malaria morbidity and mortality caused by both Plasmodium falciparum and Plasmodium vivax extend well beyond the African continent, and although P. vivax causes between 80 and 300 million severe cases each year, vivax transmission remains poorly understood. Plasmodium parasites are transmitted by Anopheles mosquitoes, and the critical site of interaction between parasite and host is at the mosquito's luminal midgut brush border. Although the genome of the "model" African P. falciparum vector, Anopheles gambiae, has been sequenced, evolutionary divergence limits its utility as a reference across anophelines, especially non-sequenced P. vivax vectors such as Anopheles albimanus. Clearly, technologies and platforms that bridge this substantial scientific gap are required in order to provide public health scientists with key transcriptomic and proteomic information that could spur the development of novel interventions to combat this disease. To our knowledge, no approaches have been published that address this issue. To bolster our understanding of P. vivax-An. albimanus midgut interactions, we developed an integrated bioinformatic-hybrid RNA-Seq-LC-MS/MS approach involving An. albimanus transcriptome (15,764 contigs) and luminal midgut subproteome (9,445 proteins) assembly, which, when used with our custom Diptera protein database (685,078 sequences), facilitated a comparative proteomic analysis of the midgut brush borders of two important malaria vectors, An. gambiae and An. albimanus. PMID:23082028

  1. On comparing two structured RNA multiple alignments.

    PubMed

    Patel, Vandanaben; Wang, Jason T L; Setia, Shefali; Verma, Anurag; Warden, Charles D; Zhang, Kaizhong

    2010-12-01

    We present a method, called BlockMatch, for aligning two blocks, where a block is an RNA multiple sequence alignment with the consensus secondary structure of the alignment in Stockholm format. The method employs a quadratic-time dynamic programming algorithm for aligning columns and column pairs of the multiple alignments in the blocks. Unlike many other tools that can perform pairwise alignment of either single sequences or structures only, BlockMatch takes into account the characteristics of all the sequences in the blocks along with their consensus structures during the alignment process, thus being able to achieve a high-quality alignment result. We apply BlockMatch to phylogeny reconstruction on a set of 5S rRNA sequences taken from fifteen bacteria species. Experimental results showed that the phylogenetic tree generated by our method is more accurate than the tree constructed based on the widely used ClustalW tool. The BlockMatch algorithm is implemented into a web server, accessible at http://bioinformatics.njit.edu/blockmatch. A jar file of the program is also available for download from the web server. PMID:21121021
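
    The quadratic-time dynamic programming over alignment columns described in this abstract can be illustrated with a simplified global alignment of two blocks. This is a sketch under stated assumptions, not BlockMatch itself: the scoring function, the gap penalty, and the names `column_profiles`, `column_score` and `align_blocks` are hypothetical, gap characters are treated as ordinary symbols, and the column pairs of the consensus secondary structure are not modeled.

```python
from collections import Counter


def column_profiles(block):
    """Convert an alignment (list of equal-length rows) to per-column symbol frequencies."""
    n_rows = len(block)
    return [
        {sym: count / n_rows for sym, count in Counter(col).items()}
        for col in zip(*block)
    ]


def column_score(p, q):
    """Score in [-1, 1]: expected +1 (match) or -1 (mismatch) for one symbol drawn from each column."""
    match_prob = sum(freq * q.get(sym, 0.0) for sym, freq in p.items())
    return 2.0 * match_prob - 1.0


def align_blocks(block_a, block_b, gap=-1.0):
    """Needleman-Wunsch-style global alignment score over columns, O(n * m) time."""
    a, b = column_profiles(block_a), column_profiles(block_b)
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + column_score(a[i - 1], b[j - 1]),  # align two columns
                dp[i - 1][j] + gap,  # column of block_a against a gap
                dp[i][j - 1] + gap,  # column of block_b against a gap
            )
    return dp[n][m]


if __name__ == "__main__":
    block = ["ACGU", "ACGU"]
    print(align_blocks(block, block))  # 4.0
```

    Scoring whole columns rather than single residues is what lets every sequence in both blocks contribute to the result.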

  2. Channelrhodopsins: a bioinformatics perspective.

    PubMed

    Del Val, Coral; Royuela-Flor, José; Milenkovic, Stefan; Bondar, Ana-Nicoleta

    2014-05-01

    Channelrhodopsins are microbial-type rhodopsins that function as light-gated cation channels. Understanding how the detailed architecture of the protein governs its dynamics and specificity for ions is important, because it has the potential to assist in designing site-directed channelrhodopsin mutants for specific neurobiology applications. Here we use bioinformatics methods to derive accurate alignments of channelrhodopsin sequences, assess the sequence conservation patterns and find conserved motifs in channelrhodopsins, and use homology modeling to construct three-dimensional structural models of channelrhodopsins. The analyses reveal that helices C and D of channelrhodopsins contain Cys, Ser, and Thr groups that can engage in both intra- and inter-helical hydrogen bonds. We propose that these polar groups participate in inter-helical hydrogen-bonding clusters important for the protein conformational dynamics and for the local water interactions. This article is part of a Special Issue entitled: Retinal Proteins - You can teach an old dog new tricks. PMID:24252597

  3. Bioinformatics and Moonlighting Proteins

    PubMed Central

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein–protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations – it requires the existence of multialigned family protein sequences – but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  4. Bioinformatics and Moonlighting Proteins.

    PubMed

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations - it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  5. The origins of bioinformatics.

    PubMed

    Hagen, J B

    2000-12-01

    Bioinformatics is often described as being in its infancy, but computers emerged as important tools in molecular biology during the early 1960s. A decade before DNA sequencing became feasible, computational biologists focused on the rapidly accumulating data from protein biochemistry. Without the benefits of super computers or computer networks, these scientists laid important conceptual and technical foundations for bioinformatics today. PMID:11252753

  6. Computational intelligence techniques in bioinformatics.

    PubMed

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state-of-the-art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machine, deep belief network, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines, could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function prediction and its structure. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included. PMID:23891719

  7. Bioinformatics in protein analysis.

    PubMed

    Persson, B

    2000-01-01

    The chapter gives an overview of bioinformatic techniques of importance in protein analysis. These include database searches, sequence comparisons and structural predictions. Links to useful World Wide Web (WWW) pages are given in relation to each topic. Databases with biological information are reviewed with emphasis on databases for nucleotide sequences (EMBL, GenBank, DDBJ), genomes, amino acid sequences (Swissprot, PIR, TrEMBL, GenePept), and three-dimensional structures (PDB). Integrated user interfaces for databases (SRS and Entrez) are described. An introduction to databases of sequence patterns and protein families is also given (Prosite, Pfam, Blocks). Furthermore, the chapter describes the widespread methods for sequence comparisons, FASTA and BLAST, and the corresponding WWW services. The techniques involving multiple sequence alignments are also reviewed: alignment creation with the Clustal programs, phylogenetic tree calculation with the Clustal or Phylip packages and tree display using Drawtree, njplot or phylo_win. Finally, the chapter also treats the issue of structural prediction. Different methods for secondary structure predictions are described (Chou-Fasman, Garnier-Osguthorpe-Robson, Predator, PHD). Techniques for predicting membrane proteins, antigenic sites and posttranslational modifications are also reviewed. PMID:10803381

  8. Bioinformatics Visualisation Tools: An Unbalanced Picture.

    PubMed

    Broască, Laura; Ancuşa, Versavia; Ciocârlie, Horia

    2016-01-01

    Visualization tools represent a key element in triggering human creativity while being supported with the analysis power of the machine. This paper analyzes free network visualization tools for bioinformatics, frames them in domain specific requirements and compares them. PMID:27577488

  9. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    PubMed

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-01

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. PMID:27318307

  10. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying biological sequences (DNA, RNA, and proteins) at the level of their linear structure. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or more sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word “data-mining” is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
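
    The two tasks named in this abstract, finding repeated segments within one sequence and finding common segments between two sequences, can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names `repeated_kmers` and `longest_common_substring` are hypothetical, and the quadratic dynamic programme below stands in for the index structures (such as suffix trees or suffix arrays) that scale to real genomes.

```python
from collections import defaultdict


def repeated_kmers(seq, k):
    """Intra-molecular similarity: map each length-k segment that occurs
    more than once in seq to the list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def longest_common_substring(a, b):
    """Inter-molecular similarity: longest contiguous segment shared by
    a and b, found by O(len(a) * len(b)) dynamic programming."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of a
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common segment ending at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Toy sequences, invented for illustration only.
print(repeated_kmers("ACGTACGT", 4))                  # {'ACGT': [0, 4]}
print(longest_common_substring("GATTACA", "TTACGG"))  # TTAC
```

    The sketch favours readability over speed; when several common segments share the maximum length, it returns the first one found in `a`.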

  12. Global computing for bioinformatics.

    PubMed

    Loewe, Laurence

    2002-12-01

    Global computing, the collaboration of idle PCs via the Internet in a SETI@home style, emerges as a new way of massive parallel multiprocessing with potentially enormous CPU power. Its relations to the broader, fast-moving field of Grid computing are discussed without attempting a review of the latter. This review (i) includes a short table of milestones in global computing history, (ii) lists opportunities global computing offers for bioinformatics, (iii) describes the structure of problems well suited for such an approach, (iv) analyses the anatomy of successful projects and (v) points to existing software frameworks. Finally, an evaluation of the various costs shows that global computing indeed has merit, if the problem to be solved is already coded appropriately and a suitable global computing framework can be found. Then, either significant amounts of computing power can be recruited from the general public, or--if employed in an enterprise-wide Intranet for security reasons--idle desktop PCs can substitute for an expensive dedicated cluster. PMID:12511066

  13. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    PubMed Central

    Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  14. A survey of scholarly literature describing the field of bioinformatics education and bioinformatics educational research.

    PubMed

    Magana, Alejandra J; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students' attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  15. Comparative Protein Structure Modeling Using Modeller

    PubMed Central

    Eswar, Narayanan; Marti-Renom, Marc A.; Madhusudhan, M.S.; Eramian, David; Shen, Min-yi; Pieper, Ursula

    2014-01-01

    Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. PMID:18428767

  16. Current trends in antimicrobial agent research: chemo- and bioinformatics approaches.

    PubMed

    Hammami, Riadh; Fliss, Ismail

    2010-07-01

    Databases and chemo- and bioinformatics tools that contain genomic, proteomic and functional information have become indispensable for antimicrobial drug research. The combination of chemoinformatics tools, bioinformatics tools and relational databases provides means of analyzing, linking and comparing online search results. The development of computational tools feeds on a diversity of disciplines, including mathematics, statistics, computer science, information technology and molecular biology. The computational approach to antimicrobial agent discovery and design encompasses genomics, molecular simulation and dynamics, molecular docking, structural and/or functional class prediction, and quantitative structure-activity relationships. This article reviews progress in the development of computational methods, tools and databases used for organizing and extracting biological meaning from antimicrobial research. PMID:20546918

  17. An Online Bioinformatics Curriculum

    PubMed Central

    Searls, David B.

    2012-01-01

    Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field. PMID:23028269

  18. Bioinformatics and School Biology

    ERIC Educational Resources Information Center

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  19. Bioinformatic analysis reveals an evolutional selection for DNA:RNA hybrid G-quadruplex structures as putative transcription regulatory elements in warm-blooded animals.

    PubMed

    Xiao, Shan; Zhang, Jia-Yu; Zheng, Ke-Wei; Hao, Yu-Hua; Tan, Zheng

    2013-12-01

    Recently, we reported the co-transcriptional formation of DNA:RNA hybrid G-quadruplex (HQ) structure by the non-template DNA strand and nascent RNA transcript, which in turn modulates transcription under both in vitro and in vivo conditions. Here we present bioinformatic analysis on putative HQ-forming sequences (PHQS) in the genomes of eukaryotic organisms. Starting from amphibians, PHQS motifs are concentrated in the immediate 1000-nt region downstream of transcription start sites, implying their potential role in transcription regulation. Moreover, their occurrence shows a strong bias toward the non-template versus the template strand. PHQS has become constitutional in genes in warm-blooded animals, and the magnitude of the strand bias correlates with the ability of PHQS to form HQ, suggesting a selection based on HQ formation. This strand bias is reversed in lower species, implying that the selection of PHQS/HQ depended on the living temperature of the organisms. In comparison with the putative intramolecular G-quadruplex-forming sequences (PQS), PHQS motifs are far more prevalent and abundant in the transcribed regions, making them the dominant candidates in the formation of G-quadruplexes in transcription. Collectively, these results suggest that the HQ structures are evolutionally selected to function in transcription and other transcription-mediated processes that involve a guanine-rich non-template strand. PMID:23999096

  20. Towards a career in bioinformatics

    PubMed Central

    2009-01-01

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, dating from 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the life sciences and systems biology, as well as clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in the undergraduate bioinformatics curriculum provides information on how to impart targeted education, to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508

  1. Bioinformatics pipeline for functional identification and characterization of proteins

    NASA Astrophysics Data System (ADS)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Krzywkowski, Tomasz; Świerkula, Katarzyna; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    New sequencing methods, collectively called Next Generation Sequencing, make it possible to obtain a vast amount of data in a short time. These data require structural and functional annotation. Functional identification and characterization of predicted proteins can be done by in silico approaches, thanks to the numerous computational tools available nowadays. However, protein function predictions need to be confirmed, either by running different programs and comparing their results or by experiment. Here we present a bioinformatics pipeline for structural and functional annotation of proteins.

  2. Bioinformatics for Exploration

    NASA Technical Reports Server (NTRS)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  3. BIOINFORMATIC INTEGRATION OF STRUCTURAL AND FUNCTIONAL GENOMICS DATA ACROSS SPECIES TO DEVELOP PORCINE INFLAMMATORY GENE REGULATORY PATHWAY INFORMATION

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Integration of structural and functional genomic data across species holds great promise in finding genes controlling disease resistance. We are investigating the porcine gut immune response to infection through gene expression profiling. We have collected porcine Affymetrix GeneChip data from RNA ...

  4. An Inquiry into Protein Structure and Genetic Disease: Introducing Undergraduates to Bioinformatics in a Large Introductory Course

    ERIC Educational Resources Information Center

    Bednarski, April E.; Elgin, Sarah C. R.; Pakrasi, Himadri B.

    2005-01-01

    This inquiry-based lab is designed around genetic diseases with a focus on protein structure and function. To allow students to work on their own investigatory projects, 10 projects on 10 different proteins were developed. Students are grouped in sections of 20 and work in pairs on each of the projects. To begin their investigation, students are…

  5. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length; and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
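
    The "huge number of possible trees" mentioned in this abstract can be made concrete with the standard counting result for strictly bifurcating trees on n labelled OTUs: (2n-5)!! unrooted and (2n-3)!! rooted topologies. The Python sketch below is an illustration only and is not taken from the paper; the function names are hypothetical.

```python
def double_factorial(m):
    """Product m * (m - 2) * (m - 4) * ... down to 1 or 2 (1 if m <= 1)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def num_unrooted_trees(n):
    """Number of unrooted bifurcating topologies on n >= 3 labelled OTUs."""
    if n < 3:
        raise ValueError("need at least 3 OTUs")
    return double_factorial(2 * n - 5)


def num_rooted_trees(n):
    """Number of rooted bifurcating topologies on n >= 2 labelled OTUs."""
    if n < 2:
        raise ValueError("need at least 2 OTUs")
    return double_factorial(2 * n - 3)


for n in (4, 10, 20):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

    Ten OTUs already give 2,027,025 unrooted topologies, which is why exhaustive search is generally replaced by heuristic tree search for all but very small samples.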

  6. A Guide to Bioinformatics for Immunologists

    PubMed Central

    Whelan, Fiona J.; Yap, Nicholas V. L.; Surette, Michael G.; Golding, G. Brian; Bowdish, Dawn M. E.

    2013-01-01

    Bioinformatics includes a suite of methods, which are cheap, approachable, and many of which are easily accessible without any sort of specialized bioinformatic training. Yet, despite this, bioinformatic tools are under-utilized by immunologists. Herein, we review a representative set of publicly available, easy-to-use bioinformatic tools using our own research on an under-annotated human gene, SCARA3, as an example. SCARA3 shares an evolutionary relationship with the class A scavenger receptors, but preliminary research showed that it was divergent enough that its function remained unclear. In our quest for more information about this gene – did it share gene sequence similarities to other scavenger receptors? Did it contain conserved protein domains? Where was it expressed in the human body? – we discovered the power and informative potential of publicly available bioinformatic tools designed with the novice in mind, which allowed us to hypothesize on the regulation, structure, and function of this protein. We argue that these tools are largely applicable to many facets of immunology research. PMID:24363654

  7. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants.

    PubMed

    Huang, Ming-Der; Huang, Anthony H C

    2015-09-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins. PMID:26232488

  8. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants

    PubMed Central

    Huang, Ming-Der; Huang, Anthony H.C.

    2015-01-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins. PMID:26232488

  9. A Bioinformatics Reference Model: Towards a Framework for Developing and Organising Bioinformatic Resources

    NASA Astrophysics Data System (ADS)

    Hiew, Hong Liang; Bellgard, Matthew

    2007-11-01

    Life Science research faces the constant challenge of how to effectively handle an ever-growing body of bioinformatics software and online resources. The users and developers of bioinformatics resources have a diverse set of competing demands on how these resources need to be developed and organised. Unfortunately, there does not exist an adequate community-wide framework to integrate such competing demands. The problems that arise from this include unstructured standards development, the emergence of tools that do not meet specific needs of researchers, and often times a communications gap between those who use the tools and those who supply them. This paper presents an overview of the different functions and needs of bioinformatics stakeholders to determine what may be required in a community-wide framework. A Bioinformatics Reference Model is proposed as a basis for such a framework. The reference model outlines the functional relationship between research usage and technical aspects of bioinformatics resources. It separates important functions into multiple structured layers, clarifies how they relate to each other, and highlights the gaps that need to be addressed for progress towards a diverse, manageable, and sustainable body of resources. The relevance of this reference model to the bioscience research community, and its implications in progress for organising our bioinformatics resources, are discussed.

  10. Bioinformatics and genomic medicine.

    PubMed

    Kim, Ju Han

    2002-01-01

    Bioinformatics is a rapidly emerging field of biomedical research. A flood of large-scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational science. Clinical informatics has long developed methodologies to improve biomedical research and clinical care by integrating experimental and clinical information systems. The informatics revolution in both bioinformatics and clinical informatics will eventually change the current practice of medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high-throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever, in much the same way that biochemistry did a generation ago. This paper describes how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics and proteomics. Basic data preprocessing with normalization and filtering, primary pattern analysis, and machine-learning algorithms are discussed. Use of integrative biochip informatics technologies, including multivariate data projection, gene-metabolic pathway mapping, automated biomolecular annotation, text mining of factual and literature databases, and the integrated management of biomolecular databases, are also discussed. PMID:12544491

  11. Pattern recognition in bioinformatics.

    PubMed

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained. PMID:23559637

  12. Improvement of Student Understanding of How Kinetic Data Facilitates the Determination of Amino Acid Catalytic Function through an Alkaline Phosphatase Structure/Mechanism Bioinformatics Exercise

    ERIC Educational Resources Information Center

    Grunwald, Sandra K.; Krueger, Katherine J.

    2008-01-01

    Laboratory exercises, which utilize alkaline phosphatase as a model enzyme, have been developed and used extensively in undergraduate biochemistry courses to illustrate enzyme steady-state kinetics. A bioinformatics laboratory exercise for the biochemistry laboratory, which complements the traditional alkaline phosphatase kinetics exercise, was…

  13. LXtoo: an integrated live Linux distribution for the bioinformatics community

    PubMed Central

    2012-01-01

    Background: Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings: Unlike most existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions: LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of effort in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356

  14. A novel method to compare protein structures using local descriptors

    PubMed Central

    2011-01-01

    Background: Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results: We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure - DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions: DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL. PMID:21849047

  15. Fold assessment for comparative protein structure modeling.

    PubMed

    Melo, Francisco; Sali, Andrej

    2007-11-01

    Accurate and automated assessment of both geometrical errors and incompleteness of comparative protein structure models is necessary for an adequate use of the models. Here, we describe a composite score for discriminating between models with the correct and incorrect fold. To find an accurate composite score, we designed and applied a genetic algorithm method that searched for a most informative subset of 21 input model features as well as their optimized nonlinear transformation into the composite score. The 21 input features included various statistical potential scores, stereochemistry quality descriptors, sequence alignment scores, geometrical descriptors, and measures of protein packing. The optimized composite score was found to depend on (1) a statistical potential z-score for residue accessibilities and distances, (2) model compactness, and (3) percentage sequence identity of the alignment used to build the model. The accuracy of the composite score was compared with the accuracy of assessment by single and combined features as well as by other commonly used assessment methods. The testing set was representative of models produced by automated comparative modeling on a genomic scale. The composite score performed better than any other tested score in terms of the maximum correct classification rate (i.e., 3.3% false positives and 2.5% false negatives) as well as the sensitivity and specificity across the whole range of thresholds. The composite score was implemented in our program MODELLER-8 and was used to assess models in the MODBASE database that contains comparative models for domains in approximately 1.3 million protein sequences. PMID:17905832

  16. Virtual Bioinformatics Distance Learning Suite

    ERIC Educational Resources Information Center

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  17. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944

  18. Chapter 16: Text Mining for Translational Bioinformatics

    PubMed Central

    Cohen, K. Bretonnel; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research—translating basic science results into new interventions—and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944

  19. [Comparative hierarchic structure of the genetic language].

    PubMed

    Ratner, V A

    1993-05-01

    Genetic texts and the genetic language are built according to a hierarchic principle and contain no fewer than six levels of coding sequences, separated by marks of punctuation, separation and indication: codons, cistrons, scriptons, replicons, linkage groups and genomes. Each level has all the attributes of a language. This hierarchic system expresses some general properties and regularities. Once the rules of the genetic language are determined, the variability of genetic texts is generated by block-modular combinatorics at each level. Between levels there are intermediate sublevels and module types capable of being combined. The genetic language is compared with two other independent linguistic systems: human natural languages and artificial programming languages. The genetic language is natural in its origin, but in its purpose it is a typical technical language of the functioning genetic regulatory system. All three linguistic systems under comparison show evident similarity in their organizational principles and hierarchical structures. This argues for similarity in the principles of their emergence and evolution. PMID:8335232

  20. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    ERIC Educational Resources Information Center

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  1. Microbial bioinformatics 2020.

    PubMed

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  2. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    PubMed Central

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  3. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    PubMed

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  4. Bioinformatic pipelines in Python with Leaf

    PubMed Central

    2013-01-01

    Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315
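
    The pipeline idea described in this abstract can be illustrated with a minimal sketch. It does not use Leaf or its syntax; the step names, the `STEPS` table and `run_pipeline` are hypothetical, and the example only assumes Python 3.9+ for the standard-library `graphlib` module. It shows steps declared with explicit dependencies and executed once each in dependency order, with results kept by step name.

```python
from graphlib import TopologicalSorter

# Hypothetical analysis steps (NOT Leaf's API): each function receives
# the outputs of the steps it depends on as positional arguments.
def load_reads():
    return ["ACGT", "GGCC", "ATAT"]

def gc_content(reads):
    return [(s.count("G") + s.count("C")) / len(s) for s in reads]

def summarize(fractions):
    return sum(fractions) / len(fractions)

# Step name -> (function, names of the steps it depends on).
STEPS = {
    "load_reads": (load_reads, []),
    "gc_content": (gc_content, ["load_reads"]),
    "summarize": (summarize, ["gc_content"]),
}

def run_pipeline(steps):
    """Run every step once, in dependency order, storing results by name."""
    graph = {name: set(deps) for name, (_, deps) in steps.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = steps[name]
        results[name] = func(*(results[d] for d in deps))
    return results

results = run_pipeline(STEPS)
print(results["summarize"])
```

    Keeping the dependency table separate from the step functions is one possible design choice: the structure of the analysis can then be inspected or documented without running it.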

  5. Comparative Proteomic and Bioinformatic Analysis of the Effects of a High-Grain Diet on the Hepatic Metabolism in Lactating Dairy Goats

    PubMed Central

    Jiang, Xueyuan; Zeng, Tao; Zhang, Shukun; Zhang, Yuanshu

    2013-01-01

    To gain insight into the impact of high-grain diets on liver metabolism in ruminants, we employed a comparative proteomic approach to investigate the proteome-wide effects of diet in lactating dairy goats by conducting a proteomic analysis of the liver extracts of 10 lactating goats fed either a control diet or a high-grain diet. More than 500 protein spots were detected per condition by two-dimensional electrophoresis (2-DE). In total, 52 differentially expressed spots (≥2.0-fold changed) were excised and analyzed using MALDI TOF/TOF. Fifty-one protein spots were successfully identified. Of these, 29 proteins were upregulated, while 22 were downregulated in the high-grain fed vs. control animals. Differential expression of proteins including alpha enolase, elongation factor 2, calreticulin, cytochrome b5, apolipoprotein A-I and catalase was verified by mRNA analysis and/or Western blotting. Database searches combined with Gene Ontology (GO) analysis and KEGG pathway analysis revealed that the high-grain diet resulted in altered expression of proteins related to amino acid metabolism. These results suggest new candidate proteins that may contribute to a better understanding of the signaling pathways and mechanisms that mediate liver adaptation to a high-grain diet. PMID:24260456

  6. Comparing Factor Structures of Adolescent Psychopathology

    ERIC Educational Resources Information Center

    Verona, Edelyn; Javdani, Shabnam; Sprague, Jenessa

    2011-01-01

    Research on the structure of adolescent psychopathology can provide information on broad factors that underlie different forms of maladjustment in youths. Multiple studies from the literature on adult populations suggest that 2 factors, Internalizing and Externalizing, meaningfully comprise the factor structure of adult psychopathology (e.g.,…

  7. Comparative BioInformatics and Computational Toxicology

    EPA Science Inventory

    Reflecting the numerous changes in the field since the publication of the previous edition, this third edition of Developmental Toxicology focuses on the mechanisms of developmental toxicity and incorporates current technologies for testing in the risk assessment process.

  8. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  9. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  10. Reactance, Restoration, and Cognitive Structure: Comparative Statics

    ERIC Educational Resources Information Center

    Bessarabova, Elena; Fink, Edward L.; Turner, Monique

    2013-01-01

    This study (N = 143) examined the effects of freedom threat on cognitive structures, using recycling as its topic. The results of a 2(Freedom Threat: low vs. high) x 2(Postscript: restoration vs. filler) plus 1(Control) experiment indicated that, relative to the control condition, high freedom threat created a boomerang effect for the targeted…

  11. Uncertainty of Comparative Judgments and Multidimensional Structure

    ERIC Educational Resources Information Center

    Sjoberg, Lennart

    1975-01-01

    An analysis of preferences with respect to silhouette drawings of nude females is presented. Systematic intransitivities were discovered. The dispersions of differences (comparatal dispersions) were shown to reflect the multidimensional structure of the stimuli, a finding expected on the basis of prior work. (Author)

  12. Compare, Contrast, Comprehend: Using Compare-Contrast Text Structures with ELLs in K-3 Classrooms

    ERIC Educational Resources Information Center

    Dreher, Mariam Jean; Gray, Jennifer Letcher

    2009-01-01

    In this article, we describe how to help primary-grade English language learners use compare-contrast text structures. Specifically, we explain (a) how to teach students to identify the compare-contrast text structure, and to use this structure to support their comprehension, (b) how to use compare-contrast texts to activate and extend students'…

  13. Intrageneric primer design: Bringing bioinformatics tools to the class.

    PubMed

    Lima, André O S; Garcês, Sérgio P S

    2006-09-01

    Bioinformatics has been one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of the importance of these tools is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private and academic) with a need for bachelor of science students with bioinformatics skills. In consideration of this need, described here is a problem-based class in which students are asked to design a set of intrageneric primers for PCR. The exercise is divided into five classes of 1 h each, in which students use freeware bioinformatics tools and databases available through the Internet. Besides designing the set of primers, the students will consequently learn the significance and use of the major bioinformatics procedures, such as searching a database, conducting and analyzing sequence multialignment, comparing sequences with a database, and selecting primers. PMID:21638710

  14. Adapting bioinformatics curricula for big data.

    PubMed

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  15. Adapting bioinformatics curricula for big data

    PubMed Central

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  16. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides.

    PubMed

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe; Pupin, Maude

    2016-01-01

    This chapter describes the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomerism. To assess their novelty, the two-dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (e.g., antibiotics against multi-resistant pathogens or antitumor agents). PMID:26831711

  17. Bioinformatics Approach in Plant Genomic Research.

    PubMed

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    Advances in genomics technology have led to dramatic changes in plant biology research. Plant biologists can now easily access enormous amounts of genomic data to study high-density genetic variation in plants at the molecular level. Therefore, a thorough understanding and skilled use of bioinformatics tools to manage and analyze these data are essential in current plant genome research. Many plant genome databases have been established and continue to expand. Meanwhile, bioinformatics-based analytical methods are well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant upgrading of computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  18. Which craft is best in bioinformatics?

    PubMed

    Attwood, T K; Miller, C J

    2001-07-01

    'Silicon-based' biology has gathered momentum as the world-wide sequencing projects have made possible the investigation and comparative analysis of complete genomes. Central to the quest to elucidate and characterise the genes and gene products encoded within genomes are pivotal concepts concerning the processes of evolution, the mechanisms of protein folding, and, crucially, the manifestation of protein function. Our use of computers to model such concepts is limited by, and must be placed in the context of, the current limits of our understanding of these biological processes. It is important to recognise that we do not have a common understanding of what constitutes a gene; we cannot invariably say that a particular sequence or fold has arisen via divergence or convergence; we do not fully understand the rules of protein folding, so we cannot predict protein structure; and we cannot invariably diagnose protein function, given knowledge only of its sequence or structure in isolation. Accepting what we cannot do with computers plays an essential role in forming an appreciation of what we can do. Without this understanding, it is easy to be misled, as spurious arguments are often used to promote over-enthusiastic notions of what particular programs can achieve. There are valuable lessons to be learned here from the field of artificial intelligence, principal among which is the realisation that capturing and representing complex knowledge is time consuming, expensive and hard. If bioinformatics is to tackle biological complexity meaningfully, the road ahead must therefore be paved with caution, rigour and pragmatism. PMID:11459349

  19. Generations of interdisciplinarity in bioinformatics

    PubMed Central

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life sciences, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, the founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  20. Clinical Bioinformatics: challenges and opportunities

    PubMed Central

    2012-01-01

    Background Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and on their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011, was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods In this editorial, the viewpoints and opinions of three world CBI leaders, who were invited to participate in a panel discussion at the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools for efficient search computing solutions. Results Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, and integrated search and retrieval of -omics information. Conclusions Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput "-omics" technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research. PMID:23095472

  1. Bioinformatics and the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  2. Reproducible Bioinformatics Research for Biologists

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  3. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  4. Teaching bioinformatics in concert.

    PubMed

    Goodman, Anya L; Dekhtyar, Alex

    2014-11-01

    Can biology students without programming skills solve problems that require computational solutions? They can if they learn to cooperate effectively with computer science students. The goal of the in-concert teaching approach is to introduce biology students to computational thinking by engaging them in collaborative projects structured around the software development process. Our approach emphasizes development of interdisciplinary communication and collaboration skills for both life science and computer science students. PMID:25411792

  5. Bioinformatics and molecular modeling in glycobiology

    PubMed Central

    Schloissnig, Siegfried

    2010-01-01

    The field of glycobiology is concerned with the study of the structure, properties, and biological functions of the family of biomolecules called carbohydrates. Bioinformatics for glycobiology is a particularly challenging field, because carbohydrates exhibit a high structural diversity and their chains are often branched. Significant improvements in experimental analytical methods over recent years have led to a tremendous increase in the amount of carbohydrate structure data generated. Consequently, the availability of databases and tools to store, retrieve and analyze these data in an efficient way is of fundamental importance to progress in glycobiology. In this review, the various graphical representations and sequence formats of carbohydrates are introduced, and an overview of newly developed databases, the latest developments in sequence alignment and data mining, and tools to support experimental glycan analysis are presented. Finally, the field of structural glycoinformatics and molecular modeling of carbohydrates, glycoproteins, and protein–carbohydrate interaction are reviewed. PMID:20364395

  6. Provenance in bioinformatics workflows

    PubMed Central

    2013-01-01

    In this work, we used the PROV-DM model to manage data provenance in workflows of genome projects. This provenance model allows the storage of details of one workflow execution, e.g., raw and produced data and computational tools, their versions and parameters. Using this model, biologists can access details of one particular execution of a workflow, compare results produced by different executions, and plan new experiments more efficiently. In addition to this, a provenance simulator was created, which facilitates the inclusion of provenance data of one genome project workflow execution. Finally, we discuss one case study, which aims to identify genes involved in specific metabolic pathways of Bacillus cereus, as well as to compare this isolate with other phylogenetically related bacteria from the Bacillus group. B. cereus is an extremophilic bacterium, collected in warm water in the Midwestern Region of Brazil, its DNA samples having been sequenced with an NGS machine. PMID:24564294
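
    The kind of record described above (data, tools, versions and parameters for one workflow execution, plus comparison between executions) can be sketched as follows. This is a minimal illustration only, not the authors' implementation and not a full PROV-DM model: the class names loosely follow the PROV-DM notions of entity and activity, and the tool name, versions, file names and parameters are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A piece of data, e.g. raw reads or an assembled genome."""
    name: str


@dataclass
class Activity:
    """One execution of a tool, with its version and parameters."""
    tool: str
    version: str
    parameters: dict
    used: list = field(default_factory=list)       # input entities
    generated: list = field(default_factory=list)  # output entities


def diff_parameters(a: Activity, b: Activity) -> dict:
    """Return the parameters whose values differ between two executions."""
    keys = set(a.parameters) | set(b.parameters)
    return {
        k: (a.parameters.get(k), b.parameters.get(k))
        for k in sorted(keys)
        if a.parameters.get(k) != b.parameters.get(k)
    }


# Hypothetical tool name, versions and parameters, for illustration only.
reads = Entity("raw_reads.fastq")
run1 = Activity("assembler", "1.0", {"kmer": 31, "min_cov": 5},
                used=[reads], generated=[Entity("contigs_run1.fasta")])
run2 = Activity("assembler", "1.0", {"kmer": 51, "min_cov": 5},
                used=[reads], generated=[Entity("contigs_run2.fasta")])

changed = diff_parameters(run1, run2)
print(changed)
```

    Storing parameters per execution is what makes the comparison a simple dictionary diff; a real provenance store would also record agents and timestamps.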

  7. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    PubMed Central

    Obom, Kristina M.; Cummings, Patrick J.

    2007-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few course evaluation questions. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation. PMID:23653816

  8. Bioinformatics in the information age

    SciTech Connect

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  9. Detecting evolution of bioinformatics with a content and co-authorship analysis.

    PubMed

    Song, Min; Yang, Christopher C; Tang, Xuning

    2013-12-01

    Bioinformatics is an interdisciplinary research field that applies advanced computational techniques to biological data. Bibliometric analysis has recently been adopted to understand the knowledge structure of a research field by citation pattern. In this paper, we explore the knowledge structure of Bioinformatics from the perspective of a core open access Bioinformatics journal, BMC Bioinformatics, with trend analysis, the content and co-authorship network similarity, and principal component analysis. Publications in four core journals including Bioinformatics - Oxford Journal and four conferences in Bioinformatics were harvested from DBLP. After converting publications into TF-IDF term vectors, we calculate the content similarity, and we also calculate the social network similarity based on the co-authorship network by utilizing the overlap measure between two co-authorship networks. Key terms are extracted and analyzed with PCA, and visualization of the co-authorship network is conducted. The experimental results show that Bioinformatics is fast-growing, dynamic and diversified. The content analysis shows that there is an increasing overlap among Bioinformatics journals in terms of topics and more research groups participate in researching Bioinformatics according to the co-authorship network similarity. PMID:23710427
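
    The two similarity measures named in the abstract can be illustrated with a small sketch. It rests on assumptions the abstract does not state: content similarity is taken to be cosine similarity between TF-IDF vectors, and the "overlap measure" is taken to be the overlap coefficient |A ∩ B| / min(|A|, |B|) over co-authorship edges. The documents and author names are invented toy data.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a {term: tf-idf weight} dict."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def overlap(a, b):
    """Overlap coefficient of two edge sets (assumed form of the measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Toy data, for illustration only.
docs = [
    ["protein", "structure", "alignment"],
    ["protein", "structure", "network"],
    ["gene", "expression", "network"],
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))

# Co-authorship networks as sets of undirected author pairs.
team_a = {frozenset(("Ann", "Bo")), frozenset(("Bo", "Cy"))}
team_b = {frozenset(("Ann", "Bo")), frozenset(("Cy", "Di")), frozenset(("Di", "Ed"))}
print(overlap(team_a, team_b))
```

    Representing each co-authorship edge as a frozenset makes the pair unordered, so (Ann, Bo) and (Bo, Ann) count as the same edge.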

  10. Receptor-binding sites: bioinformatic approaches.

    PubMed

    Flower, Darren R

    2006-01-01

    It is increasingly clear that both transient and long-lasting interactions between biomacromolecules and their molecular partners are the most fundamental of all biological mechanisms and lie at the conceptual heart of protein function. In particular, the protein-binding site is the most fascinating and important mechanistic arbiter of protein function. In this review, I examine the nature of protein-binding sites found in both ligand-binding receptors and substrate-binding enzymes. I highlight two important concepts underlying the identification and analysis of binding sites. The first is based on knowledge: when one knows the location of a binding site in one protein, one can "inherit" the site from one protein to another. The second approach involves the a priori prediction of a binding site from a sequence or a structure. The full and complete analysis of binding sites will necessarily involve the full range of informatic techniques ranging from sequence-based bioinformatic analysis through structural bioinformatics to computational chemistry and molecular physics. Integration of both diverse experimental and diverse theoretical approaches is thus a mandatory requirement in the evaluation of binding sites and the binding events that occur within them. PMID:16671408
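
    The knowledge-based approach mentioned above, "inheriting" a binding site from one protein to another, is commonly done through a sequence alignment. The following is a minimal sketch under that assumption, not a method taken from the review: the function name, the toy aligned sequences and the site positions are all hypothetical.

```python
def inherit_site(aligned_a, aligned_b, site_a):
    """Map 1-based residue positions in A to B through a gapped alignment.

    Positions of A that align to a gap in B cannot be inherited and are
    left out of the returned mapping.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(site_a)
    pos_a = pos_b = 0
    mapping = {}
    for ca, cb in zip(aligned_a, aligned_b):
        if ca != "-":
            pos_a += 1  # advance the ungapped coordinate in A
        if cb != "-":
            pos_b += 1  # advance the ungapped coordinate in B
        if ca != "-" and cb != "-" and pos_a in wanted:
            mapping[pos_a] = pos_b
    return mapping


# Toy alignment, for illustration only.
aligned_a = "MKT-LLDE"
aligned_b = "M-TALIDE"
mapping = inherit_site(aligned_a, aligned_b, [2, 3, 6])
print(mapping)
```

    Residue 2 of the first sequence faces a gap in the second, so only residues 3 and 6 are carried over.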

  11. A Bioinformatics Facility for NASA

    NASA Technical Reports Server (NTRS)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  12. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  13. Bioinformatic Insights from Metagenomics through Visualization

    SciTech Connect

    Havre, Susan L.; Webb-Robertson, Bobbie-Jo M.; Shah, Anuj; Posse, Christian; Gopalan, Banu; Brockman, Fred J.

    2005-08-10

    Cutting-edge biological and bioinformatics research seeks a systems perspective through the analysis of multiple types of high-throughput and other experimental data for the same sample. Systems-level analysis requires the integration and fusion of such data, typically through advanced statistics and mathematics. Visualization is a complementary computational approach that supports integration and analysis of complex data or its derivatives. We present a bioinformatics visualization prototype, Juxter, which depicts categorical information derived from or assigned to these diverse data for the purpose of comparing patterns across categorizations. The visualization allows users to easily discern correlated and anomalous patterns in the data. These patterns, which might not be detected automatically by algorithms, may reveal valuable information leading to insight and discovery. We describe the visualization and interaction capabilities and demonstrate its utility in a new field, metagenomics, which combines molecular biology and genetics to identify and characterize genetic material from multi-species microbial samples.

  14. Bioinformatics analysis of the epitope regions for norovirus capsid protein

    PubMed Central

    2013-01-01

    Background Norovirus is the major cause of nonbacterial epidemic gastroenteritis, being highly prevalent in both developing and developed countries. Despite the available monoclonal antibodies (MAbs) for different sub-genogroups, a comprehensive epitope analysis based on various bioinformatics technologies is highly desired for future potential antibody development in clinical diagnosis and treatment. Methods A total of 18 full-length human norovirus capsid protein sequences were downloaded from GenBank. Protein modeling was performed with the program Modeller 9.9. The modeled 3D structures of the capsid protein of norovirus were submitted to the protein antigen spatial epitope prediction webserver (SEPPA) for predicting the possible spatial epitopes with the default threshold. The results were processed using the Biosoftware. Results Compared with GI, we found that the GII genogroup had four deletions and two special insertions in the VP1 region. The predicted conformational epitope regions mainly concentrated on N-terminal (1~96), Middle Part (298~305, 355~375) and C-terminal (560~570). We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. Conclusions The predicted conformational epitope regions of norovirus VP1 mainly concentrated on N-terminal, Middle Part and C-terminal. We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. The overlap with experimental epitopes indicates the important role of the latest computational technologies. With the fast development of computational immunology tools, the bioinformatics pipeline will be more and more critical to vaccine design. PMID:23514273
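
    As a small illustration of how the residue ranges reported above could be used, the sketch below stores them as inclusive intervals and counts how many residues of a candidate epitope fall inside a predicted region. The ranges are those quoted in the abstract; the function names and the queried residues are hypothetical, and this is not the authors' pipeline.

```python
# Predicted conformational epitope regions of VP1, as residue ranges
# quoted in the abstract (inclusive, 1-based).
PREDICTED = {
    "N-terminal": [(1, 96)],
    "Middle Part": [(298, 305), (355, 375)],
    "C-terminal": [(560, 570)],
}


def region_of(residue):
    """Return the name of the predicted region containing residue, or None."""
    for name, ranges in PREDICTED.items():
        if any(lo <= residue <= hi for lo, hi in ranges):
            return name
    return None


def covered(start, end):
    """Count residues in [start, end] that fall in any predicted region."""
    return sum(region_of(r) is not None for r in range(start, end + 1))


# Hypothetical query positions, for illustration only.
print(region_of(360))
print(region_of(200))
print(covered(300, 310))
```

    A check of this kind is one simple way to quantify overlap between predicted regions and an experimentally mapped epitope.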

  15. Bioinformatics in Africa: The Rise of Ghana?

    PubMed Central

    Karikari, Thomas K.

    2015-01-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  16. Bioinformatics in Africa: The Rise of Ghana?

    PubMed

    Karikari, Thomas K

    2015-09-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  17. Comparative modeling of InP solar cell structures

    NASA Technical Reports Server (NTRS)

    Jain, R. K.; Weinberg, I.; Flood, D. J.

    1991-01-01

    The comparative modeling of p(+)n and n(+)p indium phosphide solar cell structures is studied using a numerical program PC-1D. The optimal design study has predicted that the p(+)n structure offers improved cell efficiencies as compared to n(+)p structure, due to higher open-circuit voltage. The various cell material and process parameters to achieve the maximum cell efficiencies are reported. The effect of some of the cell parameters on InP cell I-V characteristics was studied. The available radiation resistance data on n(+)p and p(+)p InP solar cells are also critically discussed.

  18. Structural analysis of the PsbQ protein of photosystem II by Fourier transform infrared and circular dichroic spectroscopy and by bioinformatic methods.

    PubMed

    Balsera, Mónica; Arellano, Juan B; Gutiérrez, José R; Heredia, Pedro; Revuelta, José L; De Las Rivas, Javier

    2003-02-01

    The structure of PsbQ, one of the three main extrinsic proteins associated with the oxygen-evolving complex (OEC) of higher plants and green algae, is examined by Fourier transform infrared (FTIR) and circular dichroic (CD) spectroscopy and by computational structural prediction methods. This protein, together with two other lumenally bound extrinsic proteins, PsbO and PsbP, is essential for the stability and full activity of the OEC in plants. The FTIR spectra obtained in both H(2)O and D(2)O suggest a mainly alpha-helix structure on the basis of the relative areas of the constituents of the amide I and I' bands. The FTIR quantitative analyses indicate that PsbQ contains about 53% alpha-helix, 7% turns, 14% nonordered structure, and 24% beta-strand plus other beta-type extended structures. CD analyses indicate that PsbQ is a mainly alpha-helix protein (about 64%), presenting a small percentage assigned to beta-strand (approximately 7%) and a larger amount assigned to turns and nonregular structures (approximately 29%). Independent of the spectroscopic analyses, computational methods for protein structure prediction of PsbQ were utilized. First, a multiple alignment of 12 sequences of PsbQ was obtained after an extensive search in the public databases for protein and EST sequences. Based on this alignment, computational prediction of the secondary structure and the solvent accessibility suggest the presence of two different structural domains in PsbQ: a major C-terminal domain containing four alpha-helices and a minor N-terminal domain with a poorly defined secondary structure enriched in proline and glycine residues. The search for PsbQ analogues by fold recognition methods, not based on the secondary structure, also indicates that PsbQ is a four alpha-helix protein, most probably folding as an up-down bundle. The results obtained by both the spectroscopic and computational methods are in agreement, all indicating that PsbQ is mainly an alpha protein, and show

  19. Transregional zones of concentrated deformation: Structure, evolution, and comparative geodynamics

    NASA Astrophysics Data System (ADS)

    Leonov, M. G.

    2016-03-01

    The comparative tectonic characterization of transregional linear structures (zones of concentrated deformations) is given for the Pieniny Klippen Belt, the Main Mongolian Lineament, and the transregional Alpine Fault Zone. They represent significant geodynamic elements of the Earth's crust, which separate large crustal segments and reflect their interaction in time and space. The main features of the structure, evolution, and geodynamics inherent to zones of concentrated deformations are described. It is shown that the similarity of their outlines, morphology, internal structure, and kinematic features is combined with a clearly distinct structural position, set of rock associations, formation mechanism, and their role in the origin of mobile belts.

  20. DSSTOX STRUCTURE-SEARCHABLE PUBLIC TOXICITY DATABASE NETWORK: CURRENT PROGRESS AND NEW INITIATIVES TO IMPROVE CHEMO-BIOINFORMATICS CAPABILITIES

    EPA Science Inventory

    The EPA DSSTox website (http://www.epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...

  1. Comparative structural studies on Lys49-phospholipases A(2) from Bothrops genus reveal their myotoxic site.

    PubMed

    dos Santos, Juliana I; Soares, Andreimar Martins; Fontes, Marcos R M

    2009-08-01

    Phospholipases A(2) (PLA(2)s) are membrane-associated enzymes that hydrolyze phospholipids at the sn-2 position, releasing lysophospholipids and free fatty acids. Phospholipase A(2) homologues (Lys49-PLA(2)s) are highly myotoxic and cause extensive tissue damage despite not showing measurable catalytic activity. They are found in different snake venoms and represent one third of bothropic venom composition. The importance of these toxins during envenomation is related to the pronounced local myotoxic effect they induce since this effect is not neutralized by serum therapy. We present herein three structures of Lys49-PLA(2)s from Bothrops genus snake venom crystallized under the same conditions, two of which were grown in the presence of alpha-tocopherol (vitamin E). Comparative structural analysis of these and other Lys49-PLA(2)s showed two different patterns of oligomeric conformation that are related to the presence or absence of ligands in the hydrophobic channel. This work also confirms the biological dimer indicated by recent studies in which both C-termini are in the dimeric interface. In this configuration, we propose that the myotoxic site of these toxins is composed of the Lys20, Lys115 and Arg118 residues. For the first time, a residue from the short-helix (Lys20) is suggested as a member of this site and the importance of the Tyr119 residue to the myotoxicity of bothropic Lys49-PLA(2)s is also discussed. These results support a complete hypothesis for the myotoxic activity of these PLA(2)s consistent with all findings on bothropic Lys49-PLA(2)s studied up to this moment, including crystallographic, bioinformatics, biochemical and biophysical data. PMID:19401234

  2. The European Bioinformatics Institute's data resources 2014.

    PubMed

    Brooksbank, Catherine; Bergman, Mary Todd; Apweiler, Rolf; Birney, Ewan; Thornton, Janet

    2014-01-01

    Molecular Biology has been at the heart of the 'big data' revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff's 'Atlas of Protein Sequence and Structure' through the Human Genome Project in the late 1990s and early 2000s to today's population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provides free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI's database collection to complement the reviews of individual databases provided elsewhere in this issue. PMID:24271396

  3. Bioinformatics by Example: From Sequence to Target

    NASA Astrophysics Data System (ADS)

    Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

    2002-12-01

    With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience is aimed to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example the reader will experience this fascinating new discipline.

  4. Bioinformatics for Next Generation Sequencing Data

    PubMed Central

    Magi, Alberto; Benelli, Matteo; Gozzini, Alessia; Girolami, Francesca; Torricelli, Francesca; Brandi, Maria Luisa

    2010-01-01

    The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools fit into several general categories, including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used at each step of the data analysis workflow. PMID:24710047

  5. Comparative genomics for understanding the structure, function and sub-cellular localization of hypothetical proteins in Thermanerovibrio acidaminovorans DSM 6589 (tai).

    PubMed

    Thakare, Hitesh S; Meshram, Dilip B; Jangam, Chandrakant M; Labhasetwar, Pawan; Roychoudhary, Kunal; Ingle, Arun B

    2016-04-01

    Thermanerovibrio acidaminovorans DSM 6589 (tai) is a unique bacterium isolated from an anaerobic sludge bed reactor at a sugar refinery in the Netherlands. The comparative genomic studies for understanding the hypothetical proteins in T. acidaminovorans DSM 6589 (tai) were carried out using different bioinformatic tools and web servers. In all, 320 hypothetical proteins were screened from the total available genome. The in silico function prediction for the 320 hypothetical proteins was achieved by using different online servers such as CDD-Blast, Interproscan and pfam, whereas the structures of 202 hypothetical proteins were predicted using a protein structure prediction server (PS2 server). The sub-cellular localization of the 320 identified proteins was predicted using cello v2.5. The study has helped us to understand the structures and functions of unknown proteins in T. acidaminovorans DSM 6589 (tai) through a comparative genomic approach. PMID:26930563

  6. Evolving Strategies for the Incorporation of Bioinformatics Within the Undergraduate Cell Biology Curriculum

    PubMed Central

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489

  7. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  8. Postgenomics: Proteomics and Bioinformatics in Cancer Research

    PubMed Central

    2003-01-01

    Now that the human genome is completed, the characterization of the proteins encoded by the sequence remains a challenging task. The study of the complete protein complement of the genome, the “proteome,” referred to as proteomics, will be essential if new therapeutic drugs and new disease biomarkers for early diagnosis are to be developed. Research efforts are already underway to develop the technology necessary to compare the specific protein profiles of diseased versus nondiseased states. These technologies provide a wealth of information and rapidly generate large quantities of data. Processing the large amounts of data will lead to useful predictive mathematical descriptions of biological systems which will permit rapid identification of novel therapeutic targets and identification of metabolic disorders. Here, we present an overview of the current status and future research approaches in defining the cancer cell's proteome in combination with different bioinformatics and computational biology tools toward a better understanding of health and disease. PMID:14615629

  9. Comparative testing of nondestructive examination techniques for concrete structures

    NASA Astrophysics Data System (ADS)

    Clayton, Dwight A.; Smith, Cyrus M.

    2014-03-01

    A multitude of concrete-based structures are typically part of a light water reactor (LWR) plant to provide foundation, support, shielding, and containment functions. Concrete has been used in the construction of nuclear power plants (NPPs) because of three primary properties: its inexpensiveness, its structural strength, and its ability to shield radiation. Examples of concrete structures important to the safety of LWR plants include the containment building, spent fuel pool, and cooling towers. Comparative testing of the various NDE concrete measurement techniques requires concrete samples with known material properties, voids, internal microstructure flaws, and reinforcement locations. These samples can be artificially created under laboratory conditions where the various properties can be controlled. Other than NPPs, there are not many applications where critical concrete structures are as thick and reinforced. Therefore, there are not many industries other than the nuclear power plant or power plant industry that are interested in performing NDE on thick and reinforced concrete structures. This leads to the lack of readily available samples of thick and heavily reinforced concrete for performing NDE evaluations, research, and training. The industry that typically performs the most NDE on concrete structures is the bridge and roadway industry. While bridge and roadway structures are thinner and less reinforced, they have a good base of NDE research to support their field NDE programs to detect, identify, and repair concrete failures. This paper will summarize the initial comparative testing of two concrete samples with an emphasis on how these techniques could perform on NPP concrete structures.

  10. Structure and comparative morphology of camptotrichia of lungfish fins.

    PubMed

    Geraudie, J; Meunier, F J

    1984-01-01

    The present work is devoted to the organization and ultrastructure of the fin rays or camptotrichia of two living Dipnoi (lungfishes) Protopterus and Neoceratodus. In both species, these rods have a dual structure: only the superficial region facing the stratified epidermis is mineralized while the deep one is made of a dense unmineralized network of collagen fibrils forming a permanent pre-osseous tissue. Only the camptotrichia of Neoceratodus is made of cellular bone. This study confirms the structural peculiarities of these camptotrichia when compared to the dermal skeleton of the Actinopterygii constituted by the bony lepidotrichia and the actinotrichia. These results are discussed and compared to fossil dipnoan fin rays. PMID:6740649

  11. Structural Bioinformatics and Protein Docking Analysis of the Molecular Chaperone-Kinase Interactions: Towards Allosteric Inhibition of Protein Kinases by Targeting the Hsp90-Cdc37 Chaperone Machinery

    PubMed Central

    Lawless, Nathan; Blacklock, Kristin; Berrigan, Elizabeth; Verkhivker, Gennady

    2013-01-01

    A fundamental role of the Hsp90-Cdc37 chaperone system in mediating maturation of protein kinase clients and supporting kinase functional activity is essential for the integrity and viability of signaling pathways involved in cell cycle control and organism development. Despite significant advances in understanding structure and function of molecular chaperones, the molecular mechanisms and guiding principles of kinase recruitment to the chaperone system are lacking quantitative characterization. Structural and thermodynamic characterization of Hsp90-Cdc37 binding with protein kinase clients by modern experimental techniques is highly challenging, owing to a transient nature of chaperone-mediated interactions. In this work, we used experimentally-guided protein docking to probe the allosteric nature of the Hsp90-Cdc37 binding with the cyclin-dependent kinase 4 (Cdk4) kinase clients. The results of docking simulations suggest that the kinase recognition and recruitment to the chaperone system may be primarily determined by Cdc37 targeting of the N-terminal kinase lobe. The interactions of Hsp90 with the C-terminal kinase lobe may provide additional “molecular brakes” that can lock (or unlock) kinase from the system during client loading (release) stages. The results of this study support a central role of the Cdc37 chaperone in recognition and recruitment of the kinase clients. Structural analysis may have useful implications in developing strategies for allosteric inhibition of protein kinases by targeting the Hsp90-Cdc37 chaperone machinery. PMID:24287464

  12. Modelling and bioinformatics analysis of the dimeric structure of house dust mite allergens from families 5 and 21: Der f 5 could dimerize as Der p 5.

    PubMed

    Khemili, Souad; Kwasigroch, Jean Marc; Hamadouche, Tarik; Gilis, Dimitri

    2012-01-01

    Allergy represents an increasing threat to public health in both developed and emerging countries, and the dust mites Dermatophagoides pteronyssinus (Der p), Blomia tropicalis (Blo t), Dermatophagoides farinae (Der f), Lepidoglyphus destructor (Lep d) and Suidasia medanensis (Sui m) strongly contribute to this problem. Their allergens are classified in several families, among which families 5 and 21 are the subject of this work. Indeed, their biological function as well as the mechanism or epitopes by which they are contributing to the allergic response remain unknown and their tridimensional structures have not been resolved experimentally except for Blo t 5 and Der p 5. Blo t 5 is a monomeric three helical bundle, whereas Der p 5 shows a three helical bundle with a kinked N-terminal helix that assembles in an entangled dimeric structure with a large hydrophobic cavity. This cavity could be involved in the binding of hydrophobic ligands, which in turn could be responsible for the shift of the immune response from tolerance to allergic inflammation. We used molecular modelling approaches to determine whether other house dust mite allergens of families 5 and 21 (Der f 5, Sui m 5, Lep d 5, Der p 21 and Der f 21) could dimerize and form a large cavity in the same way as Der p 5. Monomeric models were first built with MODELLER using the experimental structures of Der p 5 and Blo t 5 as templates. The ClusPro server processed the selected monomers in order to assess their capacity to form dimeric structures with a positive result for Der p 5 and Der f 5 only. The other allergens (Blo t 5, Sui m 5, Lep d 5, Der p 21 and Der f 21) did not present such a propensity. Moreover, we identified mutations that should destabilize and/or prevent the formation of the Der p 5 dimeric structure. The production of these mutated proteins could help us to understand the role of the dimerization process in the allergic response induced by Der p 5, and if Der p 5 and Der f 5 behave

  13. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  14. Rapid Development of Bioinformatics Education in China

    ERIC Educational Resources Information Center

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  15. Biology in 'silico': The Bioinformatics Revolution.

    ERIC Educational Resources Information Center

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  16. A Mathematical Optimization Problem in Bioinformatics

    ERIC Educational Resources Information Center

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
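
    The dynamic-programming formulation mentioned in this record can be illustrated with a minimal sketch of global alignment (the Needleman-Wunsch scheme). This is not the article's own code: the function name and the scoring values (match, mismatch, gap) are assumptions chosen only for illustration.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming (Needleman-Wunsch style).

    Scoring values are illustrative assumptions, not taken from the article.
    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

    Under these assumed scores, aligning "AC" with "A" yields the pair "AC" / "A-": one match and one gap.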

  17. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  18. Fuzzy Logic in Medicine and Bioinformatics

    PubMed Central

    Torres, Angela; Nieto, Juan J.

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes). PMID:16883057

  19. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    PubMed

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  20. Mathematics and evolutionary biology make bioinformatics education comprehensible

    PubMed Central

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621
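
    As a small illustration of the tree enumeration mentioned in this record, the sketch below counts distinct labelled binary tree topologies using the standard double-factorial formulas: (2n-5)!! unrooted and (2n-3)!! rooted trees for n taxa. The function names are hypothetical and are not taken from the BioQUEST tools.

```python
def odd_double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1, for odd k >= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa):
    """Number of unrooted binary tree topologies on n labelled taxa: (2n-5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    return odd_double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa):
    """Number of rooted binary tree topologies on n labelled taxa: (2n-3)!!"""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted binary tree")
    return odd_double_factorial(2 * n_taxa - 3)
```

    The counts grow very quickly: ten taxa already admit 2,027,025 unrooted topologies, which is why exhaustive tree search is impractical beyond small data sets.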

  1. Technical phosphoproteomic and bioinformatic tools useful in cancer research

    PubMed Central

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  2. Bioinformatics clouds for big data manipulation

    PubMed Central

    2012-01-01

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475

  3. Design and bioinformatics analysis of genome-wide CLIP experiments

    PubMed Central

    Wang, Tao; Xiao, Guanghua; Chu, Yongjun; Zhang, Michael Q.; Corey, David R.; Xie, Yang

    2015-01-01

    The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP–RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses. PMID:25958398

  4. Accuracy of functional surfaces on comparatively modeled protein structures

    PubMed Central

    Zhao, Jieling; Dundas, Joe; Kachalo, Sema; Ouyang, Zheng; Liang, Jie

    2012-01-01

    Identification and characterization of protein functional surfaces are important for predicting protein function, understanding enzyme mechanism, and docking small compounds to proteins. As the rapid speed of accumulation of protein sequence information far exceeds that of structures, constructing accurate models of protein functional surfaces and identifying their key elements become increasingly important. A promising approach is to build comparative models from sequences using known structural templates such as those obtained from structural genome projects. Here we assess how well this approach works in modeling binding surfaces. By systematically building three-dimensional comparative models of proteins using Modeller, we determine how well functional surfaces can be accurately reproduced. We use an alpha shape based pocket algorithm to compute all pockets on the modeled structures, and conduct a large-scale computation of similarity measurements (pocket RMSD and fraction of functional atoms captured) for 26,590 modeled enzyme protein structures. Overall, we find that when the sequence fragment of the binding surfaces has more than 45% identity to that of the template protein, the modeled surfaces have on average an RMSD of 0.5 Å, and contain 48% or more of the binding surface atoms, with nearly all of the important atoms in the signatures of binding pockets captured. PMID:21541664
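
    The two similarity measurements named in this record can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the two pockets have already been superimposed and their atoms paired one-to-one, and the function names and atom-identifier scheme are hypothetical.

```python
import math


def pocket_rmsd(coords_model, coords_reference):
    """Root-mean-square deviation over paired atom coordinates.

    Assumes both lists hold (x, y, z) tuples in the same atom order and that
    the structures are already superimposed; no fitting is performed here.
    """
    if len(coords_model) != len(coords_reference) or not coords_model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_model, coords_reference):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_model))


def fraction_captured(reference_atoms, modeled_atoms):
    """Fraction of reference binding-surface atoms also present in the model."""
    reference = set(reference_atoms)
    if not reference:
        raise ValueError("reference atom set must not be empty")
    return len(reference & set(modeled_atoms)) / len(reference)
```

    For example, two paired atoms each displaced by 0.5 Å along one axis give an RMSD of 0.5 Å, the average value reported above for templates with more than 45% binding-surface identity.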

  5. Computational Biology and Bioinformatics in Nigeria

    PubMed Central

    Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-01-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310

  6. Bioinformatic challenges in targeted proteomics.

    PubMed

    Reker, Daniel; Malmström, Lars

    2012-09-01

    Selected reaction monitoring mass spectrometry is an emerging targeted proteomics technology that allows for the investigation of complex protein samples with high sensitivity and efficiency. It requires extensive knowledge about the sample for the many parameters needed to carry out the experiment to be set appropriately. Most studies today rely on parameter estimation from prior studies, public databases, or from measuring synthetic peptides. This is efficient and sound, but in absence of prior data, de novo parameter estimation is necessary. Computational methods can be used to create an automated framework to address this problem. However, the number of available applications is still small. This review aims at giving an orientation on the various bioinformatical challenges. To this end, we state the problems in classical machine learning and data mining terms, give examples of implemented solutions and provide some room for alternatives. This will hopefully lead to an increased momentum for the development of algorithms and serve the needs of the community for computational methods. We note that the combination of such methods in an assisted workflow will ease both the usage of targeted proteomics in experimental studies as well as the further development of computational approaches. PMID:22866949

  7. Identifying human MHC supertypes using bioinformatic methods.

    PubMed

    Doytchinova, Irini A; Guan, Pingping; Flower, Darren R

    2004-04-01

    Classification of MHC molecules into supertypes in terms of peptide-binding specificities is an important issue, with direct implications for the development of epitope-based vaccines with wide population coverage. In view of extremely high MHC polymorphism (948 class I and 633 class II HLA alleles) the experimental solution of this task is presently impossible. In this study, we describe a bioinformatics strategy for classifying MHC molecules into supertypes using information drawn solely from three-dimensional protein structure. Two chemometric techniques, hierarchical clustering and principal component analysis, were used independently on a set of 783 HLA class I molecules to identify supertypes based on structural similarities and molecular interaction fields calculated for the peptide binding site. Eight supertypes were defined: A2, A3, A24, B7, B27, B44, C1, and C4. The two techniques gave 77% consensus, i.e., 605 HLA class I alleles were classified in the same supertype by both methods. The proposed strategy allowed "supertype fingerprints" to be identified. Thus, the A2 supertype fingerprint is Tyr(9)/Phe(9), Arg(97), and His(114) or Tyr(116); for A3, Tyr(9)/Phe(9)/Ser(9), Ile(97)/Met(97) and Glu(114) or Asp(116); for A24, Ser(9) and Met(97); for B7, Asn(63) and Leu(81); for B27, Glu(63) and Leu(81); for B44, Ala(81); for C1, Ser(77); and for C4, Asn(77). PMID:15034046
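
    The consensus figure quoted in this record (605 of 783 alleles, about 77%) is the fraction of items that two classifications place in the same group. A minimal sketch follows; it assumes both methods' clusters have already been mapped to the same supertype names, and the function name is hypothetical.

```python
def consensus_fraction(labels_a, labels_b):
    """Fraction of items assigned the same label by two classifications.

    Assumes labels_a[i] and labels_b[i] refer to the same item and that both
    label sets use a shared naming scheme (e.g. supertype names).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    agreements = sum(1 for x, y in zip(labels_a, labels_b) if x == y)
    return agreements / len(labels_a)


# Arithmetic check of the figure reported in the abstract.
reported = 605 / 783  # roughly 0.773, i.e. 77% when rounded
```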

  8. PATRIC, the bacterial bioinformatics database and analysis resource

    PubMed Central

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  9. Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens

    PubMed Central

    Thomas, Reuben; Phuong, Jimmy; McHale, Cliona M.; Zhang, Luoping

    2012-01-01

    We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other. PMID:22851955
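
    The structural-similarity measure mentioned in this record, the Tanimoto coefficient, is the size of the intersection of two fingerprint bit sets divided by the size of their union. The sketch below is a generic illustration and does not use PubChem's fingerprint definition; the bit positions and the choice to return 0.0 for two empty fingerprints are assumptions.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention assumed here: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / len(union)
```

    Two fingerprints sharing two of four distinct bits, such as {1, 2, 3} and {2, 3, 4}, score 0.5; identical fingerprints score 1.0.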

  10. Bioinformatics and the undergraduate curriculum essay.

    PubMed

    Maloney, Mark; Parker, Jeffrey; Leblanc, Mark; Woodard, Craig T; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of bioinformatics as a new discipline has challenged many colleges and universities to keep current with their curricula, often in the face of static or dwindling resources. On the plus side, many bioinformatics modules and related databases and software programs are free and accessible online, and interdisciplinary partnerships between existing faculty members and their support staff have proved advantageous in such efforts. We present examples of strategies and methods that have been successfully used to incorporate bioinformatics content into undergraduate curricula. PMID:20810947

  11. Bioinformatics in Italy: BITS2011, the Eighth Annual Meeting of the Italian Society of Bioinformatics

    PubMed Central

    2012-01-01

    The BITS2011 meeting, held in Pisa on June 20-22, 2011, brought together more than 120 Italian researchers working in the field of Bioinformatics, as well as students in Bioinformatics, Computational Biology, Biology, Computer Sciences, and Engineering, representing a landscape of Italian bioinformatics research. This preface provides a brief overview of the meeting and introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement. PMID:22536954

  12. Identification and Analysis of Occludin Phosphosites: A Combined Mass Spectroscopy and Bioinformatics Approach

    SciTech Connect

    Sundstrom, J.; Tash, B; Murakami, T; Flanagan, J; Bewley, M; Stanley, B; Gonsar, K; Antonetti, D

    2009-01-01

    The molecular function of occludin, an integral membrane component of tight junctions, remains unclear. VEGF-induced phosphorylation sites were mapped on occludin by combining MS data analysis with bioinformatics. In vivo phosphorylation of Ser490 was validated and protein interaction studies combined with crystal structure analysis suggest that Ser490 phosphorylation attenuates the interaction between occludin and ZO-1. This study demonstrates that combining MS data and bioinformatics can successfully identify novel phosphorylation sites from limiting samples.

  13. Comparative study of medium damped and detuned linear accelerator structures

    SciTech Connect

    Jean-Francois Ostiguy et al.

    2001-08-22

    Long range wakefields are a serious concern for a future linear collider based on room temperature accelerating structures. They can be suppressed by detuning, by local damping, or by some combination of both strategies. Detuning relies on precisely phasing the contributions of the dipole modes excited by the passage of a single bunch. This is accomplished by controlling individual mode frequencies, a process which dictates individual cell dimensional tolerances. Each mode must be excited with the correct strength; this, in turn, determines cell-to-cell alignment tolerances. In contrast, in a locally damped structure, the modes are attenuated at the cell level. Clearly, mode frequencies and relative excitation become less critical in that context; mechanical fabrication tolerances can be relaxed. While local damping is ideal from the standpoint of long range wakefield suppression, this comes at the cost of reducing the shunt impedance and possibly unacceptable localized heating. Recently, the Medium Damped Structure (MDS), a compromise between detuning and local damping, has generated some interest. In this paper, we compare a hypothetical MDS to the NLC Rounded Damped Detuned Structure (RDDS) and investigate possible advantages from the standpoint of fabrication tolerances and their relation to beam stability and emittance preservation.

  14. A comparative structural study of wet and dried ettringite

    SciTech Connect

    Renaudin, G.; Filinchuk, Y.; Neubauer, J.; Goetz-Neunhoeffer, F.

    2010-03-15

    Two different techniques were used to compare structural characteristics of 'wet' ettringite (stored in the synthesis mother liquid) and 'dried' ettringite (dried to 35% relative humidity over saturated CaCl2 solution). Lattice parameters and the water content in the channel region of the structure (site occupancy factor of the water molecule not bonded to cations) as well as microstructure parameters (size and strain) were determined from a Rietveld refinement on synchrotron powder diffraction data. Local environment of sulphate anions and of the hydrogen bonding network was characterized by Raman spectroscopy. Both techniques led to the same conclusion: the 'wet' ettringite sample immersed in the mother solution from the synthesis presents similar structural features as ettringite dried to 35% relative humidity. An increase of the a lattice parameter combined with a decrease of the c lattice parameter occurs on drying. The amount of structural water, the point symmetry of sulphate and the hydrogen bond network are unchanged when passing from the wet to the dried ettringite powder. Ettringite does not form a high-hydrate polymorph in equilibrium with alkaline solution, in contrast to the AFm phases that lose water molecules on drying. According to these results we conclude that ettringite precipitated in aqueous solution at the early hydration stages is of the same chemical composition as ettringite present in the hardening concrete.

  15. Comparing molecules and solids across structural and alchemical space.

    PubMed

    De, Sandip; Bartók, Albert P; Csányi, Gábor; Ceriotti, Michele

    2016-05-18

    Evaluating the (dis)similarity of crystalline, disordered and molecular compounds is a critical step in the development of algorithms to navigate automatically the configuration space of complex materials. For instance, a structural similarity metric is crucial for classifying structures, searching chemical space for better compounds and materials, and driving the next generation of machine-learning techniques for predicting the stability and properties of molecules and materials. In the last few years several strategies have been designed to compare atomic coordination environments. In particular, the smooth overlap of atomic positions (SOAPs) has emerged as an elegant framework to obtain translation, rotation and permutation-invariant descriptors of groups of atoms, underlying the development of various classes of machine-learned inter-atomic potentials. Here we discuss how one can combine such local descriptors using a regularized entropy match (REMatch) approach to describe the similarity of both whole molecular and bulk periodic structures, introducing powerful metrics that enable the navigation of alchemical and structural complexities within a unified framework. Furthermore, using this kernel and a ridge regression method we can predict atomization energies for a database of small organic molecules with a mean absolute error below 1 kcal/mol, reaching an important milestone in the application of machine-learning techniques for the evaluation of molecular properties. PMID:27101873

  16. Comparative population structure of cavity-nesting sea ducks

    USGS Publications Warehouse

    Pearce, John M.; Eadie, John M.; Savard, Jean-Pierre L.; Christensen, Thomas K.; Berdeen, James; Taylor, Eric J.; Boyd, Sean; Einarsson, Árni

    2014-01-01

    A growing collection of mtDNA genetic information from waterfowl species across North America suggests that larger-bodied cavity-nesting species exhibit greater levels of population differentiation than smaller-bodied congeners. Although little is known about nest-cavity availability for these species, one hypothesis to explain differences in population structure is reduced dispersal tendency of larger-bodied cavity-nesting species due to limited abundance of large cavities. To investigate this hypothesis, we examined population structure of three cavity-nesting waterfowl species distributed across much of North America: Barrow's Goldeneye (Bucephala islandica), Common Goldeneye (B. clangula), and Bufflehead (B. albeola). We compared patterns of population structure using both variation in mtDNA control-region sequences and band-recovery data for the same species and geographic regions. Results were highly congruent between data types, showing structured population patterns for Barrow's and Common Goldeneye but not for Bufflehead. Consistent with our prediction, the smallest cavity-nesting species, the Bufflehead, exhibited the lowest level of population differentiation due to increased dispersal and gene flow. Results provide evidence for discrete Old and New World populations of Common Goldeneye and for differentiation of regional groups of both goldeneye species in Alaska, the Pacific Northwest, and the eastern coast of North America. Results presented here will aid management objectives that require an understanding of population delineation and migratory connectivity between breeding and wintering areas. Comparative studies such as this one highlight factors that may drive patterns of genetic diversity and population trends.

  17. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    NASA Astrophysics Data System (ADS)

    Widodo

    2015-02-01

    Indonesia has enormous biodiversity, which may contain many biomaterials for pharmaceutical application. The potential of these resources should be explored to discover new drugs for human wealth. However, screening for bioactivity using conventional methods is very expensive and time-consuming. Therefore, we developed a bioinformatics-based methodology for screening the potential of natural resources. The method rests on the observation that organisms in the same taxon tend to have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with well-known taxa that contain the active compound, as reported in published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell are examined to determine the mechanism of action of the active compound at the cellular level, as well as to predict its side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anti-cancer agent from a marine invertebrate by employing the method. The anti-cancer activity was explored based on compounds isolated from the marine invertebrate, as reported in published articles and databases; we then identified the protein target, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate, and then examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity for source of drug and another

  18. CAPweb: a bioinformatics CGH array Analysis Platform.

    PubMed

    Liva, Stéphane; Hupé, Philippe; Neuvial, Pierre; Brito, Isabel; Viara, Eric; La Rosa, Philippe; Barillot, Emmanuel

    2006-07-01

    Assessing variations in DNA copy number is crucial for understanding constitutional or somatic diseases, particularly cancers. The recently developed array-CGH (comparative genomic hybridization) technology allows this to be investigated at the genomic level. We report the availability of a web tool for analysing array-CGH data. CAPweb (CGH array Analysis Platform on the Web) is intended as a user-friendly tool enabling biologists to completely analyse CGH arrays from the raw data to the visualization and biological interpretation. The user typically performs the following bioinformatics steps of a CGH array project within CAPweb: the secure upload of the results of CGH array image analysis and of the array annotation (genomic position of the probes); first level analysis of each array, including automatic normalization of the data (for correcting experimental biases), breakpoint detection and status assignment (gain, loss or normal); validation or deletion of the analysis based on a summary report and quality criteria; visualization and biological analysis of the genomic profiles and results through a user-friendly interface. CAPweb is accessible at http://bioinfo.curie.fr/CAPweb. PMID:16845053
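
    The first-level analysis described above (normalization followed by status assignment) can be sketched in a few lines. This is a deliberately simplified illustration, assuming median centering of log2 ratios and fixed thresholds; the function names and the threshold values are hypothetical, breakpoint detection is omitted, and nothing here is taken from CAPweb's actual implementation.

```python
from statistics import median

def normalize(log2_ratios):
    """Center log2 ratios on their median (a simple stand-in for bias correction)."""
    center = median(log2_ratios)
    return [r - center for r in log2_ratios]

def assign_status(log2_ratios, gain=0.25, loss=-0.25):
    """Label each probe as 'gain', 'loss' or 'normal' after normalization.

    The thresholds are illustrative assumptions, not CAPweb defaults.
    """
    statuses = []
    for r in normalize(log2_ratios):
        if r >= gain:
            statuses.append("gain")
        elif r <= loss:
            statuses.append("loss")
        else:
            statuses.append("normal")
    return statuses
```

    A real pipeline would assign status per segment after breakpoint detection rather than per probe; the per-probe version is kept only to show the flow from raw ratios to a gain/loss/normal call.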

  19. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    PubMed

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progress made by Chinese bioinformaticians in the field of PTM bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. Through comparing similar studies in China and other countries, we demonstrate both advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China. PMID:26351162

  20. High-throughput protein analysis integrating bioinformatics and experimental assays.

    PubMed

    del Val, Coral; Mehrle, Alexander; Falkenhahn, Mechthild; Seiler, Markus; Glatting, Karl-Heinz; Poustka, Annemarie; Suhai, Sandor; Wiemann, Stefan

    2004-01-01

    The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administrating relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins. PMID:14762202

  1. Bioinformatics process management: information flow via a computational journal

    PubMed Central

    Feagan, Lance; Rohrer, Justin; Garrett, Alexander; Amthauer, Heather; Komp, Ed; Johnson, David; Hock, Adam; Clark, Terry; Lushington, Gerald; Minden, Gary; Frost, Victor

    2007-01-01

    This paper presents the Bioinformatics Computational Journal (BCJ), a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread–ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples. PMID:18053179

  2. Carving a niche: establishing bioinformatics collaborations

    PubMed Central

    Lyon, Jennifer A.; Tennant, Michele R.; Messner, Kevin R.; Osterbur, David L.

    2006-01-01

    Objectives: The paper describes collaborations and partnerships developed between library bioinformatics programs and other bioinformatics-related units at four academic institutions. Methods: A call for information on bioinformatics partnerships was made via email to librarians who have participated in the National Center for Biotechnology Information's Advanced Workshop for Bioinformatics Information Specialists. Librarians from Harvard University, the University of Florida, the University of Minnesota, and Vanderbilt University responded and expressed willingness to contribute information on their institutions, programs, services, and collaborating partners. Similarities and differences in programs and collaborations were identified. Results: The four librarians have developed partnerships with other units on their campuses that can be categorized into the following areas: knowledge management, instruction, and electronic resource support. All primarily support freely accessible electronic resources, while other campus units deal with fee-based ones. These demarcations are apparent in resource provision as well as in subsequent support and instruction. Conclusions and Recommendations: Through environmental scanning and networking with colleagues, librarians who provide bioinformatics support can develop fruitful collaborations. Visibility is key to building collaborations, as is broad-based thinking in terms of potential partners. PMID:16888668

  3. Comparative sequence-structure analysis of Aves insulin

    PubMed Central

    Islam, Md Mirazul; Aktaruzzaman, M; Mohamed, Zahurin

    2015-01-01

    Normal blood glucose level depends on the availability of insulin and its ability to bind the insulin receptor (IR), which regulates the downstream signaling pathway. Insulin sequence and blood glucose level usually vary among animals due to species specificity. The study of genetic variation of insulin, blood glucose level and the development of diabetic symptoms in Aves is interesting because their optimal blood glucose level is higher than that of mammals. Therefore, it is of interest to study its evolutionary relationship with mammals using sequence data. Hence, we compiled 32 Aves insulin sequences from GenBank to compare their sequence-structure features with phylogeny for evolutionary inference. The analysis shows long conserved motifs (about 14 residues) for functional inference. These sequences show high leucine content (20%) with a high instability index (>40). Amino acid positions 11, 14, 16 and 20 are variable and may contribute to binding to the IR. We identified functionally critical variable residues in the dataset for possible genetic implication. Structural models of these sequences were developed for surface analysis towards functional representation. These data find application in the understanding of insulin function across species. PMID:25848166
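
    Two of the sequence features mentioned above, leucine content and variable alignment positions, are simple to compute. The following sketch assumes the sequences are already aligned to equal length; the function names and the toy sequences in the usage note are hypothetical and are not the Aves insulin data from the study.

```python
def leucine_fraction(seq):
    """Fraction of residues in a protein sequence that are leucine (L)."""
    return seq.upper().count("L") / len(seq)

def variable_positions(aligned):
    """Return 1-based alignment columns where the sequences are not identical.

    `aligned` is a list of equal-length strings (an assumed pre-computed
    alignment); a column counts as variable if it holds more than one symbol.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i in range(length) if len({s[i] for s in aligned}) > 1]
```

    For the toy alignment ["GLLAL", "GLLAV", "GLIAL"], the first sequence has a leucine fraction of 0.6 and the variable columns are 3 and 5.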

  4. Non-structural carbohydrates in woody plants compared among laboratories.

    PubMed

    Quentin, Audrey G; Pinkard, Elizabeth A; Ryan, Michael G; Tissue, David T; Baggett, L Scott; Adams, Henry D; Maillard, Pascale; Marchand, Jacqueline; Landhäusser, Simon M; Lacointe, André; Gibon, Yves; Anderegg, William R L; Asao, Shinichi; Atkin, Owen K; Bonhomme, Marc; Claye, Caroline; Chow, Pak S; Clément-Vidal, Anne; Davies, Noel W; Dickman, L Turin; Dumbur, Rita; Ellsworth, David S; Falk, Kristen; Galiano, Lucía; Grünzweig, José M; Hartmann, Henrik; Hoch, Günter; Hood, Sharon; Jones, Joanna E; Koike, Takayoshi; Kuhlmann, Iris; Lloret, Francisco; Maestro, Melchor; Mansfield, Shawn D; Martínez-Vilalta, Jordi; Maucourt, Mickael; McDowell, Nathan G; Moing, Annick; Muller, Bertrand; Nebauer, Sergio G; Niinemets, Ülo; Palacio, Sara; Piper, Frida; Raveh, Eran; Richter, Andreas; Rolland, Gaëlle; Rosas, Teresa; Saint Joanis, Brigitte; Sala, Anna; Smith, Renee A; Sterck, Frank; Stinziano, Joseph R; Tobias, Mari; Unda, Faride; Watanabe, Makoto; Way, Danielle A; Weerasinghe, Lasantha K; Wild, Birgit; Wiley, Erin; Woodruff, David R

    2015-11-01

    Non-structural carbohydrates (NSC) in plant tissue are frequently quantified to make inferences about plant responses to environmental conditions. Laboratories publishing estimates of NSC of woody plants use many different methods to evaluate NSC. We asked whether NSC estimates in the recent literature could be quantitatively compared among studies. We also asked whether any differences among laboratories were related to the extraction and quantification methods used to determine starch and sugar concentrations. These questions were addressed by sending sub-samples collected from five woody plant tissues, which varied in NSC content and chemical composition, to 29 laboratories. Each laboratory analyzed the samples with their laboratory-specific protocols, based on recent publications, to determine concentrations of soluble sugars, starch and their sum, total NSC. Laboratory estimates differed substantially for all samples. For example, estimates for Eucalyptus globulus leaves (EGL) varied from 23 to 116 (mean = 56) mg g⁻¹ for soluble sugars, 6-533 (mean = 94) mg g⁻¹ for starch and 53-649 (mean = 153) mg g⁻¹ for total NSC. Mixed model analysis of variance showed that much of the variability among laboratories was unrelated to the categories we used for extraction and quantification methods (method category R² = 0.05-0.12 for soluble sugars, 0.10-0.33 for starch and 0.01-0.09 for total NSC). For EGL, the difference between the highest and lowest least squares means for categories in the mixed model analysis was 33 mg g⁻¹ for total NSC, compared with the range of laboratory estimates of 596 mg g⁻¹. Laboratories were reasonably consistent in their ranks of estimates among tissues for starch (r = 0.41-0.91), but less so for total NSC (r = 0.45-0.84) and soluble sugars (r = 0.11-0.83). Our results show that NSC estimates for woody plant tissues cannot be compared among laboratories. The relative changes in NSC between treatments measured within a laboratory

  5. Bioinformatics Approaches to Classifying Allergens and Predicting Cross-Reactivity

    PubMed Central

    Schein, Catherine H.; Ivanciuc, Ovidiu; Braun, Werner

    2007-01-01

    The major advances in understanding why patients respond to several seemingly different stimuli have been through the isolation, sequencing and structural analysis of proteins that induce an IgE response. The most significant finding is that allergenic proteins from very different sources can have nearly identical sequences and structures, and that this similarity can account for clinically observed cross-reactivity. The increasing amount of information on the sequence, structure and IgE epitopes of allergens is now available in several databases and powerful bioinformatics search tools allow user access to relevant information. Here, we provide an overview of these databases and describe state-of-the art bioinformatics tools to identify the common proteins that may be at the root of multiple allergy syndromes. Progress has also been made in quantitatively defining characteristics that discriminate allergens from non-allergens. Search and software tools for this purpose have been developed and implemented in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/). SDAP contains information for over 800 allergens and extensive bibliographic references in a relational database with links to other publicly available databases. SDAP is freely available on the Web to clinicians and patients, and can be used to find structural and functional relations among known allergens and to identify potentially cross-reacting antigens. Here we illustrate how these bioinformatics tools can be used to group allergens, and to detect areas that may account for common patterns of IgE binding and cross-reactivity. Such results can be used to guide treatment regimens for allergy sufferers. PMID:17276876

  6. BioZone: Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    SciTech Connect

    Liu, L; Buttler, D; Paques, H; Pu, C; Critchlow

    2002-01-28

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.

  7. BioZoom: Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    SciTech Connect

    Liu, L; Buttler, D; Critchlow, T J; Han, W; Paques, H; Pu, C; Rocco, D

    2003-01-09

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.
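
    The query routing idea, matching a query against independently maintained source-capability profiles and pruning candidates level by level, can be sketched as follows. This is an illustration under stated assumptions only: the profile fields, the three pruning levels and all names are hypothetical and are not taken from the system described in the record.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical capability profile maintained for one data source."""
    name: str
    data_types: frozenset      # e.g. {"protein", "genome"}
    operations: frozenset      # e.g. {"homology", "keyword"}
    available: bool = True

@dataclass(frozen=True)
class Query:
    """Hypothetical query: one data type plus the operations it requires."""
    data_type: str
    operations: frozenset

def route(query, profiles):
    """Return the names of sources that survive every pruning level."""
    # Level 1: keep sources that hold the requested kind of data.
    candidates = [p for p in profiles if query.data_type in p.data_types]
    # Level 2: keep sources that support every required operation.
    candidates = [p for p in candidates if query.operations <= p.operations]
    # Level 3: keep sources that are currently reachable.
    return [p.name for p in candidates if p.available]
```

    Ordering the levels from coarse to fine means the more detailed checks run only on the sources that survived the earlier ones, which is the point of pruning progressively rather than evaluating every condition against every source.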

  8. BioShaDock: a community driven bioinformatics shared Docker-based tools registry

    PubMed Central

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community. PMID:26913191

  9. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    PubMed

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community. PMID:26913191

  10. Structure, function and evolution of the gas exchangers: comparative perspectives

    PubMed Central

    Maina, JN

    2002-01-01

    Over the evolutionary continuum, animals have faced similar fundamental challenges of acquiring molecular oxygen for aerobic metabolism. Under limitations and constraints imposed by factors such as phylogeny, behaviour, body size and environment, they have responded differently in founding optimal respiratory structures. A quintessence of the aphorism that ‘necessity is the mother of invention’, gas exchangers have been inaugurated through stiff cost–benefit analyses that have evoked transaction of trade-offs and compromises. Cogent structural–functional correlations occur in constructions of gas exchangers: within and between taxa, morphological complexity and respiratory efficiency increase with metabolic capacities and oxygen needs. Highly active, small endotherms have relatively better-refined gas exchangers compared with large, inactive ectotherms. Respiratory structures have developed from the plain cell membrane of the primeval prokaryotic unicells to complex multifunctional ones of the modern Metazoa. Regarding the respiratory medium used to extract oxygen from, animal life has had only two choices – water or air – within the biological range of temperature and pressure the only naturally occurring respirable fluids. In rarer cases, certain animals have adapted to using both media. Gills (evaginated gas exchangers) are the primordial respiratory organs: they are the archetypal water breathing organs. Lungs (invaginated gas exchangers) are the model air breathing organs. Bimodal (transitional) breathers occupy the water–air interface. Presentation and exposure of external (water/air) and internal (haemolymph/blood) respiratory media, features determined by geometric arrangement of the conduits, are important features for gas exchange efficiency: counter-current, cross-current, uniform pool and infinite pool designs have variably developed. PMID:12430953

  11. Bioinformatics Approaches for Predicting Disordered Protein Motifs.

    PubMed

    Bhowmick, Pallab; Guharoy, Mainak; Tompa, Peter

    2015-01-01

    Short, linear motifs (SLiMs) in proteins are functional microdomains consisting of contiguous residue segments along the protein sequence, typically not more than 10 consecutive amino acids in length with less than 5 defined positions. Many positions are 'degenerate' thus offering flexibility in terms of the amino acid types allowed at those positions. Their short length and degenerate nature confers evolutionary plasticity meaning that SLiMs often evolve convergently. Further, SLiMs have a propensity to occur within intrinsically unstructured protein segments and this confers versatile functionality to unstructured regions of the proteome. SLiMs mediate multiple types of protein interactions based on domain-peptide recognition and guide functions including posttranslational modifications, subcellular localization of proteins, and ligand binding. SLiMs thus behave as modular interaction units that confer versatility to protein function and SLiM-mediated interactions are increasingly being recognized as therapeutic targets. In this chapter we start with a brief description about the properties of SLiMs and their interactions and then move on to discuss algorithms and tools including several web-based methods that enable the discovery of novel SLiMs (de novo motif discovery) as well as the prediction of novel occurrences of known SLiMs. Both individual amino acid sequences as well as sets of protein sequences can be scanned using these methods to obtain statistically overrepresented sequence patterns. Lists of putatively functional SLiMs are then assembled based on parameters such as evolutionary sequence conservation, disorder scores, structural data, gene ontology terms and other contextual information that helps to assess the functional credibility or significance of these motifs. These bioinformatics methods should certainly guide experiments aimed at motif discovery. PMID:26387106
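
    Predicting novel occurrences of a known SLiM often starts with a plain pattern scan over protein sequences. The sketch below shows that step with a regular expression; the function name, the example pattern and the toy sequence are illustrative assumptions, and a real workflow would go on to filter hits by conservation, disorder scores and other context as described above.

```python
import re

def find_motif(pattern, sequences):
    """Scan named sequences for a regex motif, reporting overlapping hits.

    `sequences` maps a sequence name to its amino acid string. Each hit is
    returned as (name, 1-based start position, matched segment).
    """
    # A zero-width lookahead lets matches overlap, which matters for short,
    # degenerate motifs that can occur at adjacent positions.
    lookahead = re.compile(f"(?=({pattern}))")
    hits = []
    for name, seq in sequences.items():
        for match in lookahead.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits
```

    For example, scanning the toy sequence "PAPPAPP" with the degenerate pattern "P..P" (two defined positions, two wildcards) reports overlapping hits at positions 1, 3 and 4.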

  12. Study on the Response Coefficient of Setback Structures Compared to Regular Moment Frame Structures

    SciTech Connect

    Mirghaderi, S. Rasoul; Khafaf, Bardia; Epackachi, Siamak

    2008-07-08

    In the design practice of many countries, seismic analysis and proportioning of structures are usually based on linear elastic analysis, with seismic forces reduced by the response coefficient, R. Setback structures are among the most popular shapes of constructed buildings. In setback structures, the shape and proportions of the building have a major effect on the distribution of earthquake forces as they work their way through the building. Geometric configuration also has a profound effect on the structural-dynamic response of a building. Therefore, when a building has irregular features, such as asymmetry in height or vertical discontinuity, the traditional assumptions used in developing seismic criteria for regular buildings may not be applicable. The inelastic seismic behavior of these types of structures seems to be quite different from that of regular steel moment-resisting structures, in which the overall ductility is localized at beam ends. To investigate the seismic behavior and estimate the response coefficient of these structures, nonlinear static (pushover) analyses are used for three categories of setback structures, namely low-rise, medium-rise and high-rise buildings with different setbacks along their height. The response coefficients are calculated and compared with those of regular moment frame structures.

  13. Applications of Support Vector Machines In Chemo And Bioinformatics

    NASA Astrophysics Data System (ADS)

    Jayaraman, V. K.; Sundararajan, V.

    2010-10-01

    Conventional linear and nonlinear tools for classification, regression and data-driven modeling are rapidly being replaced by newer techniques and tools based on artificial intelligence and machine learning. While the linear techniques are not applicable to inherently nonlinear problems, newer methods serve as attractive alternatives for solving real-life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and the structural risk minimization principle. SVM regression closely follows the classification methodology. In this work, recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.
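
    For orientation, the soft-margin SVM training problem as it is usually stated in the general literature is reproduced below. This is the textbook formulation, not necessarily the notation used by the authors; the symbols (weight vector w, offset b, slack variables \xi_i, penalty C, feature map \phi, and labels y_i in {-1, +1}) are assumptions introduced here.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{subject to}\quad & y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,
\qquad \xi_i \ge 0, \qquad i = 1,\dots,n
\end{aligned}
```

    The first term limits model capacity by maximizing the margin and the second penalizes training errors, which is how the structural risk minimization principle enters the formulation.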

  14. KDE Bioscience: platform for bioinformatics analysis workflows.

    PubMed

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards combined with unfriendly interfaces makes it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a Java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or 3rd-party viewers are built in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research. PMID:16260186

  15. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  16. Bioinformatics: A History of Evolution "In Silico"

    ERIC Educational Resources Information Center

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  17. "Extreme Programming" in a Bioinformatics Class

    ERIC Educational Resources Information Center

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  18. Implementing bioinformatic workflows within the bioextract server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  19. Bioinformatics in Undergraduate Education: Practical Examples

    ERIC Educational Resources Information Center

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  20. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    EPA Science Inventory

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  1. 2010 Translational bioinformatics year in review

    PubMed Central

    Miller, Katharine S

    2011-01-01

    A review of 2010 research in translational bioinformatics provides much to marvel at. We have seen notable advances in personal genomics, pharmacogenetics, and sequencing. At the same time, the infrastructure for the field has burgeoned. While acknowledging that, according to researchers, the members of this field tend to be overly optimistic, the authors predict a bright future. PMID:21672905

  2. A novel approach to represent and compare RNA secondary structures

    PubMed Central

    Mattei, Eugenio; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela

    2014-01-01

    Structural information is crucial in ribonucleic acid (RNA) analysis and functional annotation; nevertheless, how to include such structural data is still a debated problem. Dot-bracket notation is the most common and simple representation for RNA secondary structures, but its simplicity also leads to ambiguity that requires further processing steps to resolve. Here we present BEAR (Brand nEw Alphabet for RNA), a new context-aware structural encoding represented by a string of characters. Each character in BEAR encodes a specific secondary structure element (loop, stem, bulge and internal loop) with a specific length. Furthermore, exploiting this informative and yet simple encoding in multiple alignments of related RNAs, we captured how much structural variation is tolerated in RNA families and converted it into transition rates among secondary structure elements. This allowed us to compute a substitution matrix for secondary structure elements called MBR (Matrix of BEAR-encoded RNA secondary structures), whose ability to align RNA secondary structures we tested. We propose BEAR and the MBR as powerful resources for RNA secondary structure analysis, comparison and classification, motif finding and phylogeny. PMID:24753415
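
    To make the point about dot-bracket notation concrete, the sketch below shows the kind of processing step the notation requires before any structural element can be identified: a dot "." says only that a base is unpaired, and the matching partner of each bracket has to be recovered with a stack. The function name is hypothetical, and this is a generic dot-bracket parser, not the BEAR encoding itself.

```python
def dot_bracket_to_pairs(structure):
    """Return the sorted list of 0-based (i, j) base pairs encoded by a
    dot-bracket string that uses only '(', ')' and '.'."""
    open_positions = []  # stack of indices of unmatched '('
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced ')' at position %d" % index)
            # The most recent unmatched '(' pairs with this ')'.
            pairs.append((open_positions.pop(), index))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if open_positions:
        raise ValueError("unbalanced '(' at position %d" % open_positions[-1])
    return sorted(pairs)


if __name__ == "__main__":
    print(dot_bracket_to_pairs("((..))"))
```

    Only after this step can one decide whether a given unpaired base sits in a hairpin loop, a bulge or an internal loop, which is the context that a per-character encoding such as the one described in the abstract aims to carry directly.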

  3. Navigating the changing learning landscape: perspective from bioinformatics.ca

    PubMed Central

    Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  4. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    PubMed

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  5. Agile parallel bioinformatics workflow management using Pwrake

    PubMed Central

    2011-01-01

    Background: In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings: We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions: Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability

  6. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Xiang, Fang; Ningqiu, Li; Xiaozhe, Fu; Kaibin, Li; Qiang, Lin; Lihui, Liu; Cunbin, Shi; Shuqin, Wu

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement for high-performance computers rather than common personal computers for constructing a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects. PMID:26351170

  7. Component-Based Approach for Educating Students in Bioinformatics

    ERIC Educational Resources Information Center

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  8. Combining Bioinformatics and Phylogenetics to Identify Large Sets of Single-Copy Orthologous Genes (COSII) for Comparative, Evolutionary and Systematic Studies: A Test Case in the Euasterid Plant Clade

    PubMed Central

    Wu, Feinan; Mueller, Lukas A.; Crouzillat, Dominique; Pétiard, Vincent; Tanksley, Steven D.

    2006-01-01

    We report herein the application of a set of algorithms to identify a large number (2869) of single-copy orthologs (COSII), which are shared by most, if not all, euasterid plant species as well as the model species Arabidopsis. Alignments of the orthologous sequences across multiple species enabled the design of “universal PCR primers,” which can be used to amplify the corresponding orthologs from a broad range of taxa, including those lacking any sequence databases. Functional annotation revealed that these conserved, single-copy orthologs encode a higher-than-expected frequency of proteins transported and utilized in organelles and a paucity of proteins associated with cell walls, protein kinases, transcription factors, and signal transduction. The enabling power of this new ortholog resource was demonstrated in phylogenetic studies, as well as in comparative mapping across the plant families tomato (family Solanaceae) and coffee (family Rubiaceae). The combined results of these studies provide compelling evidence that (1) the ancestral species that gave rise to the core euasterid families Solanaceae and Rubiaceae had a basic chromosome number of x = 11 or 12, and (2) no whole-genome duplication event (i.e., polyploidization) occurred immediately prior to or after the radiation of either Solanaceae or Rubiaceae, as has been recently suggested. PMID:16951058

  9. Quantifying variances in comparative RNA secondary structure prediction

    PubMed Central

    2013-01-01

    Background: With the advancement of next-generation sequencing and transcriptomics technologies, regulatory effects involving RNA, in particular RNA structural changes are being detected. These results often rely on RNA secondary structure predictions. However, current approaches to RNA secondary structure modelling produce predictions with a high variance in predictive accuracy, and we have little quantifiable knowledge about the reasons for these variances. Results: In this paper we explore a number of factors which can contribute to poor RNA secondary structure prediction quality. We establish a quantified relationship between alignment quality and loss of accuracy. Furthermore, we define two new measures to quantify uncertainty in alignment-based structure predictions. One of the measures improves on the “reliability score” reported by PPfold, and considers alignment uncertainty as well as base-pair probabilities. The other measure considers the information entropy for SCFGs over a space of input alignments. Conclusions: Our predictive accuracy improves on the PPfold reliability score. We can successfully characterize many of the underlying reasons for and variances in poor prediction. However, there is still variability unaccounted for, which we therefore suggest comes from the RNA secondary structure predictive model itself. PMID:23634662
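
    The abstract does not define its entropy-based measure, so the following is only a generic sketch of how information entropy quantifies uncertainty in a prediction: it computes the Shannon entropy, in bits, of a discrete distribution such as the probabilities that one base pairs with each candidate partner or stays unpaired. The function name and the example numbers are assumptions, not values or definitions from the paper.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy in bits of a discrete distribution given as
    non-negative weights; the weights are normalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    entropy = 0.0
    for weight in weights:
        if weight > 0:  # terms with zero probability contribute nothing
            p = weight / total
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # Hypothetical pairing distributions for a single base:
    confident = [0.97, 0.01, 0.01, 0.01]   # one partner dominates
    uncertain = [0.25, 0.25, 0.25, 0.25]   # no partner is preferred
    print(shannon_entropy(confident), shannon_entropy(uncertain))
```

    Low entropy corresponds to a confident prediction and high entropy to an uncertain one, which is the general intuition behind entropy-style reliability measures.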

  10. Can bioinformatics help in the identification of moonlighting proteins?

    PubMed

    Hernández, Sergio; Calvo, Alejandra; Ferragut, Gabriela; Franco, Luís; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2014-12-01

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large amounts of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and they are a good complement to the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place. PMID:25399591
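
    The abstract does not say which statistic underlies its mutation correlation analysis. As one common and simple choice, the sketch below computes the mutual information between two columns of a multiple sequence alignment; the function name and the toy alignments are hypothetical, and real analyses typically add corrections for phylogeny and small sample size that are omitted here.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information in bits between columns i and j (0-based) of an
    alignment given as a list of equal-length sequences."""
    n = len(alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    counts_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        # Each observed residue pair contributes p(a,b) * log2(p(a,b) / (p(a) p(b))).
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    covarying = ["AC", "AC", "GT", "GT"]    # column 1 changes whenever column 0 does
    independent = ["AC", "AT", "GC", "GT"]  # columns vary independently
    print(column_mutual_information(covarying, 0, 1))
    print(column_mutual_information(independent, 0, 1))
```

    High mutual information between two positions is consistent with co-evolution, but it needs many diverse sequences to be meaningful, which fits the abstract's remark that the approach applies only in restricted situations.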

  11. Bioinformatic characterization of plant networks

    SciTech Connect

    McDermott, Jason E.; Samudrala, Ram

    2008-06-30

    Cells and organisms are governed by networks of interactions, genetic, physical and metabolic. Large-scale experimental studies of interactions between components of biological systems have been performed for a variety of eukaryotic organisms. However, there is a dearth of such data for plants. Computational methods for prediction of relationships between proteins, primarily based on comparative genomics, provide a useful systems-level view of cellular functioning and can be used to extend information about other eukaryotes to plants. We have predicted networks for Arabidopsis thaliana, Oryza sativa indica and japonica and several plant pathogens using the Bioverse (http://bioverse.compbio.washington.edu) and show that they are similar to experimentally-derived interaction networks. Predicted interaction networks for plants can be used to provide novel functional annotations and predictions about plant phenotypes and aid in rational engineering of biosynthesis pathways.

  12. Comparing two tetraalkylammonium ionic liquids. I. Liquid phase structure

    NASA Astrophysics Data System (ADS)

    Lima, Thamires A.; Paschoal, Vitor H.; Faria, Luiz F. O.; Ribeiro, Mauro C. C.; Giles, Carlos

    2016-06-01

    X-ray scattering experiments at room temperature were performed for the ionic liquids n-butyl-trimethylammonium bis(trifluoromethanesulfonyl)imide, [N1114][NTf2], and methyl-tributylammonium bis(trifluoromethanesulfonyl)imide, [N1444][NTf2]. The peak in the diffraction data characteristic of charge ordering in [N1444][NTf2] is shifted to longer distances in comparison to [N1114][NTf2], but the peak characteristic of short-range correlations is shifted in [N1444][NTf2] to shorter distances. Molecular dynamics (MD) simulations were performed for these ionic liquids using force fields available from the literature, although with new sets of partial charges for [N1114]+ and [N1444]+ proposed in this work. The shifting of charge and adjacency peaks to opposite directions in these ionic liquids was found in the static structure factor, S(k), calculated by MD simulations. Despite differences in cation sizes, the MD simulations unravel that anions are allowed as close to [N1444]+ as to [N1114]+ because anions are located in between the angle formed by the butyl chains. The more asymmetric molecular structure of the [N1114]+ cation implies differences in partial structure factors calculated for atoms belonging to polar or non-polar parts of [N1114][NTf2], whereas polar and non-polar structure factors are essentially the same in [N1444][NTf2]. Results of this work shed light on controversies in the literature on the liquid structure of tetraalkylammonium based ionic liquids.

  13. Comparing two tetraalkylammonium ionic liquids. I. Liquid phase structure.

    PubMed

    Lima, Thamires A; Paschoal, Vitor H; Faria, Luiz F O; Ribeiro, Mauro C C; Giles, Carlos

    2016-06-14

    X-ray scattering experiments at room temperature were performed for the ionic liquids n-butyl-trimethylammonium bis(trifluoromethanesulfonyl)imide, [N1114][NTf2], and methyl-tributylammonium bis(trifluoromethanesulfonyl)imide, [N1444][NTf2]. The peak in the diffraction data characteristic of charge ordering in [N1444][NTf2] is shifted to longer distances in comparison to [N1114][NTf2], but the peak characteristic of short-range correlations is shifted in [N1444][NTf2] to shorter distances. Molecular dynamics (MD) simulations were performed for these ionic liquids using force fields available from the literature, although with new sets of partial charges for [N1114](+) and [N1444](+) proposed in this work. The shifting of charge and adjacency peaks to opposite directions in these ionic liquids was found in the static structure factor, S(k), calculated by MD simulations. Despite differences in cation sizes, the MD simulations unravel that anions are allowed as close to [N1444](+) as to [N1114](+) because anions are located in between the angle formed by the butyl chains. The more asymmetric molecular structure of the [N1114](+) cation implies differences in partial structure factors calculated for atoms belonging to polar or non-polar parts of [N1114][NTf2], whereas polar and non-polar structure factors are essentially the same in [N1444][NTf2]. Results of this work shed light on controversies in the literature on the liquid structure of tetraalkylammonium based ionic liquids. PMID:27306015

  14. A toolbox for developing bioinformatics software.

    PubMed

    Rother, Kristian; Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M

    2012-03-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  15. Novel bioinformatic developments for exome sequencing.

    PubMed

    Lelieveld, Stefan H; Veltman, Joris A; Gilissen, Christian

    2016-06-01

    With the widespread adoption of next generation sequencing technologies by the genetics community and the rapid decrease in costs per base, exome sequencing has become a standard within the repertoire of genetic experiments for both research and diagnostics. Although bioinformatics now offers standard solutions for the analysis of exome sequencing data, many challenges still remain; especially the increasing scale at which exome data are now being generated has given rise to novel challenges in how to efficiently store, analyze and interpret exome data of this magnitude. In this review we discuss some of the recent developments in bioinformatics for exome sequencing and the directions that this is taking us to. With these developments, exome sequencing is paving the way for the next big challenge, the application of whole genome sequencing. PMID:27075447

  16. A toolbox for developing bioinformatics software

    PubMed Central

    Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.

    2012-01-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  17. Bioinformatics in New Generation Flavivirus Vaccines

    PubMed Central

    Koraka, Penelope; Martina, Byron E. E.; Osterhaus, Albert D. M. E.

    2010-01-01

    Flavivirus infections are the most prevalent arthropod-borne infections worldwide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bioinformatics is applied to assist in the rational design and improvement of vaccines, particularly flavivirus vaccines, are presented and discussed. PMID:20467477

  18. Discovery and Classification of Bioinformatics Web Services

    SciTech Connect

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  19. Translational bioinformatics applications in genome medicine

    PubMed Central

    2009-01-01

    Although investigators using methodologies in bioinformatics have always been useful in genomic experimentation in analytic, engineering, and infrastructure support roles, only recently have bioinformaticians been able to have a primary scientific role in asking and answering questions on human health and disease. Here, I argue that this shift in role towards asking questions in medicine is now the next step needed for the field of bioinformatics. I outline four reasons why bioinformaticians are newly enabled to drive the questions in primary medical discovery: public availability of data, intersection of data across experiments, commoditization of methods, and streamlined validation. I also list four recommendations for bioinformaticians wishing to get more involved in translational research. PMID:19566916

  20. How do disordered regions achieve comparable functions to structured domains?

    PubMed Central

    Latysheva, Natasha S; Flock, Tilman; Weatheritt, Robert J; Chavali, Sreenivas; Babu, M Madan

    2015-01-01

    The traditional structure to function paradigm conceives of a protein's function as emerging from its structure. In recent years, it has been established that unstructured, intrinsically disordered regions (IDRs) in proteins are equally crucial elements for protein function, regulation and homeostasis. In this review, we provide a brief overview of how IDRs can perform similar functions to structured proteins, focusing especially on the formation of protein complexes and assemblies and the mediation of regulated conformational changes. In addition to highlighting instances of such functional equivalence, we explain how differences in the biological and physicochemical properties of IDRs allow them to expand the functional and regulatory repertoire of proteins. We also discuss studies that provide insights into how mutations within functional regions of IDRs can lead to human diseases. PMID:25752799

  1. Why Polyphenols have Promiscuous Actions? An Investigation by Chemical Bioinformatics.

    PubMed

    Tang, Guang-Yan

    2016-05-01

    Despite their diverse pharmacological effects, polyphenols are poor candidates for use as drugs, which has been traditionally ascribed to their low bioavailability. However, Baell and co-workers recently proposed that the redox potential of polyphenols also plays an important role in this, because redox reactions bring promiscuous actions on various protein targets and thus produce non-specific pharmacological effects. To investigate whether redox reactivity is a critical factor in polyphenol promiscuity, we performed a chemical bioinformatics analysis on the structure-activity relationships of twenty polyphenols. It was found that the gene expression profiles of human cell lines induced by polyphenols were not correlated with the presence or absence of redox moieties in the polyphenols, but were significantly correlated with their molecular structures. Therefore, it is concluded that the promiscuous actions of polyphenols are likely to result from their inherent structural features rather than their redox potential. PMID:27319142

  2. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of the reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase in accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  3. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  4. Application of Bioinformatics in Chronobiology Research

    PubMed Central

    Lopes, Robson da Silva; Resende, Nathalia Maria; Honorio-França, Adenilda Cristina; França, Eduardo Luzía

    2013-01-01

    Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research. PMID:24187519

  5. Bioinformatics on the cloud computing platform Azure.

    PubMed

    Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  6. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    PubMed

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes, including behavior-based profiles, will contribute to the transition from disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors is needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talk among pathways in various brain regions involved in disorders such as Alzheimer's disease. PMID:22933157

  7. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing. PMID:27217183

  8. Bioinformatics on the Cloud Computing Platform Azure

    PubMed Central

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  9. Comparative static curing versus dynamic curing on tablet coating structures.

    PubMed

    Gendre, Claire; Genty, Muriel; Fayard, Barbara; Tfayli, Ali; Boiret, Mathieu; Lecoq, Olivier; Baron, Michel; Chaminade, Pierre; Péan, Jean Manuel

    2013-09-10

    Curing is generally required to stabilize film coatings from aqueous polymer dispersions. This post-coating drying step is traditionally carried out in static conditions, requiring the transfer of solid dosage forms to an oven. However, a curing operation performed directly inside the coating equipment would be an attractive industrial application. Recently, the use of various advanced physico-chemical characterization techniques, i.e., X-ray micro-computed tomography, vibrational spectroscopies (near infrared and Raman) and X-ray microdiffraction, allowed new insights into the film-coating structures of dynamically cured tablets. The dynamic curing end-point was efficiently determined after 4h. The aim of the present work was to elucidate the influence of curing conditions on film-coating structures. Results demonstrated that 24h of static curing and 4h of dynamic curing, both performed at 60°C and ambient relative humidity, led to similar coating layers in terms of drug release properties, porosity, water content, structural rearrangement of polymer chains and crystalline distribution. Furthermore, X-ray microdiffraction measurements pointed out different crystalline coating compositions depending on sample storage time. An aging mechanism might have occurred during storage, resulting in the crystallization and the upward migration of cetyl alcohol, coupled to the downward migration of crystalline sodium lauryl sulfate within the coating layer. Interestingly, this new study clearly provided further insight into film-coating structures after a curing step and confirmed that the curing operation could be performed in dynamic conditions. PMID:23792043

  10. Comparative Effectiveness of Contextual and Structural Method of Teaching Vocabulary

    ERIC Educational Resources Information Center

    Behlol, Malik; Kaini, Mohammad Munir

    2011-01-01

    The study was conducted to find out the effectiveness of contextual and structural methods of teaching vocabulary in English at the secondary level. It was an experimental study in which the pretest-posttest design was used. The population of the study was the students of secondary classes studying in Government secondary schools of Rawalpindi District.…

  11. The Structure of Women's Employment in Comparative Perspective

    ERIC Educational Resources Information Center

    Pettit, Becky; Hook, Jennifer Lynn

    2005-01-01

    In this paper we analyze social survey data from 19 countries using multi-level modeling methods in an effort to synthesize structural and institutional accounts for variation in women's employment. Observed demographic characteristics show much consistency in their relationship to women's employment across countries, yet there is significant…

  12. Comparative structural biology of eubacterial and archaeal oligosaccharyltransferases.

    PubMed

    Maita, Nobuo; Nyirenda, James; Igura, Mayumi; Kamishikiryo, Jun; Kohda, Daisuke

    2010-02-12

    Oligosaccharyltransferase (OST) catalyzes the transfer of an oligosaccharide from a lipid donor to an asparagine residue in nascent polypeptide chains. In the bacterium Campylobacter jejuni, a single-subunit membrane protein, PglB, catalyzes N-glycosylation. We report the 2.8 Å resolution crystal structure of the C-terminal globular domain of PglB and its comparison with the previously determined structure from the archaeon Pyrococcus AglB. The two distantly related oligosaccharyltransferases share unexpected structural similarity beyond that expected from the sequence comparison. The common architecture of the putative catalytic sites revealed a new catalytic motif in PglB. Site-directed mutagenesis analyses confirmed the contribution of this motif to the catalytic function. Bacterial PglB and archaeal AglB constitute a protein family of the catalytic subunit of OST along with STT3 from eukaryotes. A structure-aided multiple sequence alignment of the STT3/PglB/AglB protein family revealed three types of OST catalytic centers. This novel classification will provide a useful framework for understanding the enzymatic properties of the OST enzymes from Eukarya, Archaea, and Bacteria. PMID:20007322

  13. Structural and Social Psychological Correlates of Prisonization: A Comparative Analysis.

    ERIC Educational Resources Information Center

    Thomas, Charles W.; And Others

    This study considers some aspects of "prisonization," or the process by which inmates adapt to confinement. Specifically, it further examines two ideas suggested by earlier studies. One is the belief that the structural characteristics of many prisons promote rather than inhibit assimilation into an inmate normative system that is opposed to the…

  14. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology.

    PubMed

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-12-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology. PMID:26753026

  15. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology

    PubMed Central

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-01-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology. PMID:26753026

  16. Bioinformatics for transporter pharmacogenomics and systems biology: data integration and modeling with UML.

    PubMed

    Yan, Qing

    2010-01-01

    Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter . The methodology developed here can also be used for other biomedical fields. PMID:20419428

  17. Quantum Bio-Informatics IV

    NASA Astrophysics Data System (ADS)

    Accardi, Luigi; Freudenberg, Wolfgang; Ohya, Masanori

    2011-01-01

    Use of cryptographic ideas to interpret biological phenomena (and vice versa) / M. Regoli -- Discrete approximation to operators in white noise analysis / Si Si -- Bogoliubov type equations via infinite-dimensional equations for measures / V. V. Kozlov and O. G. Smolyanov -- Analysis of several categorical data using measure of proportional reduction in variation / K. Yamamoto ... [et al.] -- The electron reservoir hypothesis for two-dimensional electron systems / K. Yamada ... [et al.] -- On the correspondence between Newtonian and functional mechanics / E. V. Piskovskiy and I. V. Volovich -- Quantile-quantile plots: An approach for the inter-species comparison of promoter architecture in eukaryotes / K. Feldmeier ... [et al.] -- Entropy type complexities in quantum dynamical processes / N. Watanabe -- A fair sampling test for Ekert protocol / G. Adenier, A. Yu. Khrennikov and N. Watanabe -- Brownian dynamics simulation of macromolecule diffusion in a protocell / T. Ando and J. Skolnick -- Signaling network of environmental sensing and adaptation in plants: Key roles of calcium ion / K. Kuchitsu and T. Kurusu -- NetzCope: A tool for displaying and analyzing complex networks / M. J. Barber, L. Streit and O. Strogan -- Study of HIV-1 evolution by coding theory and entropic chaos degree / K. Sato -- The prediction of botulinum toxin structure based on in silico and in vitro analysis / T. Suzuki and S. Miyazaki -- On the mechanism of D-wave high T[symbol] superconductivity by the interplay of Jahn-Teller physics and Mott physics / H. Ushio, S. Matsuno and H. Kamimura.

  18. Entropyology: the application of bioinformatics and data modeling to digital virus and malware recognition

    NASA Astrophysics Data System (ADS)

    Jaenisch, Holger M.; Handley, James W.

    2010-04-01

    Malware are analogs of viruses. Viruses are composed of large numbers of polypeptide proteins. The shape and function of the protein strands determine the functionality of the segment, similar to a subroutine in malware. The full combination of subroutines is the malware organism, in the same way that a collection of polypeptides forms protein structures that are information bearing. We propose to apply the methods of Bioinformatics to analyze malware, providing a rich feature set for creating a unique and novel detection and classification scheme that was originally applied to Bioinformatics amino acid sequencing. Our proposed methods enable real time in situ (in contrast to in vivo) detection applications.

  19. Associations between Input and Outcome Variables in an Online High School Bioinformatics Instructional Program

    NASA Astrophysics Data System (ADS)

    Lownsbery, Douglas S.

    Quantitative data from a completed year of an innovative online high school bioinformatics instructional program were analyzed as part of a descriptive research study. The online instructional program provided the opportunity for high school students to develop content understandings of molecular genetics and to use sophisticated bioinformatics tools and methodologies to conduct authentic research. Quantitative data were analyzed to identify potential associations between independent program variables including implementation setting, gender, and student educational backgrounds and dependent variables indicating success in the program including completion rates for analyzing DNA clones and performance gains from pre-to-post assessments of bioinformatics knowledge. Study results indicate that understanding associations between student educational backgrounds and level of success may be useful for structuring collaborative learning groups and enhancing scaffolding and support during the program to promote higher levels of success for participating students.

  20. Parallel algorithm research on several important open problems in bioinformatics.

    PubMed

    Niu, Bei-Fang; Lang, Xian-Yu; Lu, Zhong-Hua; Chi, Xue-Bin

    2009-09-01

    High performance computing has opened the door to using bioinformatics and systems biology to explore complex relationships among data, and created the opportunity to tackle very large and involved simulations of biological systems. Many supercomputing centers have jumped on the bandwagon because the opportunities for significant impact in this field are vast. Development of new algorithms, especially parallel algorithms and software to mine new biological information and to assess different relationships among the members of a large biological data set, is becoming very important. This article presents our work on the design and development of parallel algorithms and software to solve some important open problems arising from bioinformatics, such as structure alignment of RNA sequences, finding new genes, alternative splicing, gene expression clustering and so on. In order to make this parallel software available to a wide audience, grid computing service interfaces to the software have been deployed in China National Grid (CNGrid). Finally, conclusions and some future research directions are presented. PMID:20640837

  1. Evolving Strategies for the Incorporation of Bioinformatics within the Undergraduate Cell Biology Curriculum

    ERIC Educational Resources Information Center

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…

  2. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    ERIC Educational Resources Information Center

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  3. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  4. Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.

    1996-01-01

    Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess the performance of eight different optimizers through the development of a computer code, CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from their performance on these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the Sequential Unconstrained Minimization Technique, SUMT) outperformed the others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints, but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization, and alleviating it can improve the efficiency of the optimizers.

  5. Shared bioinformatics databases within the Unipro UGENE platform.

    PubMed

    Protsyuk, Ivan V; Grekhov, German A; Tiunov, Alexey V; Fursov, Mikhail Y

    2015-01-01

    Unipro UGENE is an open-source bioinformatics toolkit that integrates popular tools along with original instruments for molecular biologists within a unified user interface. Nowadays, most bioinformatics desktop applications, including UGENE, make use of a local data model while processing different types of data. Such an approach causes an inconvenience for scientists working cooperatively and relying on the same data. This refers to the need of making multiple copies of certain files for every workplace and maintaining synchronization between them in case of modifications. Therefore, we focused on delivering a collaborative work into the UGENE user experience. Currently, several UGENE installations can be connected to a designated shared database and users can interact with it simultaneously. Such databases can be created by UGENE users and be used at their discretion. Objects of each data type, supported by UGENE such as sequences, annotations, multiple alignments, etc., can now be easily imported from or exported to a remote storage. One of the main advantages of this system, compared to existing ones, is the almost simultaneous access of client applications to shared data regardless of their volume. Moreover, the system is capable of storing millions of objects. The storage itself is a regular database server so even an inexpert user is able to deploy it. Thus, UGENE may provide access to shared data for users located, for example, in the same laboratory or institution. UGENE is available at: http://ugene.net/download.html. PMID:26527191

  6. Bioinformatics-Driven New Immune Target Discovery in Disease.

    PubMed

    Yang, C; Chen, P; Zhang, W; Du, H

    2016-08-01

    Biomolecular network analysis has been widely applied in the discovery of cancer driver genes and molecular mechanism anatomization of many diseases on the genetic level. However, the application of such approach in the potential antigen discovery of autoimmune diseases remains largely unexplored. Here, we describe a previously uncharacterized region, with disease-associated autoantigens, to build antigen networks with three bioinformatics tools, namely NetworkAnalyst, GeneMANIA and ToppGene. First, we identified histone H2AX as an antigen of systemic lupus erythematosus by comparing highly ranked genes from all the built network-derived gene lists, and then a new potential biomarker for Behcet's disease, heat shock protein HSP 90-alpha (HSP90AA1), was further screened out. Moreover, 130 confirmed patients were enrolled and a corresponding enzyme-linked immunosorbent assay, mass spectrum analysis and immunoprecipitation were performed to further confirm the bioinformatics results with real-world clinical samples in succession. Our findings demonstrate that the combination of multiple molecular network approaches is a promising tool to discover new immune targets in diseases. PMID:27226232

  7. Comparative structural and optical properties of different ceria nanoparticles.

    PubMed

    Nikolic, A S; Boskovic, M; Fabian, M; Bozanic, D K; Vucinic-Vasic, M; Kremenovic, A; Antic, B

    2013-10-01

    Herein a comparative study of five nanocrystalline cerium oxides (CeO2-δ) synthesised by different methods and calcined at 500 °C is reported. XRPD analysis showed that the stoichiometry parameter δ, crystallite size/strain and lattice constant were only slightly affected by the method utilized. All ceria nanoparticles are nearly spherical in shape with faceted morphology, free of defects and with a relatively uniform size distribution. The average microstrain was found to be approximately 10 times higher than that of the bulk counterpart. The absorption edge of the nanocrystalline materials was shifted towards higher wavelengths (red shift) in comparison with the bulk counterpart, and band gap values were in the range 2.7-3.24 eV (3.33 eV for the bulk counterpart). PMID:24245144
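
    As a rough illustration of the red shift reported in this abstract, a band gap energy E can be converted to the corresponding absorption-edge wavelength with λ = hc/E. The sketch below is not taken from the paper: the function name is hypothetical, and the only inputs are the band gap values quoted in the abstract together with the standard constant hc ≈ 1239.84 eV·nm.

```python
# Illustrative sketch only (not the authors' method): convert a band gap
# in eV to the corresponding absorption-edge wavelength in nm using
# lambda = h*c / E.
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


def edge_wavelength_nm(band_gap_ev: float) -> float:
    """Return the absorption-edge wavelength (nm) for a band gap (eV)."""
    if band_gap_ev <= 0:
        raise ValueError("band gap must be positive")
    return HC_EV_NM / band_gap_ev


if __name__ == "__main__":
    # Band gap values quoted in the abstract above.
    for label, gap in [("bulk", 3.33), ("nano, upper", 3.24), ("nano, lower", 2.7)]:
        print(f"{label}: {gap} eV -> {edge_wavelength_nm(gap):.1f} nm")
```

    Because wavelength is inversely proportional to energy, the smaller band gaps of the nanocrystalline samples map to longer edge wavelengths than the bulk value, which is consistent with the red shift described.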

  8. Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

    PubMed

    Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

    2016-01-01

    One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. PMID:27238013

  9. InCoB2010 - 9th International Conference on Bioinformatics at Tokyo, Japan, September 26-28, 2010

    PubMed Central

    2010-01-01

    The International Conference on Bioinformatics (InCoB), the annual conference of the Asia-Pacific Bioinformatics Network (APBioNet), is hosted in one of the countries of the Asia-Pacific region. The 2010 conference was awarded to Japan and has attracted more than one hundred high-quality research paper submissions. Thorough peer reviewing resulted in 47 (43.5%) accepted papers out of 108 submissions. Submissions from Japan, R.O. Korea, P.R. China, Australia, Singapore and the U.S.A. totaled 43.8% and contributed to 57.4% of accepted papers. Manuscripts originating from Taiwan and India added up to 42.8% of submissions and 28.3% of acceptances. The fifteen articles published in this BMC Bioinformatics supplement cover disease informatics, structural bioinformatics and drug design, biological databases and software tools, signaling pathways, gene regulatory and biochemical networks, evolution and sequence analysis. PMID:21106116

  10. Is racism dead? Comparing (expressive) means and (structural equation) models.

    PubMed

    Leach, C W; Peng, T R; Volckens, J

    2000-09-01

    Much scholarship suggests that racism--belief in out-group inferiority--is unrelated to contemporary attitudes. Purportedly, a new form of racism, one which relies upon a belief in cultural difference, has become a more acceptable basis for such attitudes. The authors argue that an appropriate empirical assessment of racism (both 'old' and 'new') depends upon (1) clear conceptualization and operationalization, and (2) attention to both mean-level expression and explanatory value in structural equation models. This study assessed the endorsement of racism and belief in cultural difference as well as their association with a measure of general attitude in a secondary analysis of parallel representative surveys of attitudes toward different ethnic out-groups in France, The Netherlands, Western Germany and Britain (N = 3242; see Reif & Melich, 1991). For six of the seven out-group targets, racism was strongly related to ethnic majority attitudes, despite low mean-level endorsement. In a pattern consistent with a 'new', indirect racism, the relationship between British racism and attitudes toward Afro-Caribbeans was mediated by belief in cultural difference. PMID:11041013