Science.gov

Sample records for comparative structural bioinformatics

  1. A Comparative Structural Bioinformatics Analysis of the Insulin Receptor Family Ectodomain Based on Phylogenetic Information

    PubMed Central

    Rentería, Miguel E.; Gandhi, Neha S.; Vinuesa, Pablo; Helmerhorst, Erik; Mancera, Ricardo L.

    2008-01-01

    The insulin receptor (IR), the insulin-like growth factor 1 receptor (IGF1R) and the insulin receptor-related receptor (IRR) are covalently linked homodimers made up of several structural domains. The molecular mechanism of ligand binding to the ectodomain of these receptors and the resulting activation of their tyrosine kinase domain is still not well understood. We have carried out an amino acid residue conservation analysis in order to reconstruct the phylogeny of the IR Family. We have confirmed the location of ligand binding site 1 of the IGF1R and IR. Importantly, we have also predicted the likely location of the insulin binding site 2 on the surface of the fibronectin type III domains of the IR. An evolutionarily conserved surface on the second leucine-rich domain that may interact with the ligand could not be detected. We suggest a possible mechanical trigger of the activation of the IR that involves a slight ‘twist’ rotation of the last two fibronectin type III domains in order to face the likely location of insulin. Finally, a strong selective pressure was found amongst the IRR orthologous sequences, suggesting that this orphan receptor has a yet unknown physiological role which may be conserved from amphibians to mammals. PMID:18989367

  2. Structural bioinformatics of the human spliceosomal proteome

    PubMed Central

    Korneta, Iga; Magnus, Marcin; Bujnicki, Janusz M.

    2012-01-01

    In this work, we describe the results of a comprehensive structural bioinformatics analysis of the spliceosomal proteome. We used fold recognition analysis to complement prior data on the ordered domains of 252 human splicing proteins. Examples of newly identified domains include a PWI domain in the U5 snRNP protein 200K (hBrr2, residues 258–338), while examples of previously known domains with a newly determined fold include the DUF1115 domain of the U4/U6 di-snRNP protein 90K (hPrp3, residues 540–683). We also established a non-redundant set of experimental models of spliceosomal proteins, as well as constructed in silico models for regions without an experimental structure. The combined set of structural models is available for download. Altogether, over 90% of the ordered regions of the spliceosomal proteome can be represented structurally with a high degree of confidence. We analyzed the reduced spliceosomal proteome of the intron-poor organism Giardia lamblia, and as a result, we proposed a candidate set of ordered structural regions necessary for a functional spliceosome. The results of this work will aid experimental and structural analyses of the spliceosomal proteins and complexes, and can serve as a starting point for multiscale modeling of the structure of the entire spliceosome. PMID:22573172

  3. NMR structure improvement: A structural bioinformatics & visualization approach

    NASA Astrophysics Data System (ADS)

    Block, Jeremy N.

    The overall goal of this project is to enhance the physical accuracy of individual models in macromolecular NMR (Nuclear Magnetic Resonance) structures and the realism of variation within NMR ensembles of models, while improving agreement with the experimental data. A secondary overall goal is to combine synergistically the best aspects of NMR and crystallographic methodologies to better illuminate the underlying joint molecular reality. This is accomplished by using the powerful method of all-atom contact analysis (describing detailed sterics between atoms, including hydrogens); new graphical representations and interactive tools in 3D and virtual reality; and structural bioinformatics approaches to the expanded and enhanced data now available. The resulting better descriptions of macromolecular structure and its dynamic variation enhance the effectiveness of the many biomedical applications that depend on detailed molecular structure, such as mutational analysis, homology modeling, molecular simulations, protein design, and drug design.

  4. Bioinformatic Analysis of Toll-Like Receptor Sequences and Structures.

    PubMed

    Monie, Tom P; Gay, Nicholas J; Gangloff, Monique

    2016-01-01

    Continual advancements in computing power and sophistication, coupled with rapid increases in protein sequence and structural information, have made bioinformatic tools an invaluable resource for the molecular and structural biologist. With the degree of sequence information continuing to expand at an almost exponential rate, it is essential that scientists today have a basic understanding of how to utilise, manipulate and analyse this information for the benefit of their own experiments. In the context of Toll-Interleukin I Receptor domain containing proteins, we describe here a series of the more common and user-friendly bioinformatic tools available as Internet-based resources. These will enable the identification and alignment of protein sequences; the identification of functional motifs; the characterisation of protein secondary structure; the identification of protein structural folds and distantly homologous proteins; and the validation of the structural geometry of modelled protein structures. PMID:26803620

  5. Teaching Structural Bioinformatics at the Undergraduate Level

    ERIC Educational Resources Information Center

    Centeno, Nuria B.; Villa-Freixa, Jordi; Oliva, Baldomero

    2003-01-01

    Understanding the basic principles of structural biology is becoming a major subject of study in most undergraduate level programs in biology. In the genomic and proteomic age, it is becoming indispensable for biology students to master concepts related to the sequence and structure of proteins in order to develop skills that may be useful in a…

  6. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation to the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to a reference annotation.
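    Comparing gene predictions against a reference annotation, as this record describes, can be sketched as a set comparison of feature coordinates. The snippet below is a minimal illustration only: the function name, the exact-match criterion and the toy coordinates are assumptions made here, not the method or data of the cited study.

```python
def compare_annotations(predicted, reference):
    """Exact-match comparison of two collections of (seqid, start, end, strand)."""
    pred, ref = set(predicted), set(reference)
    shared = len(pred & ref)  # features with identical coordinates in both sets
    return {
        "shared": shared,
        # fraction of reference features recovered by the predictor
        "sensitivity": shared / len(ref) if ref else 0.0,
        # fraction of predicted features that are present in the reference
        "precision": shared / len(pred) if pred else 0.0,
    }

# Hypothetical toy features, invented for illustration only.
reference = [("chr1", 100, 900, "+"), ("chr1", 1500, 2400, "-"),
             ("chr2", 50, 700, "+"), ("chr2", 3000, 3900, "+")]
predicted = [("chr1", 100, 900, "+"), ("chr1", 1500, 2300, "-"),
             ("chr2", 50, 700, "+")]

summary = compare_annotations(predicted, reference)
```

    Exact coordinate matching is the strictest possible criterion; a real comparison would more likely also score partial overlaps and evaluate exons and whole genes separately.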

  7. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    ERIC Educational Resources Information Center

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  8. Comparative modeling of proteins: a method for engaging students' interest in bioinformatics tools.

    PubMed

    Badotti, Fernanda; Barbosa, Alan Sales; Reis, André Luiz Martins; do Valle, Italo Faria; Ambrósio, Lara; Bitar, Mainá

    2014-01-01

    The huge increase in data being produced in the genomic era has produced a need to incorporate computers into the research process. Sequence generation, its subsequent storage, interpretation, and analysis are now entirely computer-dependent tasks. Universities from all over the world have been challenged to seek a way of encouraging students to acquire computational and bioinformatics skills from the undergraduate level onward in order to understand biological processes. The aim of this article is to report the experience of awakening students' interest in bioinformatics tools during a course focused on comparative modeling of proteins. The authors start by giving a full description of the course environmental context and students' backgrounds. Then they detail each class and present a general overview of the protein modeling protocol. The positive and negative aspects of the course are also reported, and some of the results generated in class and in projects outside the classroom are discussed. In the last section of the article, general perspectives about the course from students' point of view are given. This work can serve as a guide for professors who teach subjects for which bioinformatics tools are useful and for universities that plan to incorporate bioinformatics into the curriculum. PMID:24167006

  9. Abstractions, algorithms and data structures for structural bioinformatics in PyCogent

    PubMed Central

    Cieślik, Marcin; Derewenda, Zygmunt S.; Mura, Cameron

    2011-01-01

    To facilitate flexible and efficient structural bioinformatics analyses, new functionality for three-dimensional structure processing and analysis has been introduced into PyCogent – a popular feature-rich framework for sequence-based bioinformatics, but one which has lacked equally powerful tools for handling structural/coordinate-based data. Extensible Python modules have been developed, which provide object-oriented abstractions (based on a hierarchical representation of macromolecules), efficient data structures (e.g. kD-trees), fast implementations of common algorithms (e.g. surface-area calculations), read/write support for Protein Data Bank-related file formats and wrappers for external command-line applications (e.g. Stride). Integration of this code into PyCogent is symbiotic, allowing sequence-based work to benefit from structure-derived data and, reciprocally, enabling structural studies to leverage PyCogent’s versatile tools for phylogenetic and evolutionary analyses. PMID:22479120
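    The kD-tree mentioned in this record is a data structure for fast spatial queries such as "which atoms lie within a given distance of this point". The sketch below is a plain-Python illustration written for this listing. It does not use or reproduce PyCogent's API, and the function names and coordinates are hypothetical.

```python
import math

def build_kdtree(points, depth=0):
    """Build a 3-D k-d tree; a node is (point, left_subtree, right_subtree)."""
    if not points:
        return None
    axis = depth % 3                      # cycle through x, y, z
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2               # the median point becomes the node
    return (ordered[mid],
            build_kdtree(ordered[:mid], depth + 1),
            build_kdtree(ordered[mid + 1:], depth + 1))

def within_radius(node, center, radius, depth=0, found=None):
    """Collect every stored point within `radius` of `center`."""
    if found is None:
        found = []
    if node is None:
        return found
    point, left, right = node
    if math.dist(point, center) <= radius:
        found.append(point)
    # Signed distance from the query point to this node's splitting plane.
    offset = center[depth % 3] - point[depth % 3]
    if offset <= radius:                  # query sphere reaches the low side
        within_radius(left, center, radius, depth + 1, found)
    if offset >= -radius:                 # query sphere reaches the high side
        within_radius(right, center, radius, depth + 1, found)
    return found

# Hypothetical atom coordinates in angstroms, invented for illustration.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0),
         (3.0, 3.0, 3.0), (1.0, 1.0, 1.0)]
tree = build_kdtree(atoms)
neighbours = sorted(within_radius(tree, (0.0, 0.0, 0.0), 1.8))
```

    Subtrees whose splitting plane lies farther from the query point than the search radius are skipped, which is what makes the query cheaper than testing every atom.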

  10. Human, vector and parasite Hsp90 proteins: A comparative bioinformatics analysis

    PubMed Central

    Faya, Ngonidzashe; Penkler, David L.; Tastan Bishop, Özlem

    2015-01-01

    The treatment of protozoan parasitic diseases is challenging, and thus identification and analysis of new drug targets is important. Parasites survive within host organisms, and some need intermediate hosts to complete their life cycle. Changing host environment puts stress on parasites, and often adaptation is accompanied by the expression of large amounts of heat shock proteins (Hsps). Among Hsps, Hsp90 proteins play an important role in stress environments. Yet, there has been little computational research on Hsp90 proteins to analyze them comparatively as potential parasitic drug targets. Here, an attempt was made to gain detailed insights into the differences between host, vector and parasitic Hsp90 proteins by large-scale bioinformatics analysis. A total of 104 Hsp90 sequences were divided into three groups based on their cellular localizations; namely cytosolic, mitochondrial and endoplasmic reticulum (ER). Further, the parasitic proteins were divided according to the type of parasite (protozoa, helminth and ectoparasite). Primary sequence analysis, phylogenetic tree calculations, motif analysis and physicochemical properties of Hsp90 proteins suggested that despite the overall structural conservation of these proteins, parasitic Hsp90 proteins have unique features which differentiate them from human ones, thus encouraging the idea that protozoan Hsp90 proteins should be further analyzed as potential drug targets. PMID:26793431

  11. Computer programming and biomolecular structure studies: A step beyond internet bioinformatics.

    PubMed

    Likić, Vladimir A

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled Biomolecular Structure and Bioinformatics. Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics approach that relies on access to the Internet and biological databases. This was an ambitious approach considering that the students mostly had a biological background. There were also time constraints of eight lectures in total and two accompanying practical sessions. The main challenge was that students had to be introduced to computer programming from a beginner level and in a short time provided with enough knowledge to independently solve a simple bioinformatics problem. This was accomplished with a problem directly relevant to the rest of the subject, concerned with the structure-function relationships and experimental techniques for the determination of macromolecular structure. PMID:21638623

  12. NETTAB 2014: From high-throughput structural bioinformatics to integrative systems biology.

    PubMed

    Romano, Paolo; Cordero, Francesca

    2016-01-01

    The fourteenth NETTAB workshop, NETTAB 2014, was devoted to a range of disciplines going from structural bioinformatics, to proteomics and to integrative systems biology. The topics of the workshop were centred around bioinformatics methods, tools, applications, and perspectives for models, standards and management of high-throughput biological data, structural bioinformatics, functional proteomics, mass spectrometry, drug discovery, and systems biology. 43 scientific contributions were presented at NETTAB 2014, including keynote, special guest and tutorial talks, oral communications, and posters. Full papers from some of the best contributions presented at the workshop were later submitted to a special Call for this Supplement. Here, we provide an overview of the workshop and introduce manuscripts that have been accepted for publication in this Supplement. PMID:26960985

  13. Meet me halfway: when genomics meets structural bioinformatics.

    PubMed

    Gong, Sungsam; Worth, Catherine L; Cheng, Tammy M K; Blundell, Tom L

    2011-06-01

    The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platforms has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases, and it is even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype-phenotype relationships can be better understood. In this review, we bring this 'bottom-up' approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function. PMID:21350909

  14. Comparing bioinformatic gene expression profiling methods: microarray and RNA-Seq.

    PubMed

    Mantione, Kirk J; Kream, Richard M; Kuzelova, Hana; Ptacek, Radek; Raboch, Jiri; Samuel, Joshua M; Stefano, George B

    2014-01-01

    Understanding the control of gene expression is critical for our understanding of the relationship between genotype and phenotype. The need for reliable assessment of transcript abundance in biological samples has driven scientists to develop novel technologies such as DNA microarray and RNA-Seq to meet this demand. This review focuses on comparing the two most useful methods for whole transcriptome gene expression profiling. Microarrays are reliable and more cost effective than RNA-Seq for gene expression profiling in model organisms. RNA-Seq will eventually be used more routinely than microarray, but right now the techniques can be complementary to each other. Microarrays will not become obsolete but might be relegated to only a few uses. RNA-Seq clearly has a bright future in bioinformatic data collection. PMID:25149683

  15. AWSEM-MD: Protein Structure Prediction Using Coarse-grained Physical Potentials and Bioinformatically Based Local Structure Biasing

    PubMed Central

    Davtyan, Aram; Schafer, Nicholas P.; Zheng, Weihua; Clementi, Cecilia; Wolynes, Peter G.; Papoian, Garegin A.

    2012-01-01

    The Associative memory, Water mediated, Structure and Energy Model (AWSEM) is a coarse-grained protein force field. AWSEM contains physically motivated terms, such as hydrogen bonding, as well as a bioinformatically based local structure biasing term, which efficiently takes into account many-body effects that are modulated by the local sequence. When combined with appropriate local or global alignments to choose memories, AWSEM can be used to perform de novo protein structure prediction. Herein we present structure prediction results for a particular choice of local sequence alignment method based on short residue sequences called fragments. We demonstrate the model’s structure prediction capabilities for three levels of global homology between the target sequence and those proteins used for local structure biasing, all of which assume that the structure of the target sequence is not known. When there are no homologs in the database of structures used for local structure biasing, AWSEM calculations produce structural predictions that are somewhat improved compared with prior works using related approaches. The inclusion of a small number of structures from homologous sequences improves structure prediction only marginally but when the fragment search is restricted to only homologous sequences, AWSEM can perform high resolution structure prediction and can be used for kinetics and dynamics studies. PMID:22545654

  16. Bioinformatics and variability in drug response: a protein structural perspective

    PubMed Central

    Lahti, Jennifer L.; Tang, Grace W.; Capriotti, Emidio; Liu, Tianyun; Altman, Russ B.

    2012-01-01

    Marketed drugs frequently perform worse in clinical practice than in the clinical trials on which their approval is based. Many therapeutic compounds are ineffective for a large subpopulation of patients to whom they are prescribed; worse, a significant fraction of patients experience adverse effects more severe than anticipated. The unacceptable risk–benefit profile for many drugs mandates a paradigm shift towards personalized medicine. However, prior to adoption of patient-specific approaches, it is useful to understand the molecular details underlying variable drug response among diverse patient populations. Over the past decade, progress in structural genomics led to an explosion of available three-dimensional structures of drug target proteins while efforts in pharmacogenetics offered insights into polymorphisms correlated with differential therapeutic outcomes. Together these advances provide the opportunity to examine how altered protein structures arising from genetic differences affect protein–drug interactions and, ultimately, drug response. In this review, we first summarize structural characteristics of protein targets and common mechanisms of drug interactions. Next, we describe the impact of coding mutations on protein structures and drug response. Finally, we highlight tools for analysing protein structures and protein–drug interactions and discuss their application for understanding altered drug responses associated with protein structural variants. PMID:22552919

  17. Comparative Bioinformatics Analyses and Profiling of Lysosome-Related Organelle Proteomes

    PubMed Central

    Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy

    2007-01-01

    Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for 7 lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles. PMID:17375895

  18. Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes

    NASA Astrophysics Data System (ADS)

    Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy

    2007-01-01

    Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.

  19. Structural bioinformatics study of PNP from Schistosoma mansoni.

    PubMed

    da Silveira, Nelson José Freitas; Uchôa, Hugo Brandão; Canduri, Fernanda; Pereira, José Henrique; Camera, João Carlos; Basso, Luiz Augusto; Palma, Mário Sergio; Santos, Diógenes Santiago; de Azevedo, Walter Filgueira

    2004-09-10

    The parasite Schistosoma mansoni lacks the de novo pathway for purine biosynthesis and depends on salvage pathways for its purine requirements. Schistosomiasis is endemic in 76 countries and territories and, amongst the parasitic diseases, ranks second after malaria in terms of social and economic impact and public health importance. Purine nucleoside phosphorylase (PNP) is an attractive target for drug design and has been the subject of extensive structure-based design. The atomic coordinates of the complex of human PNP with inosine were used as a template for modeling PNP from S. mansoni complexed with inosine. Here we describe the model for the SmPNP-inosine complex and correlate the structure with differences in the affinity for inosine shown by human and S. mansoni PNPs. PMID:15313179

  20. Comparative metagenomic analysis of human gut microbiome composition using two different bioinformatic pipelines.

    PubMed

    D'Argenio, Valeria; Casaburi, Giorgio; Precone, Vincenza; Salvatore, Francesco

    2014-01-01

    Technological advances in next-generation sequencing-based approaches have greatly impacted the analysis of microbial community composition. In particular, 16S rRNA-based methods have been widely used to analyze the whole set of bacteria present in a target environment. As a consequence, several specific bioinformatic pipelines have been developed to manage these data. MetaGenome Rapid Annotation using Subsystem Technology (MG-RAST) and Quantitative Insights Into Microbial Ecology (QIIME) are two freely available tools for metagenomic analyses that have been used in a wide range of studies. Here, we report the comparative analysis of the same dataset with both QIIME and MG-RAST in order to evaluate their accuracy in taxonomic assignment and in diversity analysis. We found that taxonomic assignment was more accurate with QIIME which, at family level, assigned a significantly higher number of reads. Thus, QIIME generated a more accurate BIOM file, which in turn improved the diversity analysis output. Finally, although informatics skills are needed to install QIIME, it offers a wide range of metrics that are useful for downstream applications and, not less important, it is not dependent on server times. PMID:24719854
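    The diversity analysis this record refers to usually starts from per-taxon read counts. As a minimal sketch, assuming family-level counts are already available as a dictionary, the Shannon index can be computed as below. The function name and the toy counts are invented for illustration and are not taken from QIIME, MG-RAST or the cited study.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any taxon")
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total        # relative abundance of this taxon
            h -= p * math.log(p)
    return h

# Hypothetical family-level read counts from two pipelines run on one sample.
pipeline_a = {"Bacteroidaceae": 500, "Lachnospiraceae": 300, "Ruminococcaceae": 200}
pipeline_b = {"Bacteroidaceae": 500, "Lachnospiraceae": 300}

h_a = shannon_index(pipeline_a)
h_b = shannon_index(pipeline_b)
```

    In this toy case the pipeline that assigns reads to more families yields the higher index, which illustrates why differences in taxonomic assignment carry through to the diversity output.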

  1. Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery

    PubMed Central

    Blundell, Tom L; Sibanda, Bancinyane L; Montalvão, Rinaldo Wander; Brewerton, Suzanne; Chelliah, Vijayalakshmi; Worth, Catherine L; Harmer, Nicholas J; Davies, Owen; Burke, David

    2006-01-01

    Impressive progress in genome sequencing, protein expression and high-throughput crystallography and NMR has radically transformed the opportunities to use protein three-dimensional structures to accelerate drug discovery, but the quantity and complexity of the data have ensured a central place for informatics. Structural biology and bioinformatics have assisted in lead optimization and target identification where they have well established roles; they can now contribute to lead discovery, exploiting high-throughput methods of structure determination that provide powerful approaches to screening of fragment binding. PMID:16524830

  2. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    ERIC Educational Resources Information Center

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  3. Sequential and Structural Aspects of Antifungal Peptides from Animals, Bacteria and Fungi Based on Bioinformatics Tools.

    PubMed

    Neelabh; Singh, Karuna; Rani, Jyoti

    2016-06-01

    Emerging drug-resistant varieties and hyper-virulent strains of microorganisms have compelled the scientific community to develop more potent and less harmful therapeutics. Antimicrobial peptides could be one such therapeutic. This review is an attempt to explore antifungal peptides naturally produced by prokaryotes as well as eukaryotes. They are components of the innate immune system, providing a first line of defence against microbial attacks, especially in eukaryotes. The present article concentrates on the types, structures, sources and modes of action of gene-encoded antifungal peptides such as mammalian defensins, protegrins, tritrpticins, histatins, lactoferricins, antifungal peptides derived from birds, amphibians, insects, fungi, bacteria and their synthetic analogues such as pexiganan, omiganan, echinocandins and Novexatin. In silico drug design, a major revolution in the area of therapeutics, facilitates drug development by exploiting different bioinformatics tools. With this in view, bioinformatics tools were used to visualize the structural details of antifungal peptides and to predict their level of similarity. Current practices and recent developments in this area are also discussed briefly. PMID:27060002

  4. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    PubMed Central

    Ragothaman, Anjani; Feinstein, Wei; Jha, Shantenu; Kim, Joohyun

    2014-01-01

    While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the structural information they predict could uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as those of prokaryotes is large, typically in the many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and it is particularly amenable to small genomes such as those of prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure. PMID:24995285

  5. Comparative bioinformatics, temporal and spatial expression analyses of Ixodes scapularis organic anion transporting polypeptides

    PubMed Central

    Radulović, Željko; Porter, Lindsay M.; Kim, Tae K.; Mulenga, Albert

    2015-01-01

    Organic anion-transporting polypeptides (Oatps) are an integral part of the detoxification mechanism in vertebrates and invertebrates. These cell surface proteins are involved in mediating the sodium-independent uptake and/or distribution of a broad array of organic amphipathic compounds and xenobiotic drugs. This study describes bioinformatics and biological characterization of 9 Oatp sequences in the Ixodes scapularis genome. These sequences have been annotated on the basis of 12 transmembrane domains, consensus motif D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, and 11 conserved cysteine amino acid residues in the large extracellular loop 5 that characterize the Oatp superfamily. Ixodes scapularis Oatps may regulate non-redundant cross-tick species conserved functions in that they did not cluster as a monolithic group on the phylogeny tree and that they have orthologs in other ticks. Phylogeny clustering patterns also suggest that some tick Oatp sequences transport substrates that are similar to those of body louse, mosquito, eye worm, and filarial worm Oatps. Semi-quantitative RT-PCR analysis demonstrated that all 9 I. scapularis Oatp sequences were expressed during tick feeding. Ixodes scapularis Oatp genes potentially regulate functions during early and/or late-stage tick feeding as revealed by normalized mRNA profiles. Normalized transcript abundance indicates that I. scapularis Oatp genes are strongly expressed in unfed ticks during the first 24 h of feeding and/or at the end of the tick feeding process. Except for 2 I. scapularis Oatps, which were expressed in the salivary glands and ovaries, all other genes were expressed in all tested organs, suggesting the significance of I. scapularis Oatps in maintaining tick homeostasis. Different I. scapularis Oatp mRNA expression patterns were detected and discussed with reference to different physiological states of unfed and feeding ticks. PMID:24582512
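    The consensus motif quoted in this record, D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, can be read as a pattern in which X is any residue and each parenthesised group lists the residues allowed at one position. A minimal sketch of scanning a protein sequence for it follows. The regular-expression translation, the function name and the toy sequence are assumptions made here, not code or data from the cited study.

```python
import re

# D-X-RW-(I,V)-GAWW-X-G-(F,L)-L written as a regular expression:
# [A-Z] stands for X (any residue); [IV] and [FL] are the listed alternatives.
OATP_MOTIF = re.compile(r"D[A-Z]RW[IV]GAWW[A-Z]G[FL]L")

def find_motif(sequence):
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in OATP_MOTIF.finditer(sequence.upper())]

# Hypothetical toy sequence, invented for illustration (not a real Oatp).
toy_sequence = "MKTLLDSRWIGAWWLGFLAAPQ"
hits = find_motif(toy_sequence)
```

    A strict pattern like this reports only exact matches to the consensus; tolerating substitutions would require a profile or scoring-matrix approach rather than a regular expression.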

  6. An Introductory Bioinformatics Exercise to Reinforce Gene Structure and Expression and Analyze the Relationship between Gene and Protein Sequences

    ERIC Educational Resources Information Center

    Almeida, Craig A.; Tardiff, Daniel F.; De Luca, Jane P.

    2004-01-01

    We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces the understanding of the structure of a gene and the principles and events involved in its expression. In addition, the activity illustrates the severe effect mutations in a gene sequence can have on the protein product.…

  7. Determination of Lipid-Protein Interactions in Lung Surfactants Using Computer Simulations and Structural Bioinformatics.

    NASA Astrophysics Data System (ADS)

    Kaznessis, Yiannis

    2001-06-01

    Proteins are the primary components of the networks that conduct the flows of mass, energy and information in living organisms. The discovery of the principles of protein structure and function allows the development of design rules for biological activities. The microscopic nature of the operating mechanisms of protein activity, and the vast complexity of the networks of interaction call for the employment of powerful computational methodologies that can decipher the physicochemical and evolutionary principles underlying protein structure and function. An example will be presented that reflects the strength of computational approaches. Atomistic molecular dynamics simulations and structural bioinformatics tools are employed to investigate the interactions between the first 25 N-terminal residues of surfactant protein B (SP-B 1-25) and the lipid components of the lung surfactant (LS). An understanding of the molecular level interactions between the LS components is essential for the establishment of design rules for the development of synthetic LS and the treatment of the neonatal respiratory distress syndrome, which results from deficiency or inactivation of LS.

  8. Edge Bioinformatics

    Energy Science and Technology Software Center (ESTSC)

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human-resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  9. Edge Bioinformatics

    SciTech Connect

    Lo, Chien-Chi

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human-resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  10. Structural, bioinformatic, and in vivo analyses of two Treponema pallidum lipoproteins reveal a unique TRAP transporter

    PubMed Central

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-01-01

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP) and tp0958 (the symporter) are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of “tetratricopeptide repeat” (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPR-protein associated TRAP transporters (TPATs) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s). PMID:22306465

  11. Structural and Phylogenetic Analysis of Laccases from Trichoderma: A Bioinformatic Approach

    PubMed Central

    Cázares-García, Saila Viridiana; Vázquez-Garcidueñas, Ma. Soledad; Vázquez-Marrufo, Gerardo

    2013-01-01

    The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species, the genomes of which are publicly available, were analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the sequence of amino acids of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential. PMID:23383142

  12. Structural, Bioinformatic, and In Vivo Analyses of Two Treponema pallidum Lipoproteins Reveal a Unique TRAP Transporter

    SciTech Connect

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-05-25

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP), and tp0958 (the symporter), are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of 'tetratricopeptide repeat' (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPATs (TPR-protein-associated TRAP-Ts) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s).

  13. A Bioinformatics Approach to the Structure, Function, and Evolution of the Nucleoprotein of the Order Mononegavirales

    PubMed Central

    Cleveland, Sean B.; Davies, John; McClure, Marcella A.

    2011-01-01

    The goal of this bioinformatic study is to investigate sequence conservation in relation to evolutionary function/structure of the nucleoprotein of the order Mononegavirales. In the combined analysis of 63 representative nucleoprotein (N) sequences from four viral families (Bornaviridae, Filoviridae, Rhabdoviridae, and Paramyxoviridae) we predict the regions of protein disorder, intra-residue contact and co-evolving residues. Correlations between location and conservation of predicted regions illustrate a strong division between families while highlighting conservation within individual families. These results suggest the conserved regions among the nucleoproteins, specifically within Rhabdoviridae and Paramyxoviridae, but also generally among all members of the order, reflect an evolutionary advantage in maintaining these sites for the viral nucleoprotein as part of the transcription/replication machinery. Results indicate conservation for disorder in the C-terminus region of the representative proteins that is important for interacting with the phosphoprotein and the large subunit polymerase during transcription and replication. Additionally, the C-terminus region of the protein preceding the disordered region is predicted to be important for interacting with the encapsidated genome. Portions of the N-terminus are responsible for N∶N stability and interactions identified by the presence or lack of co-evolving intra-protein contact predictions. The validation of these prediction results by current structural information illustrates the benefits of the Disorder, Intra-residue contact and Compensatory mutation Correlator (DisICC) pipeline as a method for quickly characterizing proteins and providing the most likely residues and regions necessary to target for disruption in viruses that have little structural information available. PMID:21559282

  14. Structural Bioinformatics-Based Prediction of Exceptional Selectivity of p38 MAP Kinase Inhibitor PH-797804

    SciTech Connect

    Xing, Li; Shieh, Huey S.; Selness, Shaun R.; Devraj, Rajesh V.; Walker, John K.; Devadas, Balekudru; Hope, Heidi R.; Compton, Robert P.; Schindler, John F.; Hirsch, Jeffrey L.; Benson, Alan G.; Kurumbail, Ravi G.; Stegeman, Roderick A.; Williams, Jennifer M.; Broadus, Richard M.; Walden, Zara; Monahan, Joseph B.; Pfizer

    2009-07-24

    PH-797804 is a diarylpyridinone inhibitor of p38α mitogen-activated protein (MAP) kinase derived from a racemic mixture as the more potent atropisomer (aS), first proposed by molecular modeling and subsequently confirmed by experiments. On the basis of structural comparison with a different biaryl pyrazole template and supported by dozens of high-resolution crystal structures of p38α inhibitor complexes, PH-797804 is predicted to possess a high level of specificity across the broad human kinase genome. We used a structural bioinformatics approach to identify two selectivity elements encoded by the TXXXG sequence motif on the p38α kinase hinge: (i) Thr106 that serves as the gatekeeper to the buried hydrophobic pocket occupied by 2,4-difluorophenyl of PH-797804 and (ii) the bidentate hydrogen bonds formed by the pyridinone moiety with the kinase hinge requiring an induced 180° rotation of the Met109-Gly110 peptide bond. The peptide flip occurs in p38α kinase due to the critical glycine residue marked by its conformational flexibility. Kinome-wide sequence mining revealed rare presentation of the selectivity motif. Corroboratively, PH-797804 exhibited exceptionally high specificity against MAP kinases and the related kinases. No cross-reactivity was observed in large panels of kinase screens (selectivity ratio of >500-fold). In cellular assays, PH-797804 demonstrated superior potency and selectivity consistent with the biochemical measurements. PH-797804 has met safety criteria in human phase I studies and is under clinical development for several inflammatory conditions. Understanding the rationale for selectivity at the molecular level helps elucidate the biological function and design of specific p38α kinase inhibitors.
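
The sequence mining step mentioned in this abstract, searching kinase hinge regions for the TXXXG motif (Thr106 through Gly110 in the p38 numbering), can be illustrated with a small pattern scan. This is a sketch under stated assumptions rather than the authors' procedure: X is taken to mean any residue, the function name is hypothetical, and the hinge fragments below are invented toy strings, not real kinase sequences.

```python
import re

# T-X-X-X-G: threonine, any three residues, then glycine.
# The lookahead lets overlapping occurrences be reported as well.
TXXXG = re.compile(r"(?=(T[A-Z]{3}G))")

def scan_hinges(hinges):
    """Map each kinase name to the 1-based start positions of TXXXG matches.

    Kinases without any match are left out of the result.
    """
    result = {}
    for name, seq in hinges.items():
        starts = [m.start() + 1 for m in TXXXG.finditer(seq.upper())]
        if starts:
            result[name] = starts
    return result

# Hypothetical toy hinge fragments (not real kinase sequences).
toy_hinges = {
    "kinase_A": "LVTALMGADL",  # T at position 3, G at position 7
    "kinase_B": "LVMEYLPGGS",  # no threonine, so no match
    "kinase_C": "TTAAGGQ",     # two overlapping occurrences
}
print(scan_hinges(toy_hinges))
```

In a real survey the hinge segments would first have to be located by aligning each kinase domain, since a bare TXXXG match elsewhere in a sequence says nothing about the hinge.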

  15. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics.

    PubMed

    Veneman, Wouter J; de Sonneville, Jan; van der Kolk, Kees-Jan; Ordas, Anita; Al-Ars, Zaid; Meijer, Annemarie H; Spaink, Herman P

    2015-03-01

    We present an RNA deep sequencing (RNAseq) analysis of a comparison of the transcriptome responses to infection of zebrafish larvae with Staphylococcus epidermidis and Mycobacterium marinum bacteria. We show how our developed GeneTiles software can improve RNAseq analysis approaches by more confidently identifying a large set of markers upon infection with these bacteria. For analysis of RNAseq data currently, software programs such as Bowtie2 and Samtools are indispensable. However, these programs that are designed for a LINUX environment require some dedicated programming skills and have no options for visualisation of the resulting mapped sequence reads. Especially with large data sets, this makes the analysis time consuming and difficult for non-expert users. We have applied the GeneTiles software to the analysis of previously published and newly obtained RNAseq datasets of our zebrafish infection model, and we have shown the applicability of this approach also to published RNAseq datasets of other organisms by comparing our data with a published mammalian infection study. In addition, we have implemented the DEXSeq module in the GeneTiles software to identify genes, such as glucagon A, that are differentially spliced under infection conditions. In the analysis of our RNAseq data, this has led to the possibility to improve the size of data sets that could be efficiently compared without using problem-dedicated programs, leading to a quick identification of marker sets. Therefore, this approach will also be highly useful for transcriptome analyses of other organisms for which well-characterised genomes are available. PMID:25503064

  16. Bioinformatics investigation of therapeutic mechanisms of Xuesaitong capsule treating ischemic cerebrovascular rat model with comparative transcriptome analysis

    PubMed Central

    Liao, Jiangquan; Wei, Benjun; Chen, Hengwen; Liu, Yongmei; Wang, Jie

    2016-01-01

    Background: Xuesaitong soft capsule (XST), which consists of Panax notoginseng saponin (PNS), has been used to treat ischemic cerebrovascular diseases in China. The therapeutic mechanism of XST has not yet been elucidated from the perspective of genomics and bioinformatics. Methods: A transcriptome analysis was performed to review series concerning the middle cerebral artery occlusion (MCAO) rat model and XST intervention after MCAO from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were compared between the blank group and the model group, and between the model group and the XST group. Functional enrichment and pathway analysis were performed. A protein-protein interaction (PPI) network was constructed. The overlapping genes from the two DEG sets were screened out and analyzed in depth. Results: Two series including 22 samples were obtained. 870 DEGs were identified between the blank group and the model group, and 1189 DEGs were identified between the model group and the XST group. GO terms and KEGG pathways of MCAO and XST intervention were significantly enriched. PPI networks were constructed to demonstrate the gene-gene interactions. The overlapping genes from the two DEG sets were highlighted. ANTXR2, FHL3, PRCP, TYROBP, TAF9B, FGFR2, BCL11B, RB1CC1 and MBNL2 were the pivotal genes and possible action sites of XST therapeutic mechanisms. Conclusion: MCAO is a pathological process with multiple. PMID:27347353
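
The screening of overlapping genes between two DEG sets described in this abstract amounts to a set intersection. A minimal sketch follows, assuming each comparison has already been reduced to a list of gene symbols; the function name and the gene identifiers are hypothetical placeholders, not data from the study.

```python
def overlapping_genes(degs_first, degs_second):
    """Return the sorted gene symbols present in both DEG lists."""
    return sorted(set(degs_first) & set(degs_second))

# Hypothetical placeholder symbols standing in for the two comparisons
# (blank vs. model, and model vs. XST); not results from the study.
blank_vs_model = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
model_vs_xst = ["GENE_C", "GENE_E", "GENE_A"]

print(overlapping_genes(blank_vs_model, model_vs_xst))
```

This sketch ignores the direction of change; whether a shared gene is regulated in opposite directions in the two comparisons would need the fold-change values as well.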

  17. Structural templates for comparative protein docking

    PubMed Central

    Anishchenko, Ivan; Kundrotas, Petras J.; Tuzikov, Alexander V.; Vakser, Ilya A.

    2014-01-01

    Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, non-redundant library of templates containing 4,950 full structures of binary complexes and 5,936 protein-protein interfaces extracted from the full structures at 12Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu. PMID:25488330
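
Extracting an interface at a distance cut-off, as this abstract describes for the 12 Å interfaces, can be sketched as a pairwise distance test between two chains. The abstract does not state which atoms are compared, so the rule below (a residue belongs to the interface if any of its atoms lies within the cut-off of any atom of the partner chain) is an assumption, as are the function name, the input layout, and the toy coordinates.

```python
import math

def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Find residues of two chains that lie within `cutoff` angstroms.

    chain_a, chain_b: dict mapping residue id -> list of (x, y, z) atoms.
    Assumed rule: a residue pair is in contact if any atom pair is
    within the cutoff. Returns (residue ids in A, residue ids in B).
    """
    in_a, in_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(math.dist(p, q) <= cutoff
                   for p in atoms_a for q in atoms_b):
                in_a.add(res_a)
                in_b.add(res_b)
    return in_a, in_b

# Hypothetical toy coordinates, one atom per residue.
toy_a = {1: [(0.0, 0.0, 0.0)], 2: [(30.0, 0.0, 0.0)]}
toy_b = {10: [(5.0, 0.0, 0.0)], 11: [(50.0, 0.0, 0.0)]}
print(interface_residues(toy_a, toy_b))
```

The all-pairs loop is quadratic in the number of atoms; for a library of thousands of complexes a spatial index such as a k-d tree would be the usual choice.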

  18. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    PubMed

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  19. Using structural bioinformatics to investigate the impact of non synonymous SNPs and disease mutations: scope and limitations

    PubMed Central

    Reumers, Joke; Schymkowitz, Joost; Rousseau, Fréderic

    2009-01-01

    Background: Linking structural effects of mutations to functional outcomes is a major issue in structural bioinformatics, and many tools and studies have shown that specific structural properties such as stability and residue burial can be used to distinguish neutral variations and disease associated mutations. Results: We have investigated 39 structural properties on a set of SNPs and disease mutations from the Uniprot Knowledge Base that could be mapped on high quality crystal structures and show that none of these properties can be used as a sole classification criterion to separate the two data sets. Furthermore, we have reviewed the annotation process from mutation to result and identified the liabilities in each step. Conclusion: Although excellent annotation results of various research groups underline the great potential of using structural bioinformatics to investigate the mechanisms underlying disease, the interpretation of such annotations cannot always be extrapolated to proteome wide variation studies. Difficulties for large-scale studies can be found both on the technical level, i.e. the scarcity of data and the incompleteness of the structural tool suites, and on the conceptual level, i.e. the correct interpretation of the results in a cellular context. PMID:19758473

  20. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches

    PubMed Central

    Ozyigit, Ibrahim I.; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y.; Koc, Ibrahim; Öztürk, Münir X.; Anjum, Naser A.

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can damage cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidase (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels and could inform future studies in this direction. PMID:27047498

  1. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches.

    PubMed

    Ozyigit, Ibrahim I; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y; Koc, Ibrahim; Öztürk, Münir X; Anjum, Naser A

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can damage cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidase (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels and could inform future studies in this direction. PMID:27047498

  2. Bioinformatics based structural characterization of glucose dehydrogenase (gdh) gene and growth promoting activity of Leclercia sp. QAU-66

    PubMed Central

    Naveed, Muhammad; Ahmed, Iftikhar; Khalid, Nauman; Mumtaz, Abdul Samad

    2014-01-01

    Glucose dehydrogenase (GDH; EC 1.1.5.2) is a member of the quinoprotein group that uses the redox cofactor pyrroloquinoline quinone, calcium ions, and glucose as substrate for its activity. In the present study, Leclercia sp. QAU-66, isolated from the rhizosphere of Vigna mungo, was characterized for phosphate solubilization and the role of GDH in plant growth promotion of Phaseolus vulgaris. The strain QAU-66 was able to solubilize phosphorus and significantly (p ≤ 0.05) promoted the shoot and root lengths of Phaseolus vulgaris. The structural determination of GDH protein was carried out using bioinformatics tools such as Pfam, InterProScan, I-TASSER and COFACTOR. These tools predicted the structure-based functional homology of pyrroloquinoline quinone domains in GDH. GDH of Leclercia sp. QAU-66 is one of the main factors involved in plant growth promotion and provides a solid background for further research in plant growth promoting activities. PMID:25242947

  3. Bioinformatics based structural characterization of glucose dehydrogenase (gdh) gene and growth promoting activity of Leclercia sp. QAU-66.

    PubMed

    Naveed, Muhammad; Ahmed, Iftikhar; Khalid, Nauman; Mumtaz, Abdul Samad

    2014-01-01

    Glucose dehydrogenase (GDH; EC 1.1.5.2) is a member of the quinoprotein group that uses the redox cofactor pyrroloquinoline quinone, calcium ions, and glucose as substrate for its activity. In the present study, Leclercia sp. QAU-66, isolated from the rhizosphere of Vigna mungo, was characterized for phosphate solubilization and the role of GDH in plant growth promotion of Phaseolus vulgaris. The strain QAU-66 was able to solubilize phosphorus and significantly (p ≤ 0.05) promoted the shoot and root lengths of Phaseolus vulgaris. The structural determination of GDH protein was carried out using bioinformatics tools such as Pfam, InterProScan, I-TASSER and COFACTOR. These tools predicted the structure-based functional homology of pyrroloquinoline quinone domains in GDH. GDH of Leclercia sp. QAU-66 is one of the main factors involved in plant growth promotion and provides a solid background for further research in plant growth promoting activities. PMID:25242947

  4. Exploring the immunogenome with bioinformatics.

    PubMed

    de Bono, Bernard; Trowsdale, John

    2003-08-01

    A better description of the immune system can be afforded if the latest developments in bioinformatics are applied to integrate sequence with structure and function. Clear guidelines for the upgrade of the bioinformatic capability of the immunogenetics laboratory are discussed in the light of more powerful methods to detect homology, combined approaches to predict the three dimensional properties of a protein and a robust strategy to represent the biological role of a gene. PMID:14690048

  5. Structural Bioinformatics Inspection of neXtProt PE5 Proteins in the Human Proteome.

    PubMed

    Dong, Qiwen; Menon, Rajasree; Omenn, Gilbert S; Zhang, Yang

    2015-09-01

    One goal of the Human Proteome Project is to identify at least one protein product for each of the ∼20,000 human protein-coding genes. As of October 2014, however, there are 3564 genes (18%) that have no or insufficient evidence of protein existence (PE), as curated by neXtProt; these comprise 2647 PE2-4 missing proteins and 616 PE5 dubious protein entries. We conducted a systematic examination of the 616 PE5 protein entries using cutting-edge protein structure and function modeling methods. Compared to a random sample of high-confidence PE1 proteins, the putative PE5 proteins were found to be over-represented in the membrane and cell surface proteins and peptides fold families. Detailed functional analyses show that most PE5 proteins, if expressed, would belong to transporters and receptors localized in the plasma membrane compartment. The results suggest that experimental difficulty in identifying membrane-bound proteins and peptides could have precluded their detection in mass spectrometry and that special enrichment techniques with improved sensitivity for membrane proteins could be important for the characterization of the PE5 "dark matter" of the human proteome. Finally, we identify 66 high scoring PE5 protein entries and find that six of them were reported in recent mass spectrometry databases; an illustrative annotation of these six is provided. This work illustrates a new approach to examine the potential folding and function of the dubious proteins comprising PE5, which we will next apply to the far larger group of missing proteins comprising PE2-4. PMID:26193931

  6. In the Spotlight: Bioinformatics

    PubMed Central

    Wang, May Dongmei

    2016-01-01

    During 2012, next-generation sequencing (NGS) attracted great attention in the biomedical research community, especially for personalized medicine. Also, third generation sequencing has become available. Therefore, state-of-the-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. NGS is a high-throughput nucleic acid sequencing technology with wide dynamic range and single base resolution. The full promise of NGS depends on the optimization of NGS platforms, sequence alignment and assembly algorithms, data analytics, novel algorithms for integrating NGS data with existing genomic, proteomic, or metabolomic data, and quantitative assessment of NGS technology in comparison to more established technologies such as microarrays. NGS technology has been predicted to become a cornerstone of personalized medicine. It is argued that NGS is a promising field for motivated young researchers who are looking for opportunities in bioinformatics. PMID:23192635

  7. Elongation Factor-Tu (EF-Tu) proteins structural stability and bioinformatics in ancestral gene reconstruction

    NASA Astrophysics Data System (ADS)

    Dehipawala, Sunil; Nguyen, A.; Tremberger, G.; Cheung, E.; Schneider, P.; Lieberman, D.; Holden, T.; Cheung, T.

    2013-09-01

    A paleo-experimental evolution report on elongation factor EF-Tu structural stability results has provided an opportunity to rewind the tape of life using the ancestral protein sequence reconstruction modeling approach; consistent with the book of life dogma in current biology and being an important component in the astrobiology community. Fractal dimension via the Higuchi fractal method and Shannon entropy of the DNA sequence classification could be used in a diagram that serves as a simple summary. Results from biomedical gene research provide examples on the diagram methodology. Comparisons between biomedical genes such as EEF2 (elongation factor 2 human, mouse, etc), WDR85 in epigenetics, HAR1 in human specificity, DLG1 in cognitive skill, and HLA-C in mosquito bite immunology with EF Tu DNA sequences have accounted for the reported circular dichroism thermo-stability data systematically; the results also infer a relatively less volatility geologic time period from 2 to 3 Gyr from adaptation viewpoint. Comparison to Thermotoga maritima MSB8 and Psychrobacter shows that Thermus thermophilus HB8 EF-Tu calibration sequence could be an outlier, consistent with free energy calculation by NUPACK. Diagram methodology allows computer simulation studies and HAR1 shows about 0.5% probability from chimp to human in terms of diagram location, and SNP simulation results such as amoebic meningoencephalitis NAF1 suggest correlation. Extensions to the studies of the translation and transcription elongation factor sequences in Megavirus Chiliensis, Megavirus Lba and Pandoravirus show that the studied Pandoravirus sequence could be an outlier with the highest fractal dimension and lowest entropy, as compared to chicken as a deviant in the DNMT3A DNA methylation gene sequences from zebrafish to human and to the less than one percent probability in computer simulation using the HAR1 0.5% probability as reference. The diagram methodology would be useful in ancestral gene

  8. Crowdsourcing for bioinformatics

    PubMed Central

    Good, Benjamin M.; Su, Andrew I.

    2013-01-01

    Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614

  9. Minimal Functional Sites in Metalloproteins and Their Usage in Structural Bioinformatics

    PubMed Central

    Rosato, Antonio; Valasatava, Yana; Andreini, Claudia

    2016-01-01

    Metal ions play a functional role in numerous biochemical processes and cellular pathways. Indeed, about 40% of all enzymes of known 3D structure require a metal ion to be able to perform catalysis. The interactions of the metals with the macromolecular framework determine their chemical properties and reactivity. The relevant interactions involve both the coordination sphere of the metal ion and the more distant interactions of the so-called second sphere, i.e., the non-bonded interactions between the macromolecule and the residues coordinating the metal (metal ligands). The metal ligands and the residues in their close spatial proximity define what we call a minimal functional site (MFS). MFSs can be automatically extracted from the 3D structures of metal-binding biological macromolecules deposited in the Protein Data Bank (PDB). They are 3D templates that describe the local environment around a metal ion or metal cofactor and do not depend on the overall macromolecular structure. MFSs provide a different view on metal-binding proteins and nucleic acids, completely focused on the metal. Here we present different protocols and tools based upon the concept of MFS to obtain deeper insight into the structural and functional properties of metal-binding macromolecules. We also show that structure conservation of MFSs in metalloproteins relates to local sequence similarity more strongly than to overall protein similarity. PMID:27153067
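
    The definition of a minimal functional site given above (the metal ligands plus the residues close to them) can be sketched in a few lines. This is an illustrative sketch only: the atom representation, the function name and both distance cutoffs (3.0 Å for ligands, 5.0 Å for the surrounding shell) are assumptions made here, not values taken from the abstract.

```python
import math


def minimal_functional_site(atoms, metal_xyz, ligand_cutoff=3.0, shell_cutoff=5.0):
    """Return (ligand residue ids, all residue ids in the site).

    atoms is a list of (residue_id, (x, y, z)) tuples, one per atom.
    Both cutoffs (in angstroms) are hypothetical defaults for illustration.
    """
    # Step 1: metal ligands are residues with an atom close to the metal
    ligand_ids = {res for res, xyz in atoms
                  if math.dist(xyz, metal_xyz) <= ligand_cutoff}
    ligand_atoms = [xyz for res, xyz in atoms if res in ligand_ids]

    # Step 2: add residues with an atom close to any ligand atom
    site = set(ligand_ids)
    for res, xyz in atoms:
        if res in site:
            continue
        if any(math.dist(xyz, lig) <= shell_cutoff for lig in ligand_atoms):
            site.add(res)
    return ligand_ids, site


# Toy coordinates along the x axis, with the metal at the origin
atoms = [
    ("HIS10", (2.0, 0.0, 0.0)),
    ("HIS10", (3.5, 0.0, 0.0)),
    ("ASP50", (6.0, 0.0, 0.0)),
    ("GLY99", (20.0, 0.0, 0.0)),
]
ligands, site = minimal_functional_site(atoms, (0.0, 0.0, 0.0))
```

    Because the site is built outward from the metal, the result does not depend on the overall fold, which matches the abstract's point that MFSs are local templates.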

  10. Bioinformatic Screening of Autoimmune Disease Genes and Protein Structure Prediction with FAMS for Drug Discovery

    PubMed Central

    Ishida, Shigeharu; Umeyama, Hideaki; Iwadate, Mitsuo; Y-h, Taguchi

    2014-01-01

    Autoimmune diseases are often intractable because their causes are unknown. Identifying which genes contribute to these diseases may allow us to understand the pathogenesis, but it is difficult to determine which genes contribute to disease. Recently, epigenetic information has been considered to activate/deactivate disease-related genes. Thus, it may also be useful to study epigenetic information that differs between healthy controls and patients with autoimmune disease. Among several types of epigenetic information, promoter methylation is believed to be one of the most important factors. Here, we propose that principal component analysis is useful to identify specific gene promoters that are differentially methylated between healthy controls and patients with autoimmune disease. Full Automatic Modeling System (FAMS) was used to predict the three-dimensional structures of selected proteins and successfully inferred relatively confident structures. Several possible applications of the obtained structures to drug discovery are discussed. PMID:23855671

  11. DOE EPSCoR Initiative in Structural and computational Biology/Bioinformatics

    SciTech Connect

    Wallace, Susan S.

    2008-02-21

    The overall goal of the DOE EPSCoR Initiative in Structural and Computational Biology was to enhance the competitiveness of Vermont research in these scientific areas. To develop self-sustaining infrastructure, we increased the critical mass of faculty, developed shared resources that made junior researchers more competitive for federal research grants, implemented programs to train graduate and undergraduate students who participated in these research areas and provided seed money for research projects. During the time period funded by this DOE initiative: (1) four new faculty were recruited to the University of Vermont using DOE resources, three in Computational Biology and one in Structural Biology; (2) technical support was provided for the Computational and Structural Biology facilities; (3) twenty-two graduate students were directly funded by fellowships; (4) fifteen undergraduate students were supported during the summer; and (5) twenty-eight pilot projects were supported. Taken together, this funding resulted in a plethora of published papers, many in high-profile journals in these fields, and directly impacted competitive extramural funding based on structural or computational biology, resulting in 49 million dollars awarded in grants (Appendix I), a 600% return on investment by DOE, the State and the University.

  12. Structural and bioinformatic characterization of an Acinetobacter baumannii type II carrier protein

    SciTech Connect

    Allen, C. Leigh; Gulick, Andrew M.

    2014-06-01

    Microorganisms produce a variety of natural products via secondary metabolic biosynthetic pathways. Two of these types of synthetic systems, the nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), use large modular enzymes containing multiple catalytic domains in a single protein. These multidomain enzymes use an integrated carrier protein domain to transport the growing, covalently bound natural product to the neighboring catalytic domains for each step in the synthesis. Interestingly, some PKS and NRPS clusters contain free-standing domains that interact intermolecularly with other proteins. Because they are expressed outside the architecture of a multi-domain protein, the precise role of these so-called type II proteins is challenging to understand. Additional structures of individual and multi-domain components of the NRPS enzymes will therefore provide a better understanding of the features that govern the domain interactions in these interesting enzyme systems. The high-resolution crystal structure of a free-standing carrier protein from Acinetobacter baumannii that belongs to a larger NRPS-containing operon, encoded by the ABBFA-003406–ABBFA-003399 genes of A. baumannii strain AB307-0294, that has been implicated in A. baumannii motility, quorum sensing and biofilm formation, is presented here. Comparison with the closest structural homologs of other carrier proteins identifies the requirements for a conserved glycine residue and additional important sequence and structural requirements within the regions that interact with partner proteins.

  13. Classification of lung cancer tumors based on structural and physicochemical properties of proteins by bioinformatics models.

    PubMed

    Hosseinzadeh, Faezeh; Ebrahimi, Mansour; Goliaei, Bahram; Shamabadi, Narges

    2012-01-01

    Rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important in the diagnosis of this disease. Furthermore, sequence-derived structural and physicochemical descriptors are very useful for machine learning prediction of protein structural and functional classes, for classifying proteins and for prediction performance. In this study, the classification of lung tumors based on 1497 attributes derived from structural and physicochemical properties of protein sequences (based on genes defined by microarray analysis) was investigated through a combination of attribute weighting and supervised and unsupervised clustering algorithms. Eighty percent of the weighting methods selected features such as autocorrelation, dipeptide composition and distribution of hydrophobicity as the most important protein attributes in classification of the SCLC, NSCLC and COMMON classes of lung tumors. The same results were observed with most tree induction algorithms, while descriptors of hydrophobicity distribution were high in protein sequences COMMON to both groups and the distribution of charge in these proteins was very low, showing that COMMON proteins were very hydrophobic. Furthermore, the compositions of polar dipeptides in SCLC proteins were higher than in NSCLC proteins. Some clustering models (alone or in combination with attribute weighting algorithms) were able to nearly classify SCLC and NSCLC proteins. The Random Forest tree induction algorithm (evaluated with leave-one-out and 10-fold cross-validation) showed more than 86% accuracy in clustering and predicting the three different classes of lung cancer tumors. Here, for the first time, the application of data mining tools to effectively classify three classes of lung cancer tumors, with regard to the importance of dipeptide composition, autocorrelation and distribution descriptors, is reported. PMID:22829872
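
    Dipeptide composition, one of the sequence-derived descriptors highlighted in this abstract, is simple to compute. The sketch below assumes the common definition (the fraction of each of the 400 ordered amino-acid pairs among all adjacent pairs in the sequence); the abstract does not state which software or exact normalisation the authors used, and the function name is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def dipeptide_composition(seq):
    """Return a dict mapping each of the 400 dipeptides to its relative frequency."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in counts}
    return {pair: c / total for pair, c in counts.items()}


composition = dipeptide_composition("AAAC")  # adjacent pairs: AA, AA, AC
```

    Each protein then contributes a fixed-length vector of 400 numbers, which is the kind of input that attribute weighting and tree induction algorithms expect.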

  14. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15

    PubMed Central

    Wang, Jinlan; Chang, Fen

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 complete TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein–protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function, but these require further experimental validation. PMID:27257554

  15. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15.

    PubMed

    Wang, Jinlan; Zhang, Zheng; Chang, Fen; Yin, Deling

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 complete TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein-protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function, but these require further experimental validation. PMID:27257554

  16. The CopC Family: Structural and Bioinformatic Insights into a Diverse Group of Periplasmic Copper Binding Proteins.

    PubMed

    Lawton, Thomas J; Kenney, Grace E; Hurley, Joseph D; Rosenzweig, Amy C

    2016-04-19

    The CopC proteins are periplasmic copper binding proteins believed to play a role in bacterial copper homeostasis. Previous studies have focused on CopCs that are part of seven-protein Cop or Pco systems involved in copper resistance. These canonical CopCs contain distinct Cu(I) and Cu(II) binding sites. Mounting evidence suggests that CopCs are more widely distributed, often present only with the CopD inner membrane protein, frequently as a fusion protein, and that the CopC and CopD proteins together function in the uptake of copper to the cytoplasm. In the methanotroph Methylosinus trichosporium OB3b, genes encoding a CopCD pair are located adjacent to the particulate methane monooxygenase (pMMO) operon. The CopC from this organism (Mst-CopC) was expressed, purified, and structurally characterized. The 1.46 Å resolution crystal structure of Mst-CopC reveals a single Cu(II) binding site with coordination somewhat different from that in canonical CopCs, and the absence of a Cu(I) binding site. Extensive bioinformatic analyses indicate that the majority of CopCs in fact contain only a Cu(II) site, with just 10% of sequences corresponding to the canonical two-site CopC. Accordingly, a new classification scheme for CopCs was developed, and detailed analyses of the sequences and their genomic neighborhoods reveal new proteins potentially involved in copper homeostasis, providing a framework for expanded models of CopCD function. PMID:27010565

  17. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genomes of most prokaryotes give rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes and the need to characterize their non-coding components as well are heavily dependent on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  18. Bioinformatics of prokaryotic RNAs.

    PubMed

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genomes of most prokaryotes give rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes and the need to characterize their non-coding components as well are heavily dependent on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  19. Bioinformatics-Aided Venomics

    PubMed Central

    Kaas, Quentin; Craik, David J.

    2015-01-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future. PMID:26110505

  20. A bioinformatics approach for integrated transcriptomic and proteomic comparative analyses of model and non-sequenced anopheline vectors of human malaria parasites.

    PubMed

    Ubaida Mohien, Ceereena; Colquhoun, David R; Mathias, Derrick K; Gibbons, John G; Armistead, Jennifer S; Rodriguez, Maria C; Rodriguez, Mario Henry; Edwards, Nathan J; Hartler, Jürgen; Thallinger, Gerhard G; Graham, David R; Martinez-Barnetche, Jesus; Rokas, Antonis; Dinglasan, Rhoel R

    2013-01-01

    Malaria morbidity and mortality caused by both Plasmodium falciparum and Plasmodium vivax extend well beyond the African continent, and although P. vivax causes between 80 and 300 million severe cases each year, vivax transmission remains poorly understood. Plasmodium parasites are transmitted by Anopheles mosquitoes, and the critical site of interaction between parasite and host is at the mosquito's luminal midgut brush border. Although the genome of the "model" African P. falciparum vector, Anopheles gambiae, has been sequenced, evolutionary divergence limits its utility as a reference across anophelines, especially non-sequenced P. vivax vectors such as Anopheles albimanus. Clearly, technologies and platforms that bridge this substantial scientific gap are required in order to provide public health scientists with key transcriptomic and proteomic information that could spur the development of novel interventions to combat this disease. To our knowledge, no approaches have been published that address this issue. To bolster our understanding of P. vivax-An. albimanus midgut interactions, we developed an integrated bioinformatic-hybrid RNA-Seq-LC-MS/MS approach involving An. albimanus transcriptome (15,764 contigs) and luminal midgut subproteome (9,445 proteins) assembly, which, when used with our custom Diptera protein database (685,078 sequences), facilitated a comparative proteomic analysis of the midgut brush borders of two important malaria vectors, An. gambiae and An. albimanus. PMID:23082028

  1. On comparing two structured RNA multiple alignments.

    PubMed

    Patel, Vandanaben; Wang, Jason T L; Setia, Shefali; Verma, Anurag; Warden, Charles D; Zhang, Kaizhong

    2010-12-01

    We present a method, called BlockMatch, for aligning two blocks, where a block is an RNA multiple sequence alignment with the consensus secondary structure of the alignment in Stockholm format. The method employs a quadratic-time dynamic programming algorithm for aligning columns and column pairs of the multiple alignments in the blocks. Unlike many other tools that can perform pairwise alignment of either single sequences or structures only, BlockMatch takes into account the characteristics of all the sequences in the blocks along with their consensus structures during the alignment process, thus being able to achieve a high-quality alignment result. We apply BlockMatch to phylogeny reconstruction on a set of 5S rRNA sequences taken from fifteen bacteria species. Experimental results showed that the phylogenetic tree generated by our method is more accurate than the tree constructed based on the widely used ClustalW tool. The BlockMatch algorithm is implemented into a web server, accessible at http://bioinformatics.njit.edu/blockmatch. A jar file of the program is also available for download from the web server. PMID:21121021
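
    The idea of aligning the columns of two multiple alignments with quadratic-time dynamic programming can be illustrated with a small sketch. This is not the BlockMatch algorithm: it ignores the consensus secondary structure and column pairs entirely, and the column scoring function and gap penalty are assumptions made here purely for illustration.

```python
def column_score(col_a, col_b):
    """Similarity of two alignment columns in [-1, 1] (hypothetical scoring).

    Counts the fraction of cross-block character pairs that are identical
    non-gap symbols, then rescales it so that 1 is a perfect match.
    """
    matches = sum(1 for x in col_a for y in col_b if x == y and x != "-")
    return 2.0 * matches / (len(col_a) * len(col_b)) - 1.0


def align_blocks(block_a, block_b, gap=-0.5):
    """Globally align two blocks, each given as a list of column strings.

    Returns (best score, list of matched column index pairs).
    """
    n, m = len(block_a), len(block_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(block_a[i - 1], block_b[j - 1]),
                score[i - 1][j] + gap,  # column of block_a left unmatched
                score[i][j - 1] + gap,  # column of block_b left unmatched
            )
    # Trace back from the bottom-right corner to recover matched columns
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        diag = score[i - 1][j - 1] + column_score(block_a[i - 1], block_b[j - 1])
        if score[i][j] == diag:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Two toy blocks with two sequences each, written column by column
best, matched = align_blocks(["AA", "CC", "GG"], ["AA", "GG"])
```

    The table has one cell per pair of columns, so the running time grows with the product of the two block lengths, which is what "quadratic-time" refers to.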

  2. Channelrhodopsins: a bioinformatics perspective.

    PubMed

    Del Val, Coral; Royuela-Flor, José; Milenkovic, Stefan; Bondar, Ana-Nicoleta

    2014-05-01

    Channelrhodopsins are microbial-type rhodopsins that function as light-gated cation channels. Understanding how the detailed architecture of the protein governs its dynamics and specificity for ions is important, because it has the potential to assist in designing site-directed channelrhodopsin mutants for specific neurobiology applications. Here we use bioinformatics methods to derive accurate alignments of channelrhodopsin sequences, assess the sequence conservation patterns and find conserved motifs in channelrhodopsins, and use homology modeling to construct three-dimensional structural models of channelrhodopsins. The analyses reveal that helices C and D of channelrhodopsins contain Cys, Ser, and Thr groups that can engage in both intra- and inter-helical hydrogen bonds. We propose that these polar groups participate in inter-helical hydrogen-bonding clusters important for the protein conformational dynamics and for the local water interactions. This article is part of a Special Issue entitled: Retinal Proteins - You can teach an old dog new tricks. PMID:24252597

  3. Bioinformatics and Moonlighting Proteins

    PubMed Central

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein–protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations – it requires the existence of multialigned family protein sequences – but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  4. Bioinformatics and Moonlighting Proteins.

    PubMed

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (i.e., algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations - it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  5. The origins of bioinformatics.

    PubMed

    Hagen, J B

    2000-12-01

    Bioinformatics is often described as being in its infancy, but computers emerged as important tools in molecular biology during the early 1960s. A decade before DNA sequencing became feasible, computational biologists focused on the rapidly accumulating data from protein biochemistry. Without the benefits of super computers or computer networks, these scientists laid important conceptual and technical foundations for bioinformatics today. PMID:11252753

  6. Computational intelligence techniques in bioinformatics.

    PubMed

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state of the art in CI applications to bioinformatics and to motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machines, deep belief networks, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and prediction of protein function and structure. We discuss some representative methods to provide inspiring examples that illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included. PMID:23891719

  7. Bioinformatics in protein analysis.

    PubMed

    Persson, B

    2000-01-01

    The chapter gives an overview of bioinformatic techniques of importance in protein analysis. These include database searches, sequence comparisons and structural predictions. Links to useful World Wide Web (WWW) pages are given in relation to each topic. Databases with biological information are reviewed with emphasis on databases for nucleotide sequences (EMBL, GenBank, DDBJ), genomes, amino acid sequences (Swissprot, PIR, TrEMBL, GenePept), and three-dimensional structures (PDB). Integrated user interfaces for databases (SRS and Entrez) are described. An introduction to databases of sequence patterns and protein families is also given (Prosite, Pfam, Blocks). Furthermore, the chapter describes the widespread methods for sequence comparisons, FASTA and BLAST, and the corresponding WWW services. The techniques involving multiple sequence alignments are also reviewed: alignment creation with the Clustal programs, phylogenetic tree calculation with the Clustal or Phylip packages and tree display using Drawtree, njplot or phylo_win. Finally, the chapter also treats the issue of structural prediction. Different methods for secondary structure predictions are described (Chou-Fasman, Garnier-Osguthorpe-Robson, Predator, PHD). Techniques for predicting membrane proteins, antigenic sites and post-translational modifications are also reviewed. PMID:10803381

  8. Bioinformatics Visualisation Tools: An Unbalanced Picture.

    PubMed

    Broască, Laura; Ancuşa, Versavia; Ciocârlie, Horia

    2016-01-01

    Visualization tools represent a key element in triggering human creativity while being supported with the analysis power of the machine. This paper analyzes free network visualization tools for bioinformatics, frames them in domain specific requirements and compares them. PMID:27577488

  9. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    PubMed

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-01

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. PMID:27318307

  10. Global computing for bioinformatics.

    PubMed

    Loewe, Laurence

    2002-12-01

    Global computing, the collaboration of idle PCs via the Internet in a SETI@home style, emerges as a new way of massive parallel multiprocessing with potentially enormous CPU power. Its relations to the broader, fast-moving field of Grid computing are discussed without attempting a review of the latter. This review (i) includes a short table of milestones in global computing history, (ii) lists opportunities global computing offers for bioinformatics, (iii) describes the structure of problems well suited for such an approach, (iv) analyses the anatomy of successful projects and (v) points to existing software frameworks. Finally, an evaluation of the various costs shows that global computing indeed has merit, if the problem to be solved is already coded appropriately and a suitable global computing framework can be found. Then, either significant amounts of computing power can be recruited from the general public, or--if employed in an enterprise-wide Intranet for security reasons--idle desktop PCs can substitute for an expensive dedicated cluster. PMID:12511066

  11. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string- or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word “data-mining” is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
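
    The two tasks named in this abstract, detecting repeated segments within one sequence and spotting common segments between two sequences, can be illustrated with a minimal Python sketch. The function names and the simple quadratic approaches below are assumptions chosen for readability; they are not the methods of the cited chapter, and practical string-mining tools typically rely on index structures such as suffix trees or suffix arrays.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Longest contiguous segment shared by a and b (inter-molecular similarity).

    Dynamic programming over prefix pairs, O(len(a) * len(b)) time.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common segment that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def longest_repeated_substring(s: str) -> str:
    """Longest segment occurring at least twice in s (intra-molecular similarity).

    Sorts all suffixes and compares neighbours; occurrences may overlap.
    This naive version is quadratic in memory and only suits short sequences.
    """
    suffixes = sorted(s[i:] for i in range(len(s)))
    best = ""
    for x, y in zip(suffixes, suffixes[1:]):
        k = 0
        while k < min(len(x), len(y)) and x[k] == y[k]:
            k += 1
        if k > len(best):
            best = x[:k]
    return best


if __name__ == "__main__":
    print(longest_repeated_substring("ACGTACGT"))
    print(longest_common_substring("GATTACA", "TTACGA"))
```

    Both functions return an empty string when no repeated or shared segment exists.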

  12. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string- or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].

  13. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    PubMed Central

    Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  14. A survey of scholarly literature describing the field of bioinformatics education and bioinformatics educational research.

    PubMed

    Magana, Alejandra J; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students' attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  15. Comparative Protein Structure Modeling Using Modeller

    PubMed Central

    Eswar, Narayanan; Marti-Renom, Marc A.; Madhusudhan, M.S.; Eramian, David; Shen, Min-yi; Pieper, Ursula

    2014-01-01

    Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. PMID:18428767

  16. Current trends in antimicrobial agent research: chemo- and bioinformatics approaches.

    PubMed

    Hammami, Riadh; Fliss, Ismail

    2010-07-01

    Databases and chemo- and bioinformatics tools that contain genomic, proteomic and functional information have become indispensable for antimicrobial drug research. The combination of chemoinformatics tools, bioinformatics tools and relational databases provides means of analyzing, linking and comparing online search results. The development of computational tools feeds on a diversity of disciplines, including mathematics, statistics, computer science, information technology and molecular biology. The computational approach to antimicrobial agent discovery and design encompasses genomics, molecular simulation and dynamics, molecular docking, structural and/or functional class prediction, and quantitative structure-activity relationships. This article reviews progress in the development of computational methods, tools and databases used for organizing and extracting biological meaning from antimicrobial research. PMID:20546918

  17. An Online Bioinformatics Curriculum

    PubMed Central

    Searls, David B.

    2012-01-01

    Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field. PMID:23028269

  18. Bioinformatics and School Biology

    ERIC Educational Resources Information Center

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  19. Bioinformatic analysis reveals an evolutional selection for DNA:RNA hybrid G-quadruplex structures as putative transcription regulatory elements in warm-blooded animals.

    PubMed

    Xiao, Shan; Zhang, Jia-Yu; Zheng, Ke-Wei; Hao, Yu-Hua; Tan, Zheng

    2013-12-01

    Recently, we reported the co-transcriptional formation of DNA:RNA hybrid G-quadruplex (HQ) structure by the non-template DNA strand and nascent RNA transcript, which in turn modulates transcription under both in vitro and in vivo conditions. Here we present bioinformatic analysis on putative HQ-forming sequences (PHQS) in the genomes of eukaryotic organisms. Starting from amphibian, PHQS motifs are concentrated in the immediate 1000-nt region downstream of transcription start sites, implying their potential role in transcription regulation. Moreover, their occurrence shows a strong bias toward the non-template versus the template strand. PHQS has become constitutional in genes in warm-blooded animals, and the magnitude of the strand bias correlates with the ability of PHQS to form HQ, suggesting a selection based on HQ formation. This strand bias is reversed in lower species, implying that the selection of PHQS/HQ depended on the living temperature of the organisms. In comparison with the putative intramolecular G-quadruplex-forming sequences (PQS), PHQS motifs are far more prevalent and abundant in the transcribed regions, making them the dominant candidates in the formation of G-quadruplexes in transcription. Collectively, these results suggest that the HQ structures are evolutionally selected to function in transcription and other transcription-mediated processes that involve guanine-rich non-template strand. PMID:23999096

  20. Towards a career in bioinformatics

    PubMed Central

    2009-01-01

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation from 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the area of life sciences, systems biology and clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and student symposium, the Singapore Symposium on Computational Biology (SYMBIO) were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in undergraduate bioinformatics curriculum provides information on how to impart targeted education, to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508

  1. Bioinformatics for Exploration

    NASA Technical Reports Server (NTRS)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  2. Bioinformatics pipeline for functional identification and characterization of proteins

    NASA Astrophysics Data System (ADS)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Krzywkowski, Tomasz; Świerkula, Katarzyna; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    New sequencing methods, known as Next Generation Sequencing, make it possible to obtain a vast amount of data in a short time. These data require structural and functional annotation. Functional identification and characterization of predicted proteins can be done by in silico approaches, thanks to the numerous computational tools available nowadays. However, the results of protein function prediction need to be confirmed, either by using different programs and comparing their results or experimentally. Here we present a bioinformatics pipeline for structural and functional annotation of proteins.

  3. An Inquiry into Protein Structure and Genetic Disease: Introducing Undergraduates to Bioinformatics in a Large Introductory Course

    ERIC Educational Resources Information Center

    Bednarski, April E.; Elgin, Sarah C. R.; Pakrasi, Himadri B.

    2005-01-01

    This inquiry-based lab is designed around genetic diseases with a focus on protein structure and function. To allow students to work on their own investigatory projects, 10 projects on 10 different proteins were developed. Students are grouped in sections of 20 and work in pairs on each of the projects. To begin their investigation, students are…

  4. BIOINFORMATIC INTEGRATION OF STRUCTURAL AND FUNCTIONAL GENOMICS DATA ACROSS SPECIES TO DEVELOP PORCINE INFLAMMATORY GENE REGULATORY PATHWAY INFORMATION

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Integration of structural and functional genomic data across species holds great promise in finding genes controlling disease resistance. We are investigating the porcine gut immune response to infection through gene expression profiling. We have collected porcine Affymetrix GeneChip data from RNA ...

  5. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
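
    The "huge number of possible trees" mentioned in this abstract can be made concrete with a short Python sketch. The abstract does not give a formula; the counts below use the standard combinatorial result for strictly bifurcating, leaf-labelled topologies, (2n-3)!! rooted and (2n-5)!! unrooted trees for n OTUs, and the function names are illustrative assumptions.

```python
def odd_double_factorial(m: int) -> int:
    """Return m!! = m * (m - 2) * ... * 3 * 1 for odd m; 1 when m <= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def count_rooted_binary_trees(n_otus: int) -> int:
    """Number of rooted, bifurcating, leaf-labelled topologies: (2n - 3)!!."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs")
    return odd_double_factorial(2 * n_otus - 3)


def count_unrooted_binary_trees(n_otus: int) -> int:
    """Number of unrooted, bifurcating, leaf-labelled topologies: (2n - 5)!!."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs")
    return odd_double_factorial(2 * n_otus - 5)


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, count_unrooted_binary_trees(n), count_rooted_binary_trees(n))
```

    For only 10 OTUs this already gives 2,027,025 unrooted and 34,459,425 rooted topologies, which is why exhaustive search over tree space is rarely feasible.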

  6. A Guide to Bioinformatics for Immunologists

    PubMed Central

    Whelan, Fiona J.; Yap, Nicholas V. L.; Surette, Michael G.; Golding, G. Brian; Bowdish, Dawn M. E.

    2013-01-01

    Bioinformatics includes a suite of methods, which are cheap, approachable, and many of which are easily accessible without any sort of specialized bioinformatic training. Yet, despite this, bioinformatic tools are under-utilized by immunologists. Herein, we review a representative set of publicly available, easy-to-use bioinformatic tools using our own research on an under-annotated human gene, SCARA3, as an example. SCARA3 shares an evolutionary relationship with the class A scavenger receptors, but preliminary research showed that it was divergent enough that its function remained unclear. In our quest for more information about this gene – did it share gene sequence similarities to other scavenger receptors? Did it contain conserved protein domains? Where was it expressed in the human body? – we discovered the power and informative potential of publicly available bioinformatic tools designed with the novice in mind, which allowed us to hypothesize on the regulation, structure, and function of this protein. We argue that these tools are largely applicable to many facets of immunology research. PMID:24363654

  7. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants.

    PubMed

    Huang, Ming-Der; Huang, Anthony H C

    2015-09-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins. PMID:26232488

  8. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants

    PubMed Central

    Huang, Ming-Der; Huang, Anthony H.C.

    2015-01-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins. PMID:26232488

  9. A Bioinformatics Reference Model: Towards a Framework for Developing and Organising Bioinformatic Resources

    NASA Astrophysics Data System (ADS)

    Hiew, Hong Liang; Bellgard, Matthew

    2007-11-01

    Life Science research faces the constant challenge of how to effectively handle an ever-growing body of bioinformatics software and online resources. The users and developers of bioinformatics resources have a diverse set of competing demands on how these resources need to be developed and organised. Unfortunately, there does not exist an adequate community-wide framework to integrate such competing demands. The problems that arise from this include unstructured standards development, the emergence of tools that do not meet specific needs of researchers, and often times a communications gap between those who use the tools and those who supply them. This paper presents an overview of the different functions and needs of bioinformatics stakeholders to determine what may be required in a community-wide framework. A Bioinformatics Reference Model is proposed as a basis for such a framework. The reference model outlines the functional relationship between research usage and technical aspects of bioinformatics resources. It separates important functions into multiple structured layers, clarifies how they relate to each other, and highlights the gaps that need to be addressed for progress towards a diverse, manageable, and sustainable body of resources. The relevance of this reference model to the bioscience research community, and its implications in progress for organising our bioinformatics resources, are discussed.

  10. Bioinformatics and genomic medicine.

    PubMed

    Kim, Ju Han

    2002-01-01

    Bioinformatics is a rapidly emerging field of biomedical research. A flood of large-scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational science. Clinical informatics has long developed methodologies to improve biomedical research and clinical care by integrating experimental and clinical information systems. The informatics revolution in both bioinformatics and clinical informatics will eventually change the current practice of medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high-throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever, in much the same way that biochemistry did a generation ago. This paper describes how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics and proteomics. Basic data preprocessing with normalization and filtering, primary pattern analysis, and machine-learning algorithms are discussed. Use of integrative biochip informatics technologies, including multivariate data projection, gene-metabolic pathway mapping, automated biomolecular annotation, text mining of factual and literature databases, and the integrated management of biomolecular databases, are also discussed. PMID:12544491

  11. Pattern recognition in bioinformatics.

    PubMed

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained. PMID:23559637

  12. Improvement of Student Understanding of How Kinetic Data Facilitates the Determination of Amino Acid Catalytic Function through an Alkaline Phosphatase Structure/Mechanism Bioinformatics Exercise

    ERIC Educational Resources Information Center

    Grunwald, Sandra K.; Krueger, Katherine J.

    2008-01-01

    Laboratory exercises, which utilize alkaline phosphatase as a model enzyme, have been developed and used extensively in undergraduate biochemistry courses to illustrate enzyme steady-state kinetics. A bioinformatics laboratory exercise for the biochemistry laboratory, which complements the traditional alkaline phosphatase kinetics exercise, was…

  13. LXtoo: an integrated live Linux distribution for the bioinformatics community

    PubMed Central

    2012-01-01

    Background: Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings: Unlike most of the existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions: LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356

  14. A novel method to compare protein structures using local descriptors

    PubMed Central

    2011-01-01

    Background: Protein structure comparison is one of the most widely performed tasks in bioinformatics. However, currently used methods have problems with the so-called "difficult similarities", including considerable shifts and distortions of structure, sequential swaps and circular permutations. There is a demand for efficient and automated systems capable of overcoming these difficulties, which may lead to the discovery of previously unknown structural relationships. Results: We present a novel method for protein structure comparison based on the formalism of local descriptors of protein structure - DEscriptor Defined Alignment (DEDAL). Local similarities identified by pairs of similar descriptors are extended into global structural alignments. We demonstrate the method's capability by aligning structures in difficult benchmark sets: curated alignments in the SISYPHUS database, as well as SISY and RIPC sets, including non-sequential and non-rigid-body alignments. On the most difficult RIPC set of sequence alignment pairs the method achieves an accuracy of 77% (the second best method tested achieves 60% accuracy). Conclusions: DEDAL is fast enough to be used in whole proteome applications, and by lowering the threshold of detectable structure similarity it may shed additional light on molecular evolution processes. It is well suited to improving automatic classification of structure domains, helping analyze protein fold space, or to improving protein classification schemes. DEDAL is available online at http://bioexploratorium.pl/EP/DEDAL. PMID:21849047

  15. Fold assessment for comparative protein structure modeling.

    PubMed

    Melo, Francisco; Sali, Andrej

    2007-11-01

    Accurate and automated assessment of both geometrical errors and incompleteness of comparative protein structure models is necessary for an adequate use of the models. Here, we describe a composite score for discriminating between models with the correct and incorrect fold. To find an accurate composite score, we designed and applied a genetic algorithm method that searched for a most informative subset of 21 input model features as well as their optimized nonlinear transformation into the composite score. The 21 input features included various statistical potential scores, stereochemistry quality descriptors, sequence alignment scores, geometrical descriptors, and measures of protein packing. The optimized composite score was found to depend on (1) a statistical potential z-score for residue accessibilities and distances, (2) model compactness, and (3) percentage sequence identity of the alignment used to build the model. The accuracy of the composite score was compared with the accuracy of assessment by single and combined features as well as by other commonly used assessment methods. The testing set was representative of models produced by automated comparative modeling on a genomic scale. The composite score performed better than any other tested score in terms of the maximum correct classification rate (i.e., 3.3% false positives and 2.5% false negatives) as well as the sensitivity and specificity across the whole range of thresholds. The composite score was implemented in our program MODELLER-8 and was used to assess models in the MODBASE database that contains comparative models for domains in approximately 1.3 million protein sequences. PMID:17905832
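
    The classification figures quoted above (false positives, false negatives, maximum correct classification rate) can be illustrated with a minimal sketch. It assumes each model has a single numeric score and a known correct/incorrect fold label; the function names, scores and labels are hypothetical, and this is not the MODELLER implementation or the genetic-algorithm procedure itself.

```python
from typing import Optional, Sequence, Tuple


def error_rates(scores: Sequence[float], labels: Sequence[bool],
                threshold: float) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    A model is called "correct fold" when its score is >= threshold.
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def best_threshold(scores: Sequence[float],
                   labels: Sequence[bool]) -> Optional[Tuple[float, float]]:
    """Scan observed scores and return (threshold, accuracy) with the
    highest correct classification rate; ties keep the lowest threshold."""
    best = None
    for t in sorted(set(scores)):
        correct = sum(1 for s, y in zip(scores, labels) if (s >= t) == bool(y))
        accuracy = correct / len(scores)
        if best is None or accuracy > best[1]:
            best = (t, accuracy)
    return best


# Hypothetical composite scores and fold labels, for illustration only.
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
labels = [False, False, True, True, True]
fpr, fnr = error_rates(scores, labels, threshold=0.5)
```

    Scanning every observed score as a candidate threshold is the simplest way to find the maximum correct classification rate on a small test set; a real evaluation would use a held-out set.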

  16. Virtual Bioinformatics Distance Learning Suite

    ERIC Educational Resources Information Center

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  17. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944

  18. Chapter 16: Text Mining for Translational Bioinformatics

    PubMed Central

    Cohen, K. Bretonnel; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research—translating basic science results into new interventions—and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944

  19. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    ERIC Educational Resources Information Center

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  20. [Comparative hierarchic structure of the genetic language].

    PubMed

    Ratner, V A

    1993-05-01

    Genetic texts and the genetic language are built according to a hierarchic principle and contain no fewer than 6 levels of coding sequences, separated by marks of punctuation, separation and indication: codons, cistrons, scriptons, replicons, linkage groups, genomes. Each level has all the attributes of a language. This hierarchic system expresses some general properties and regularities. Once the rules of the genetic language are determined, the variability of genetic texts is generated by block-modular combinatorics on each level. Between levels there are some intermediate sublevels and module types capable of being combined. The genetic language is compared with two different independent linguistic systems: human natural languages and artificial programming languages. Genetic language is a natural one by its origin, but it is a typical technical language of the functioning genetic regulatory system by its predestination. All three linguistic systems under comparison have evident similarity of organization principles and hierarchical structures. This argues for similarity of their principles of appearance and evolution. PMID:8335232

  1. Microbial bioinformatics 2020.

    PubMed

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  2. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    PubMed Central

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  3. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    PubMed

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  4. Bioinformatic pipelines in Python with Leaf

    PubMed Central

    2013-01-01

    Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315
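
    The idea of laying a pipeline structure over ordinary Python code can be sketched as follows. This is a minimal illustration using the standard-library graphlib module (Python 3.9+), not Leaf's own syntax or API; the run_pipeline helper, the step names and the toy sequences are all hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Tuple

# A step is (names of the steps it depends on, function of their results).
Step = Tuple[Tuple[str, ...], Callable[..., Any]]


def run_pipeline(steps: Dict[str, Step]) -> Dict[str, Any]:
    """Run every step once, after all of the steps it depends on."""
    graph = {name: set(deps) for name, (deps, _) in steps.items()}
    results: Dict[str, Any] = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        deps, func = steps[name]
        results[name] = func(*(results[d] for d in deps))
    return results


# Hypothetical three-step analysis: load sequences, compute GC fraction, report.
steps: Dict[str, Step] = {
    "load": ((), lambda: ["ACGT", "GGCC", "ATAT"]),
    "gc": (("load",),
           lambda seqs: [sum(c in "GC" for c in s) / len(s) for s in seqs]),
    "report": (("load", "gc"), lambda seqs, gc: dict(zip(seqs, gc))),
}
results = run_pipeline(steps)
```

    Declaring dependencies as data keeps the execution order explicit and makes it straightforward to add features such as result caching or consistency checks between steps later.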

  5. Comparative Proteomic and Bioinformatic Analysis of the Effects of a High-Grain Diet on the Hepatic Metabolism in Lactating Dairy Goats

    PubMed Central

    Jiang, Xueyuan; Zeng, Tao; Zhang, Shukun; Zhang, Yuanshu

    2013-01-01

    To gain insight into the impact of high-grain diets on liver metabolism in ruminants, we employed a comparative proteomic approach to investigate the proteome-wide effects of diet in lactating dairy goats by conducting a proteomic analysis of the liver extracts of 10 lactating goats fed either a control diet or a high-grain diet. More than 500 protein spots were detected per condition by two-dimensional electrophoresis (2-DE). In total, 52 differentially expressed spots (≥2.0-fold changed) were excised and analyzed using MALDI TOF/TOF. Fifty-one protein spots were successfully identified. Of these, 29 proteins were upregulated, while 22 were downregulated in the high-grain fed vs. control animals. Differential expression of proteins including alpha enolase, elongation factor 2, calreticulin, cytochrome b5, apolipoprotein A-I, and catalase was verified by mRNA analysis and/or Western blotting. Database searches combined with Gene Ontology (GO) analysis and KEGG pathway analysis revealed that the high-grain diet resulted in altered expression of proteins related to amino acid metabolism. These results suggest new candidate proteins that may contribute to a better understanding of the signaling pathways and mechanisms that mediate liver adaptation to a high-grain diet. PMID:24260456
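
    The ≥2.0-fold selection criterion can be illustrated with a short sketch. It assumes one mean intensity per spot and condition and treats a change of at least 2.0-fold in either direction as differential; the spot identifiers and intensities are hypothetical and are not data from the study.

```python
from typing import Dict, Tuple


def classify_spots(intensities: Dict[str, Tuple[float, float]],
                   min_fold: float = 2.0) -> Dict[str, str]:
    """Map spot id -> "up" or "down" for spots changed by >= min_fold.

    intensities maps spot id -> (control, treatment) mean intensity.
    Spots with a non-positive intensity are skipped, since a ratio is undefined.
    """
    changed: Dict[str, str] = {}
    for spot, (control, treatment) in intensities.items():
        if control <= 0 or treatment <= 0:
            continue
        fold = treatment / control
        if fold >= min_fold:
            changed[spot] = "up"
        elif fold <= 1.0 / min_fold:
            changed[spot] = "down"
    return changed


# Hypothetical spot intensities (control, high-grain), for illustration only.
spots = {"s1": (100.0, 250.0), "s2": (100.0, 150.0), "s3": (200.0, 80.0)}
changed = classify_spots(spots)
```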

  6. Comparing Factor Structures of Adolescent Psychopathology

    ERIC Educational Resources Information Center

    Verona, Edelyn; Javdani, Shabnam; Sprague, Jenessa

    2011-01-01

    Research on the structure of adolescent psychopathology can provide information on broad factors that underlie different forms of maladjustment in youths. Multiple studies from the literature on adult populations suggest that 2 factors, Internalizing and Externalizing, meaningfully comprise the factor structure of adult psychopathology (e.g.,…

  7. Comparative BioInformatics and Computational Toxicology

    EPA Science Inventory

    Reflecting the numerous changes in the field since the publication of the previous edition, this third edition of Developmental Toxicology focuses on the mechanisms of developmental toxicity and incorporates current technologies for testing in the risk assessment process.

  8. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  9. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  10. Uncertainty of Comparative Judgments and Multidimensional Structure

    ERIC Educational Resources Information Center

    Sjoberg, Lennart

    1975-01-01

    An analysis of preferences with respect to silhouette drawings of nude females is presented. Systematic intransitivities were discovered. The dispersions of differences (comparatal dispersions) were shown to reflect the multidimensional structure of the stimuli, a finding expected on the basis of prior work. (Author)

  11. Reactance, Restoration, and Cognitive Structure: Comparative Statics

    ERIC Educational Resources Information Center

    Bessarabova, Elena; Fink, Edward L.; Turner, Monique

    2013-01-01

    This study (N = 143) examined the effects of freedom threat on cognitive structures, using recycling as its topic. The results of a 2(Freedom Threat: low vs. high) x 2(Postscript: restoration vs. filler) plus 1(Control) experiment indicated that, relative to the control condition, high freedom threat created a boomerang effect for the targeted…

  12. Intrageneric primer design: Bringing bioinformatics tools to the class.

    PubMed

    Lima, André O S; Garcês, Sérgio P S

    2006-09-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private and academic) with a need for bachelor of science students with bioinformatics skills. In consideration of this need, described here is a problem-based class in which students are asked to design a set of intrageneric primers for PCR. The exercise is divided into five classes of 1 h each, in which students use freeware bioinformatics tools and data bases available through the Internet. Besides designing the set of primers, the students will consequently learn the significance and use of the major bioinformatics procedures, such as searching a data base, conducting and analyzing sequence multialignment, comparing sequences with a data base, and selecting primers. PMID:21638710
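
    The final primer-selection step of such an exercise usually involves checking a few basic properties of each candidate. The sketch below assumes the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, a rough estimate intended for short oligonucleotides; it is not necessarily the method used in the class, and the example primer is hypothetical.

```python
# Translation table for complementing DNA bases (uppercase only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees Celsius (Wallace rule)."""
    primer = primer.upper()
    at = sum(base in "AT" for base in primer)
    gc = sum(base in "GC" for base in primer)
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement, e.g. to derive a reverse primer from the template."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Hypothetical 16-nt candidate primer, for illustration only.
primer = "ATGCATGCATGCATGC"
tm = wallace_tm(primer)
gc = gc_content(primer)
```

    Forward and reverse primers are normally chosen so that their estimated melting temperatures are close to each other.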

  13. Compare, Contrast, Comprehend: Using Compare-Contrast Text Structures with ELLs in K-3 Classrooms

    ERIC Educational Resources Information Center

    Dreher, Mariam Jean; Gray, Jennifer Letcher

    2009-01-01

    In this article, we describe how to help primary-grade English language learners use compare-contrast text structures. Specifically, we explain (a) how to teach students to identify the compare-contrast text structure, and to use this structure to support their comprehension, (b) how to use compare-contrast texts to activate and extend students'…

  14. Adapting bioinformatics curricula for big data.

    PubMed

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  15. Adapting bioinformatics curricula for big data

    PubMed Central

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  16. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides.

    PubMed

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe; Pupin, Maude

    2016-01-01

    This chapter helps in the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomerism. To assess their novelty, the two-dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (e.g., antibiotics against multi-resistant pathogens or anti-tumor agents). PMID:26831711

  17. Bioinformatics Approach in Plant Genomic Research.

    PubMed

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    Advances in genomics technology have dramatically changed plant biology research. Plant biologists now have easy access to enormous amounts of genomic data with which to study high-density genetic variation in plants at the molecular level. Therefore, fully understanding and skillfully using bioinformatics tools to manage and analyze these data is essential in current plant genome research. Many plant genome databases have been established and have continued to expand recently. Meanwhile, bioinformatics-based analytical methods are also well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, constantly upgrading computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the critical future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  18. Which craft is best in bioinformatics?

    PubMed

    Attwood, T K; Miller, C J

    2001-07-01

    'Silicon-based' biology has gathered momentum as the world-wide sequencing projects have made possible the investigation and comparative analysis of complete genomes. Central to the quest to elucidate and characterise the genes and gene products encoded within genomes are pivotal concepts concerning the processes of evolution, the mechanisms of protein folding, and, crucially, the manifestation of protein function. Our use of computers to model such concepts is limited by, and must be placed in the context of, the current limits of our understanding of these biological processes. It is important to recognise that we do not have a common understanding of what constitutes a gene; we cannot invariably say that a particular sequence or fold has arisen via divergence or convergence; we do not fully understand the rules of protein folding, so we cannot predict protein structure; and we cannot invariably diagnose protein function, given knowledge only of its sequence or structure in isolation. Accepting what we cannot do with computers plays an essential role in forming an appreciation of what we can do. Without this understanding, it is easy to be misled, as spurious arguments are often used to promote over-enthusiastic notions of what particular programs can achieve. There are valuable lessons to be learned here from the field of artificial intelligence, principal among which is the realisation that capturing and representing complex knowledge is time consuming, expensive and hard. If bioinformatics is to tackle biological complexity meaningfully, the road ahead must therefore be paved with caution, rigour and pragmatism. PMID:11459349

  19. Generations of interdisciplinarity in bioinformatics

    PubMed Central

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life sciences, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  20. Bioinformatics and the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  1. Reproducible Bioinformatics Research for Biologists

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  2. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  3. Clinical Bioinformatics: challenges and opportunities

    PubMed Central

    2012-01-01

    Background Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and on their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011, was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods In this editorial, the viewpoints and opinions of three world CBI leaders, who were invited to participate in a panel discussion at the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools for efficient search computing solutions. Results Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, and integrated search and retrieval of -omics information. Conclusions Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput "-omics" technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research. PMID:23095472

  4. Teaching bioinformatics in concert.

    PubMed

    Goodman, Anya L; Dekhtyar, Alex

    2014-11-01

    Can biology students without programming skills solve problems that require computational solutions? They can if they learn to cooperate effectively with computer science students. The goal of the in-concert teaching approach is to introduce biology students to computational thinking by engaging them in collaborative projects structured around the software development process. Our approach emphasizes development of interdisciplinary communication and collaboration skills for both life science and computer science students. PMID:25411792

  5. Bioinformatics and molecular modeling in glycobiology

    PubMed Central

    Schloissnig, Siegfried

    2010-01-01

    The field of glycobiology is concerned with the study of the structure, properties, and biological functions of the family of biomolecules called carbohydrates. Bioinformatics for glycobiology is a particularly challenging field, because carbohydrates exhibit a high structural diversity and their chains are often branched. Significant improvements in experimental analytical methods over recent years have led to a tremendous increase in the amount of carbohydrate structure data generated. Consequently, the availability of databases and tools to store, retrieve and analyze these data in an efficient way is of fundamental importance to progress in glycobiology. In this review, the various graphical representations and sequence formats of carbohydrates are introduced, and an overview of newly developed databases, the latest developments in sequence alignment and data mining, and tools to support experimental glycan analysis are presented. Finally, the field of structural glycoinformatics and molecular modeling of carbohydrates, glycoproteins, and protein–carbohydrate interaction are reviewed. PMID:20364395

  6. Provenance in bioinformatics workflows

    PubMed Central

    2013-01-01

    In this work, we used the PROV-DM model to manage data provenance in workflows of genome projects. This provenance model allows the storage of details of one workflow execution, e.g., raw and produced data and computational tools, their versions and parameters. Using this model, biologists can access details of one particular execution of a workflow, compare results produced by different executions, and plan new experiments more efficiently. In addition, a provenance simulator was created, which facilitates the inclusion of provenance data of one genome project workflow execution. Finally, we discuss one case study, which aims to identify genes involved in specific metabolic pathways of Bacillus cereus, as well as to compare this isolate with other phylogenetically related bacteria from the Bacillus group. B. cereus is an extremophilic bacterium, collected in warm water in the Midwestern Region of Brazil; its DNA samples were sequenced with an NGS machine. PMID:24564294
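
    The kind of record described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names loosely follow the PROV-DM notions of entity and activity, and the tool name, version, file names and parameters below are hypothetical placeholders chosen only to show how two executions of the same step might be compared.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A piece of data, e.g. raw reads or an assembly (hypothetical names)."""
    name: str


@dataclass
class Activity:
    """One execution of a tool within a workflow run."""
    tool: str
    version: str
    parameters: dict
    used: list = field(default_factory=list)       # input entities
    generated: list = field(default_factory=list)  # output entities


def diff_parameters(a: Activity, b: Activity) -> dict:
    """Return the parameters whose values differ between two executions."""
    keys = set(a.parameters) | set(b.parameters)
    return {
        k: (a.parameters.get(k), b.parameters.get(k))
        for k in sorted(keys)
        if a.parameters.get(k) != b.parameters.get(k)
    }


# Two hypothetical executions of the same assembly step on the same input.
reads = Entity("raw_reads.fastq")
run1 = Activity("assembler", "1.0", {"kmer": 31, "min_cov": 5},
                used=[reads], generated=[Entity("contigs_run1.fasta")])
run2 = Activity("assembler", "1.0", {"kmer": 51, "min_cov": 5},
                used=[reads], generated=[Entity("contigs_run2.fasta")])

print(diff_parameters(run1, run2))
```

    Keeping tool, version and parameters together with the input and output data is what makes a comparison between executions possible; a real PROV-DM store would also record agents and timestamps.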

  7. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    PubMed Central

    Obom, Kristina M.; Cummings, Patrick J.

    2007-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation queries. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation. PMID:23653816

  8. Bioinformatics in the information age

    SciTech Connect

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  9. Detecting evolution of bioinformatics with a content and co-authorship analysis.

    PubMed

    Song, Min; Yang, Christopher C; Tang, Xuning

    2013-12-01

    Bioinformatics is an interdisciplinary research field that applies advanced computational techniques to biological data. Bibliometric analysis has recently been adopted to understand the knowledge structure of a research field by citation pattern. In this paper, we explore the knowledge structure of Bioinformatics from the perspective of a core open access Bioinformatics journal, BMC Bioinformatics, using trend analysis, content and co-authorship network similarity, and principal component analysis. Publications in four core journals including Bioinformatics - Oxford Journal and four conferences in Bioinformatics were harvested from DBLP. After converting publications into TF-IDF term vectors, we calculate the content similarity, and we also calculate the social network similarity based on the co-authorship network by utilizing the overlap measure between two co-authorship networks. Key terms are extracted and analyzed with PCA, and visualization of the co-authorship network is conducted. The experimental results show that Bioinformatics is fast-growing, dynamic and diversified. The content analysis shows that there is an increasing overlap among Bioinformatics journals in terms of topics and more research groups participate in researching Bioinformatics according to the co-authorship network similarity. PMID:23710427
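
    The two similarity measures named in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: content similarity is taken to be the cosine between TF-IDF vectors, and the "overlap measure" is assumed to be the overlap coefficient |A ∩ B| / min(|A|, |B|) over co-authorship edges. The toy documents and author labels are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a {term: tf-idf weight} dict."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def overlap(a, b):
    """Overlap coefficient between two sets (assumed form of the measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Toy "journals", each reduced to a bag of title terms (invented data).
docs = [
    ["protein", "structure", "alignment"],
    ["protein", "sequence", "alignment"],
    ["gene", "expression", "microarray"],
]
vec_a, vec_b, vec_c = tfidf_vectors(docs)
print(round(cosine(vec_a, vec_b), 3), cosine(vec_a, vec_c))

# Co-authorship networks as sets of undirected edges (invented authors).
edges_1 = {frozenset({"A", "B"}), frozenset({"B", "C"})}
edges_2 = {frozenset({"A", "B"}), frozenset({"D", "E"}), frozenset({"E", "F"})}
print(overlap(edges_1, edges_2))
```

    Representing each co-authorship edge as a frozenset makes the pair unordered, so the same collaboration counts once regardless of author order.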

  10. Receptor-binding sites: bioinformatic approaches.

    PubMed

    Flower, Darren R

    2006-01-01

    It is increasingly clear that both transient and long-lasting interactions between biomacromolecules and their molecular partners are the most fundamental of all biological mechanisms and lie at the conceptual heart of protein function. In particular, the protein-binding site is the most fascinating and important mechanistic arbiter of protein function. In this review, I examine the nature of protein-binding sites found in both ligand-binding receptors and substrate-binding enzymes. I highlight two important concepts underlying the identification and analysis of binding sites. The first is based on knowledge: when one knows the location of a binding site in one protein, one can "inherit" the site from one protein to another. The second approach involves the a priori prediction of a binding site from a sequence or a structure. The full and complete analysis of binding sites will necessarily involve the full range of informatic techniques ranging from sequence-based bioinformatic analysis through structural bioinformatics to computational chemistry and molecular physics. Integration of both diverse experimental and diverse theoretical approaches is thus a mandatory requirement in the evaluation of binding sites and the binding events that occur within them. PMID:16671408

  11. A Bioinformatics Facility for NASA

    NASA Technical Reports Server (NTRS)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  12. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  13. Bioinformatic Insights from Metagenomics through Visualization

    SciTech Connect

    Havre, Susan L.; Webb-Robertson, Bobbie-Jo M.; Shah, Anuj; Posse, Christian; Gopalan, Banu; Brockman, Fred J.

    2005-08-10

    Cutting-edge biological and bioinformatics research seeks a systems perspective through the analysis of multiple types of high-throughput and other experimental data for the same sample. Systems-level analysis requires the integration and fusion of such data, typically through advanced statistics and mathematics. Visualization is a complementary computational approach that supports integration and analysis of complex data or its derivatives. We present a bioinformatics visualization prototype, Juxter, which depicts categorical information derived from or assigned to these diverse data for the purpose of comparing patterns across categorizations. The visualization allows users to easily discern correlated and anomalous patterns in the data. These patterns, which might not be detected automatically by algorithms, may reveal valuable information leading to insight and discovery. We describe the visualization and interaction capabilities and demonstrate its utility in a new field, metagenomics, which combines molecular biology and genetics to identify and characterize genetic material from multi-species microbial samples.

  14. Bioinformatics analysis of the epitope regions for norovirus capsid protein

    PubMed Central

    2013-01-01

    Background: Norovirus is the major cause of nonbacterial epidemic gastroenteritis, being highly prevalent in both developing and developed countries. Despite the available monoclonal antibodies (MAbs) for different sub-genogroups, a comprehensive epitope analysis based on various bioinformatics technologies is highly desirable for future antibody development in clinical diagnosis and treatment. Methods: A total of 18 full-length human norovirus capsid protein sequences were downloaded from GenBank. Protein modeling was performed with the program Modeller 9.9. The modeled 3D structures of the norovirus capsid protein were submitted to the protein antigen spatial epitope prediction webserver (SEPPA) for predicting the possible spatial epitopes with the default threshold. The results were processed using the Biosoftware. Results: Compared with GI, we found that the GII genogroup had four deletions and two special insertions in the VP1 region. The predicted conformational epitope regions were mainly concentrated in the N-terminal (1~96), Middle Part (298~305, 355~375) and C-terminal (560~570). We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. Conclusions: The predicted conformational epitope regions of norovirus VP1 were mainly concentrated in the N-terminal, Middle Part and C-terminal. We found two common epitope regions on sequences for the GI and GII genogroups, and also found an exclusive epitope region for the GII genogroup. The overlap with experimental epitopes indicates the important role of the latest computational technologies. With the fast development of computational immunology tools, the bioinformatics pipeline will be more and more critical to vaccine design. PMID:23514273

  15. Bioinformatics in Africa: The Rise of Ghana?

    PubMed

    Karikari, Thomas K

    2015-09-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  16. Bioinformatics in Africa: The Rise of Ghana?

    PubMed Central

    Karikari, Thomas K.

    2015-01-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  17. Comparative modeling of InP solar cell structures

    NASA Technical Reports Server (NTRS)

    Jain, R. K.; Weinberg, I.; Flood, D. J.

    1991-01-01

    The comparative modeling of p(+)n and n(+)p indium phosphide solar cell structures is studied using a numerical program PC-1D. The optimal design study has predicted that the p(+)n structure offers improved cell efficiencies as compared to n(+)p structure, due to higher open-circuit voltage. The various cell material and process parameters to achieve the maximum cell efficiencies are reported. The effect of some of the cell parameters on InP cell I-V characteristics was studied. The available radiation resistance data on n(+)p and p(+)p InP solar cells are also critically discussed.

  18. Structural analysis of the PsbQ protein of photosystem II by Fourier transform infrared and circular dichroic spectroscopy and by bioinformatic methods.

    PubMed

    Balsera, Mónica; Arellano, Juan B; Gutiérrez, José R; Heredia, Pedro; Revuelta, José L; De Las Rivas, Javier

    2003-02-01

    The structure of PsbQ, one of the three main extrinsic proteins associated with the oxygen-evolving complex (OEC) of higher plants and green algae, is examined by Fourier transform infrared (FTIR) and circular dichroic (CD) spectroscopy and by computational structural prediction methods. This protein, together with two other lumenally bound extrinsic proteins, PsbO and PsbP, is essential for the stability and full activity of the OEC in plants. The FTIR spectra obtained in both H(2)O and D(2)O suggest a mainly alpha-helix structure on the basis of the relative areas of the constituents of the amide I and I' bands. The FTIR quantitative analyses indicate that PsbQ contains about 53% alpha-helix, 7% turns, 14% nonordered structure, and 24% beta-strand plus other beta-type extended structures. CD analyses indicate that PsbQ is a mainly alpha-helix protein (about 64%), presenting a small percentage assigned to beta-strand (approximately 7%) and a larger amount assigned to turns and nonregular structures (approximately 29%). Independent of the spectroscopic analyses, computational methods for protein structure prediction of PsbQ were utilized. First, a multiple alignment of 12 sequences of PsbQ was obtained after an extensive search in the public databases for protein and EST sequences. Based on this alignment, computational prediction of the secondary structure and the solvent accessibility suggests the presence of two different structural domains in PsbQ: a major C-terminal domain containing four alpha-helices and a minor N-terminal domain with a poorly defined secondary structure enriched in proline and glycine residues. The search for PsbQ analogues by fold recognition methods, not based on the secondary structure, also indicates that PsbQ is a four alpha-helix protein, most probably folding as an up-down bundle. The results obtained by both the spectroscopic and computational methods are in agreement, all indicating that PsbQ is mainly an alpha protein, and show

  19. DSSTOX STRUCTURE-SEARCHABLE PUBLIC TOXICITY DATABASE NETWORK: CURRENT PROGRESS AND NEW INITIATIVES TO IMPROVE CHEMO-BIOINFORMATICS CAPABILITIES

    EPA Science Inventory

    The EPA DSSTox website (http://www.epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...

  20. Transregional zones of concentrated deformation: Structure, evolution, and comparative geodynamics

    NASA Astrophysics Data System (ADS)

    Leonov, M. G.

    2016-03-01

    The comparative tectonic characterization of transregional linear structures (zones of concentrated deformations) is given for the Pieniny Klippen Belt, the Main Mongolian Lineament, and the transregional Alpine Fault Zone. They represent significant geodynamic elements of the Earth's crust, which separate large crustal segments and reflect their interaction in time and space. The main features of the structure, evolution, and geodynamics inherent to zones of concentrated deformations are described. It is shown that the similarity of their outlines, morphology, internal structure, and kinematic features is combined with a clearly distinct structural position, set of rock associations, formation mechanism, and their role in the origin of mobile belts.

  1. The European Bioinformatics Institute's data resources 2014.

    PubMed

    Brooksbank, Catherine; Bergman, Mary Todd; Apweiler, Rolf; Birney, Ewan; Thornton, Janet

    2014-01-01

    Molecular Biology has been at the heart of the 'big data' revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff's 'Atlas of Protein Sequence and Structure' through the Human Genome Project in the late 1990s and early 2000s to today's population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provides free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI's database collection to complement the reviews of individual databases provided elsewhere in this issue. PMID:24271396

  2. Bioinformatics by Example: From Sequence to Target

    NASA Astrophysics Data System (ADS)

    Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

    2002-12-01

    With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience is set to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example the reader will experience this fascinating new discipline.

  3. Bioinformatics for Next Generation Sequencing Data

    PubMed Central

    Magi, Alberto; Benelli, Matteo; Gozzini, Alessia; Girolami, Francesca; Torricelli, Francesca; Brandi, Maria Luisa

    2010-01-01

    The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools fit into several general categories, including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used for the several steps of the data analysis workflow. PMID:24710047

  4. Comparative structural studies on Lys49-phospholipases A(2) from Bothrops genus reveal their myotoxic site.

    PubMed

    dos Santos, Juliana I; Soares, Andreimar Martins; Fontes, Marcos R M

    2009-08-01

    Phospholipases A(2) (PLA(2)s) are membrane-associated enzymes that hydrolyze phospholipids at the sn-2 position, releasing lysophospholipids and free fatty acids. Phospholipase A(2) homologues (Lys49-PLA(2)s) are highly myotoxic and cause extensive tissue damage despite not showing measurable catalytic activity. They are found in different snake venoms and represent one third of bothropic venom composition. The importance of these toxins during envenomation is related to the pronounced local myotoxic effect they induce, since this effect is not neutralized by serum therapy. We present herein three structures of Lys49-PLA(2)s from Bothrops genus snake venom crystallized under the same conditions, two of which were grown in the presence of alpha-tocopherol (vitamin E). Comparative structural analysis of these and other Lys49-PLA(2)s showed two different patterns of oligomeric conformation that are related to the presence or absence of ligands in the hydrophobic channel. This work also confirms the biological dimer indicated by recent studies in which both C-termini are in the dimeric interface. In this configuration, we propose that the myotoxic site of these toxins is composed of the Lys20, Lys115 and Arg118 residues. For the first time, a residue from the short-helix (Lys20) is suggested as a member of this site, and the importance of the Tyr119 residue to the myotoxicity of bothropic Lys49-PLA(2)s is also discussed. These results support a complete hypothesis for the myotoxic activity of these PLA(2)s consistent with all findings on bothropic Lys49-PLA(2)s studied up to this moment, including crystallographic, bioinformatics, biochemical and biophysical data. PMID:19401234

  5. Evolving Strategies for the Incorporation of Bioinformatics Within the Undergraduate Cell Biology Curriculum

    PubMed Central

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489

  6. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  7. Postgenomics: Proteomics and Bioinformatics in Cancer Research

    PubMed Central

    2003-01-01

    Now that the human genome is completed, the characterization of the proteins encoded by the sequence remains a challenging task. The study of the complete protein complement of the genome, the “proteome,” referred to as proteomics, will be essential if new therapeutic drugs and new disease biomarkers for early diagnosis are to be developed. Research efforts are already underway to develop the technology necessary to compare the specific protein profiles of diseased versus nondiseased states. These technologies provide a wealth of information and rapidly generate large quantities of data. Processing the large amounts of data will lead to useful predictive mathematical descriptions of biological systems which will permit rapid identification of novel therapeutic targets and identification of metabolic disorders. Here, we present an overview of the current status and future research approaches in defining the cancer cell's proteome in combination with different bioinformatics and computational biology tools toward a better understanding of health and disease. PMID:14615629

  8. Comparative genomics for understanding the structure, function and sub-cellular localization of hypothetical proteins in Thermanerovibrio acidaminovorans DSM 6589 (tai).

    PubMed

    Thakare, Hitesh S; Meshram, Dilip B; Jangam, Chandrakant M; Labhasetwar, Pawan; Roychoudhary, Kunal; Ingle, Arun B

    2016-04-01

    Thermanerovibrio acidaminovorans DSM 6589 (tai) is a unique bacterium isolated from an anaerobic sludge bed reactor at a sugar refinery in the Netherlands. Comparative genomic studies for understanding the hypothetical proteins in T. acidaminovorans DSM 6589 (tai) were carried out using different bioinformatic tools and web servers. In all, 320 hypothetical proteins were screened from the total available genome. In silico function prediction for the 320 hypothetical proteins was achieved using online servers such as CDD-BLAST, InterProScan and Pfam, whereas structure prediction for 202 hypothetical proteins was carried out using the protein structure prediction server (PS2 server). The sub-cellular localization of all 320 proteins was predicted using CELLO v2.5. The study has helped us to understand the structures and functions of unknown proteins in T. acidaminovorans DSM 6589 (tai) through a comparative genomic approach. PMID:26930563

  9. Comparative testing of nondestructive examination techniques for concrete structures

    NASA Astrophysics Data System (ADS)

    Clayton, Dwight A.; Smith, Cyrus M.

    2014-03-01

    A multitude of concrete-based structures are typically part of a light water reactor (LWR) plant to provide foundation, support, shielding, and containment functions. Concrete has been used in the construction of nuclear power plants (NPPs) because of three primary properties: its low cost, its structural strength, and its ability to shield radiation. Examples of concrete structures important to the safety of LWR plants include the containment building, spent fuel pool, and cooling towers. Comparative testing of the various NDE concrete measurement techniques requires concrete samples with known material properties, voids, internal microstructure flaws, and reinforcement locations. These samples can be artificially created under laboratory conditions where the various properties can be controlled. Other than NPPs, there are not many applications where critical concrete structures are as thick and reinforced. Therefore, few industries other than the nuclear power plant or power plant industry are interested in performing NDE on thick and reinforced concrete structures. This leads to a lack of readily available samples of thick and heavily reinforced concrete for performing NDE evaluations, research, and training. The industry that typically performs the most NDE on concrete structures is the bridge and roadway industry. While bridge and roadway structures are thinner and less reinforced, they have a good base of NDE research to support their field NDE programs to detect, identify, and repair concrete failures. This paper will summarize the initial comparative testing of two concrete samples with an emphasis on how these techniques could perform on NPP concrete structures.

  10. Structure and comparative morphology of camptotrichia of lungfish fins.

    PubMed

    Geraudie, J; Meunier, F J

    1984-01-01

    The present work is devoted to the organization and ultrastructure of the fin rays or camptotrichia of two living Dipnoi (lungfishes) Protopterus and Neoceratodus. In both species, these rods have a dual structure: only the superficial region facing the stratified epidermis is mineralized while the deep one is made of a dense unmineralized network of collagen fibrils forming a permanent pre-osseous tissue. Only the camptotrichia of Neoceratodus is made of cellular bone. This study confirms the structural peculiarities of these camptotrichia when compared to the dermal skeleton of the Actinopterygii constituted by the bony lepidotrichia and the actinotrichia. These results are discussed and compared to fossil dipnoan fin rays. PMID:6740649

  11. Structural Bioinformatics and Protein Docking Analysis of the Molecular Chaperone-Kinase Interactions: Towards Allosteric Inhibition of Protein Kinases by Targeting the Hsp90-Cdc37 Chaperone Machinery

    PubMed Central

    Lawless, Nathan; Blacklock, Kristin; Berrigan, Elizabeth; Verkhivker, Gennady

    2013-01-01

    A fundamental role of the Hsp90-Cdc37 chaperone system in mediating maturation of protein kinase clients and supporting kinase functional activity is essential for the integrity and viability of signaling pathways involved in cell cycle control and organism development. Despite significant advances in understanding structure and function of molecular chaperones, the molecular mechanisms and guiding principles of kinase recruitment to the chaperone system are lacking quantitative characterization. Structural and thermodynamic characterization of Hsp90-Cdc37 binding with protein kinase clients by modern experimental techniques is highly challenging, owing to a transient nature of chaperone-mediated interactions. In this work, we used experimentally-guided protein docking to probe the allosteric nature of the Hsp90-Cdc37 binding with the cyclin-dependent kinase 4 (Cdk4) kinase clients. The results of docking simulations suggest that the kinase recognition and recruitment to the chaperone system may be primarily determined by Cdc37 targeting of the N-terminal kinase lobe. The interactions of Hsp90 with the C-terminal kinase lobe may provide additional “molecular brakes” that can lock (or unlock) kinase from the system during client loading (release) stages. The results of this study support a central role of the Cdc37 chaperone in recognition and recruitment of the kinase clients. Structural analysis may have useful implications in developing strategies for allosteric inhibition of protein kinases by targeting the Hsp90-Cdc37 chaperone machinery. PMID:24287464

  12. Rapid Development of Bioinformatics Education in China

    ERIC Educational Resources Information Center

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  13. Biology in 'silico': The Bioinformatics Revolution.

    ERIC Educational Resources Information Center

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  14. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  15. Fuzzy Logic in Medicine and Bioinformatics

    PubMed Central

    Torres, Angela; Nieto, Juan J.

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes). PMID:16883057

  16. A Mathematical Optimization Problem in Bioinformatics

    ERIC Educational Resources Information Center

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
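
    As a rough illustration of the dynamic-programming formulation mentioned in this record, the sketch below computes an optimal global alignment score in the style of Needleman-Wunsch. The function name and the scoring values (match = 1, mismatch = -1, gap = -2) are assumptions chosen for the example; they are not taken from the article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b (Needleman-Wunsch style).

    The scoring scheme is an illustrative assumption: a linear gap penalty
    and a simple match/mismatch score.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Either align a[i-1] with b[j-1], or put a gap in one sequence.
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = score[i - 1][j] + gap
            gap_in_a = score[i][j - 1] + gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]
```

    The table has (len(a) + 1) x (len(b) + 1) cells and each cell is filled in constant time, so the running time is proportional to the product of the two sequence lengths. Recovering the alignment itself would additionally require a traceback through the table, which is omitted here.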

  17. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  18. Modelling and bioinformatics analysis of the dimeric structure of house dust mite allergens from families 5 and 21: Der f 5 could dimerize as Der p 5.

    PubMed

    Khemili, Souad; Kwasigroch, Jean Marc; Hamadouche, Tarik; Gilis, Dimitri

    2012-01-01

    Allergy represents an increasing threat to public health in both developed and emerging countries and the dust mites Dermatophagoides pteronyssinus (Der p), Blomia tropicalis (Blo t), Dermatophagoides farinae (Der f), Lepidoglyphus destructor (Lep d) and Suidasia medanensis (Sui m) strongly contribute to this problem. Their allergens are classified in several families, among which families 5 and 21 are the subject of this work. Indeed, their biological function as well as the mechanism or epitopes by which they contribute to the allergic response remain unknown, and their three-dimensional structures have not been resolved experimentally except for Blo t 5 and Der p 5. Blo t 5 is a monomeric three-helical bundle, whereas Der p 5 shows a three-helical bundle with a kinked N-terminal helix that assembles in an entangled dimeric structure with a large hydrophobic cavity. This cavity could be involved in the binding of hydrophobic ligands, which in turn could be responsible for the shift of the immune response from tolerance to allergic inflammation. We used molecular modelling approaches to determine whether other house dust mite allergens of families 5 and 21 (Der f 5, Sui m 5, Lep d 5, Der p 21 and Der f 21) could dimerize and form a large cavity in the same way as Der p 5. Monomeric models were first built with MODELLER using the experimental structures of Der p 5 and Blo t 5 as templates. The ClusPro server processed the selected monomers in order to assess their capacity to form dimeric structures, with a positive result for Der p 5 and Der f 5 only. The other allergens (Blo t 5, Sui m 5, Lep d 5, Der p 21 and Der f 21) did not present such a propensity. Moreover, we identified mutations that should destabilize and/or prevent the formation of the Der p 5 dimeric structure. The production of these mutated proteins could help us to understand the role of the dimerization process in the allergic response induced by Der p 5, and if Der p 5 and Der f 5 behave

  19. Technical phosphoproteomic and bioinformatic tools useful in cancer research.

    PubMed

    López, Elena; Wesselink, Jan-Jaap; López, Isabel; Mendieta, Jesús; Gómez-Puertas, Paulino; Muñoz, Sarbelio Rodríguez

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights for operation and connectivity of these pathways to facilitate identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichments, mass spectrometry (MS) coupled to bioinformatics tools is crucial for the identification and quantification of protein phosphorylation sites for advancing in such relevant clinical research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis of protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to be helpful for cancer research via detailing proteomics and bioinformatic tools. PMID:21967744

  20. Mathematics and evolutionary biology make bioinformatics education comprehensible

    PubMed Central

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621
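
    To give a sense of the tree enumeration topic mentioned in this record, the sketch below uses the standard counting result for fully resolved (binary) trees on n labelled taxa: (2n - 5)!! unrooted topologies for n >= 3 and (2n - 3)!! rooted topologies for n >= 2. The function names are hypothetical and the code is not taken from the article or from the BioQUEST tools.

```python
def odd_double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1, for odd k >= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa):
    """Number of unrooted binary tree topologies on n labelled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    return odd_double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa):
    """Number of rooted binary tree topologies on n labelled taxa: (2n - 3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted binary tree")
    return odd_double_factorial(2 * n_taxa - 3)
```

    For example, count_unrooted_trees(10) gives 2,027,025, which shows how quickly exhaustive search over topologies becomes impractical as taxa are added.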

  2. Bioinformatics clouds for big data manipulation

    PubMed Central

    2012-01-01

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475

  3. Design and bioinformatics analysis of genome-wide CLIP experiments

    PubMed Central

    Wang, Tao; Xiao, Guanghua; Chu, Yongjun; Zhang, Michael Q.; Corey, David R.; Xie, Yang

    2015-01-01

    The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP–RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses. PMID:25958398

  4. Accuracy of functional surfaces on comparatively modeled protein structures

    PubMed Central

    Zhao, Jieling; Dundas, Joe; Kachalo, Sema; Ouyang, Zheng; Liang, Jie

    2012-01-01

    Identification and characterization of protein functional surfaces are important for predicting protein function, understanding enzyme mechanism, and docking small compounds to proteins. As the accumulation of protein sequence information far outpaces that of structures, constructing accurate models of protein functional surfaces and identifying their key elements become increasingly important. A promising approach is to build comparative models from sequences using known structural templates such as those obtained from structural genome projects. Here we assess how well this approach works in modeling binding surfaces. By systematically building three-dimensional comparative models of proteins using Modeller, we determine how accurately functional surfaces can be reproduced. We use an alpha shape based pocket algorithm to compute all pockets on the modeled structures, and conduct a large-scale computation of similarity measurements (pocket RMSD and fraction of functional atoms captured) for 26,590 modeled enzyme protein structures. Overall, we find that when the sequence fragment of the binding surfaces has more than 45% identity to that of the template protein, the modeled surfaces have on average an RMSD of 0.5 Å, and contain 48% or more of the binding surface atoms, with nearly all of the important atoms in the signatures of binding pockets captured. PMID:21541664
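
    As a minimal sketch of the RMSD measure mentioned in this record, the function below computes the root-mean-square deviation between two equally sized lists of matched atom coordinates. It assumes the two structures have already been superimposed and that atoms are paired by position in the lists; the function name is hypothetical, and the exact pocket RMSD procedure used by the authors is not described in the abstract.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates.

    Assumes both structures are already superimposed and that
    coords_a[i] corresponds to coords_b[i].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    squared_distance_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_distance_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    return math.sqrt(squared_distance_sum / len(coords_a))
```

    With coordinates in ångströms, shifting every atom of a pocket by 0.5 Å along one axis gives an RMSD of 0.5 Å, the same magnitude as the average value reported above.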

  5. Computational Biology and Bioinformatics in Nigeria

    PubMed Central

    Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-01-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310

  6. Bioinformatic challenges in targeted proteomics.

    PubMed

    Reker, Daniel; Malmström, Lars

    2012-09-01

    Selected reaction monitoring mass spectrometry is an emerging targeted proteomics technology that allows for the investigation of complex protein samples with high sensitivity and efficiency. It requires extensive knowledge about the sample for the many parameters needed to carry out the experiment to be set appropriately. Most studies today rely on parameter estimation from prior studies, public databases, or from measuring synthetic peptides. This is efficient and sound, but in absence of prior data, de novo parameter estimation is necessary. Computational methods can be used to create an automated framework to address this problem. However, the number of available applications is still small. This review aims at giving an orientation on the various bioinformatical challenges. To this end, we state the problems in classical machine learning and data mining terms, give examples of implemented solutions and provide some room for alternatives. This will hopefully lead to an increased momentum for the development of algorithms and serve the needs of the community for computational methods. We note that the combination of such methods in an assisted workflow will ease both the usage of targeted proteomics in experimental studies as well as the further development of computational approaches. PMID:22866949

  7. Identifying human MHC supertypes using bioinformatic methods.

    PubMed

    Doytchinova, Irini A; Guan, Pingping; Flower, Darren R

    2004-04-01

    Classification of MHC molecules into supertypes in terms of peptide-binding specificities is an important issue, with direct implications for the development of epitope-based vaccines with wide population coverage. In view of extremely high MHC polymorphism (948 class I and 633 class II HLA alleles) the experimental solution of this task is presently impossible. In this study, we describe a bioinformatics strategy for classifying MHC molecules into supertypes using information drawn solely from three-dimensional protein structure. Two chemometric techniques, hierarchical clustering and principal component analysis, were used independently on a set of 783 HLA class I molecules to identify supertypes based on structural similarities and molecular interaction fields calculated for the peptide binding site. Eight supertypes were defined: A2, A3, A24, B7, B27, B44, C1, and C4. The two techniques gave 77% consensus, i.e., 605 HLA class I alleles were classified in the same supertype by both methods. The proposed strategy allowed "supertype fingerprints" to be identified. Thus, the A2 supertype fingerprint is Tyr(9)/Phe(9), Arg(97), and His(114) or Tyr(116); the A3: Tyr(9)/Phe(9)/Ser(9), Ile(97)/Met(97) and Glu(114) or Asp(116); the A24: Ser(9) and Met(97); the B7: Asn(63) and Leu(81); the B27: Glu(63) and Leu(81); the B44: Ala(81); the C1: Ser(77); and the C4: Asn(77). PMID:15034046

  8. PATRIC, the bacterial bioinformatics database and analysis resource

    PubMed Central

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  9. Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens

    PubMed Central

    Thomas, Reuben; Phuong, Jimmy; McHale, Cliona M.; Zhang, Luoping

    2012-01-01

    We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other. PMID:22851955
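
    The structural similarity measure named in this record, the Tanimoto coefficient, can be sketched as follows for binary fingerprints represented as sets of "on" bit positions. The function name and the treatment of two empty fingerprints are assumptions made for this example; the sketch does not reproduce how PubChem computes its 2D fingerprints.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bit indices."""
    bits_a, bits_b = set(bits_a), set(bits_b)
    union_size = len(bits_a | bits_b)
    if union_size == 0:
        # Assumption for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(bits_a & bits_b) / union_size
```

    The coefficient ranges from 0 (no shared bits) to 1 (identical fingerprints). For instance, the hypothetical fingerprints {1, 2, 3} and {2, 3, 4} share two of four distinct bits and score 0.5.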

  10. Bioinformatics and the undergraduate curriculum essay.

    PubMed

    Maloney, Mark; Parker, Jeffrey; Leblanc, Mark; Woodard, Craig T; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of bioinformatics as a new discipline has challenged many colleges and universities to keep current with their curricula, often in the face of static or dwindling resources. On the plus side, many bioinformatics modules and related databases and software programs are free and accessible online, and interdisciplinary partnerships between existing faculty members and their support staff have proved advantageous in such efforts. We present examples of strategies and methods that have been successfully used to incorporate bioinformatics content into undergraduate curricula. PMID:20810947

  11. Bioinformatics in Italy: BITS2011, the Eighth Annual Meeting of the Italian Society of Bioinformatics

    PubMed Central

    2012-01-01

    The BITS2011 meeting, held in Pisa on June 20-22, 2011, brought together more than 120 Italian researchers working in the field of Bioinformatics, as well as students in Bioinformatics, Computational Biology, Biology, Computer Sciences, and Engineering, representing a landscape of Italian bioinformatics research. This preface provides a brief overview of the meeting and introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement. PMID:22536954

  12. Identification and Analysis of Occludin Phosphosites: A Combined Mass Spectrometry and Bioinformatics Approach

    SciTech Connect

    Sundstrom, J.; Tash, B; Murakami, T; Flanagan, J; Bewley, M; Stanley, B; Gonsar, K; Antonetti, D

    2009-01-01

    The molecular function of occludin, an integral membrane component of tight junctions, remains unclear. VEGF-induced phosphorylation sites were mapped on occludin by combining MS data analysis with bioinformatics. In vivo phosphorylation of Ser490 was validated and protein interaction studies combined with crystal structure analysis suggest that Ser490 phosphorylation attenuates the interaction between occludin and ZO-1. This study demonstrates that combining MS data and bioinformatics can successfully identify novel phosphorylation sites from limiting samples.

  13. Comparative study of medium damped and detuned linear accelerator structures

    SciTech Connect

    Jean-Francois Ostiguy et al.

    2001-08-22

    Long range wakefields are a serious concern for a future linear collider based on room temperature accelerating structures. They can be suppressed by detuning, by local damping, or by some combination of both strategies. Detuning relies on precisely phasing the contributions of the dipole modes excited by the passage of a single bunch. This is accomplished by controlling individual mode frequencies, a process which dictates individual cell dimensional tolerances. Each mode must be excited with the correct strength; this, in turn, determines cell-to-cell alignment tolerances. In contrast, in a locally damped structure, the modes are attenuated at the cell level. Clearly, mode frequencies and relative excitation become less critical in that context; mechanical fabrication tolerances can be relaxed. While local damping is ideal from the standpoint of long range wakefield suppression, this comes at the cost of reducing the shunt impedance and possibly unacceptable localized heating. Recently, the Medium Damped Structure (MDS), a compromise between detuning and local damping, has generated some interest. In this paper, we compare a hypothetical MDS to the NLC Rounded Damped Detuned Structure (RDDS) and investigate possible advantages from the standpoint of fabrication tolerances and their relation to beam stability and emittance preservation.

  14. A comparative structural study of wet and dried ettringite

    SciTech Connect

    Renaudin, G.; Filinchuk, Y.; Neubauer, J.; Goetz-Neunhoeffer, F.

    2010-03-15

    Two different techniques were used to compare structural characteristics of 'wet' ettringite (stored in the synthesis mother liquid) and 'dried' ettringite (dried to 35% relative humidity over saturated CaCl2 solution). Lattice parameters and the water content in the channel region of the structure (site occupancy factor of the water molecule not bonded to cations) as well as microstructure parameters (size and strain) were determined from a Rietveld refinement on synchrotron powder diffraction data. Local environment of sulphate anions and of the hydrogen bonding network was characterized by Raman spectroscopy. Both techniques led to the same conclusion: the 'wet' ettringite sample immersed in the mother solution from the synthesis presents similar structural features as ettringite dried to 35% relative humidity. An increase of the a lattice parameter combined with a decrease of the c lattice parameter occurs on drying. The amount of structural water, the point symmetry of sulphate and the hydrogen bond network are unchanged when passing from the wet to the dried ettringite powder. Ettringite does not form a high-hydrate polymorph in equilibrium with alkaline solution, in contrast to the AFm phases that lose water molecules on drying. According to these results we conclude that ettringite precipitated in aqueous solution at the early hydration stages is of the same chemical composition as ettringite present in the hardening concrete.

  15. Comparing molecules and solids across structural and alchemical space.

    PubMed

    De, Sandip; Bartók, Albert P; Csányi, Gábor; Ceriotti, Michele

    2016-05-18

    Evaluating the (dis)similarity of crystalline, disordered and molecular compounds is a critical step in the development of algorithms to navigate automatically the configuration space of complex materials. For instance, a structural similarity metric is crucial for classifying structures, searching chemical space for better compounds and materials, and driving the next generation of machine-learning techniques for predicting the stability and properties of molecules and materials. In the last few years several strategies have been designed to compare atomic coordination environments. In particular, the smooth overlap of atomic positions (SOAP) has emerged as an elegant framework to obtain translation, rotation and permutation-invariant descriptors of groups of atoms, underlying the development of various classes of machine-learned inter-atomic potentials. Here we discuss how one can combine such local descriptors using a regularized entropy match (REMatch) approach to describe the similarity of both whole molecular and bulk periodic structures, introducing powerful metrics that enable the navigation of alchemical and structural complexities within a unified framework. Furthermore, using this kernel and a ridge regression method we can predict atomization energies for a database of small organic molecules with a mean absolute error below 1 kcal/mol, reaching an important milestone in the application of machine-learning techniques for the evaluation of molecular properties. PMID:27101873

  16. Comparative population structure of cavity-nesting sea ducks

    USGS Publications Warehouse

    Pearce, John M.; Eadie, John M.; Savard, Jean-Pierre L.; Christensen, Thomas K.; Berdeen, James; Taylor, Eric J.; Boyd, Sean; Einarsson, Árni

    2014-01-01

    A growing collection of mtDNA genetic information from waterfowl species across North America suggests that larger-bodied cavity-nesting species exhibit greater levels of population differentiation than smaller-bodied congeners. Although little is known about nest-cavity availability for these species, one hypothesis to explain differences in population structure is reduced dispersal tendency of larger-bodied cavity-nesting species due to limited abundance of large cavities. To investigate this hypothesis, we examined population structure of three cavity-nesting waterfowl species distributed across much of North America: Barrow's Goldeneye (Bucephala islandica), Common Goldeneye (B. clangula), and Bufflehead (B. albeola). We compared patterns of population structure using both variation in mtDNA control-region sequences and band-recovery data for the same species and geographic regions. Results were highly congruent between data types, showing structured population patterns for Barrow's and Common Goldeneye but not for Bufflehead. Consistent with our prediction, the smallest cavity-nesting species, the Bufflehead, exhibited the lowest level of population differentiation due to increased dispersal and gene flow. Results provide evidence for discrete Old and New World populations of Common Goldeneye and for differentiation of regional groups of both goldeneye species in Alaska, the Pacific Northwest, and the eastern coast of North America. Results presented here will aid management objectives that require an understanding of population delineation and migratory connectivity between breeding and wintering areas. Comparative studies such as this one highlight factors that may drive patterns of genetic diversity and population trends.

  17. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    NASA Astrophysics Data System (ADS)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical applications. The potential of these resources should be explored to discover new drugs for human wealth. However, bioactivity screening using conventional methods is very expensive and time-consuming. Therefore, we developed a bioinformatics-based methodology for screening the potential of natural resources. The method is based on the fact that organisms in the same taxon tend to have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with well-known taxa containing the active compound, using published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell were examined to determine the mechanism of action of the active compound at the cellular level, as well as to predict its side effects and toxicity. Using this method, we succeeded in screening anticancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anticancer agent from a marine invertebrate by employing the method. The anticancer agent was explored based on compounds isolated from the marine invertebrate, as reported in published articles and databases; the protein target was then identified, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate, and then examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity as a source of drugs and another

  18. CAPweb: a bioinformatics CGH array Analysis Platform.

    PubMed

    Liva, Stéphane; Hupé, Philippe; Neuvial, Pierre; Brito, Isabel; Viara, Eric; La Rosa, Philippe; Barillot, Emmanuel

    2006-07-01

    Assessing variations in DNA copy number is crucial for understanding constitutional or somatic diseases, particularly cancers. The recently developed array-CGH (comparative genomic hybridization) technology allows this to be investigated at the genomic level. We report the availability of a web tool for analysing array-CGH data. CAPweb (CGH array Analysis Platform on the Web) is intended as a user-friendly tool enabling biologists to completely analyse CGH arrays from the raw data to the visualization and biological interpretation. The user typically performs the following bioinformatics steps of a CGH array project within CAPweb: the secure upload of the results of CGH array image analysis and of the array annotation (genomic position of the probes); first level analysis of each array, including automatic normalization of the data (for correcting experimental biases), breakpoint detection and status assignment (gain, loss or normal); validation or deletion of the analysis based on a summary report and quality criteria; visualization and biological analysis of the genomic profiles and results through a user-friendly interface. CAPweb is accessible at http://bioinfo.curie.fr/CAPweb. PMID:16845053
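
    The first-level analysis steps named above (normalization, breakpoint detection, status assignment) can be sketched in a few lines. This is a deliberately simplified illustration and not CAPweb's algorithm: it assumes log2 ratios already ordered by genomic position, uses median centering as the normalization, hypothetical thresholds of +0.25 and -0.25 for gain and loss, and treats a breakpoint simply as a change of status between neighbouring probes.

```python
from statistics import median


def normalize(log_ratios):
    """Median-center log2 ratios so a typical unaltered probe sits near 0."""
    center = median(log_ratios)
    return [x - center for x in log_ratios]


def assign_status(log_ratios, gain=0.25, loss=-0.25):
    """Label each probe 'gain', 'loss' or 'normal' (thresholds are hypothetical)."""
    return ["gain" if x >= gain else "loss" if x <= loss else "normal"
            for x in log_ratios]


def breakpoints(statuses):
    """Indices at which the status differs from that of the previous probe."""
    return [i for i in range(1, len(statuses))
            if statuses[i] != statuses[i - 1]]
```

    Real array-CGH pipelines usually segment the noisy signal before assigning a status; the per-probe thresholding here only shows how the three steps relate to each other.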

  19. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    PubMed

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progress made by Chinese bioinformaticians in the field of PTM bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. Through comparing similar studies in China and other countries, we demonstrate both advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China. PMID:26351162

  20. High-throughput protein analysis integrating bioinformatics and experimental assays.

    PubMed

    del Val, Coral; Mehrle, Alexander; Falkenhahn, Mechthild; Seiler, Markus; Glatting, Karl-Heinz; Poustka, Annemarie; Suhai, Sandor; Wiemann, Stefan

    2004-01-01

    The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administrating relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins. PMID:14762202

  1. Bioinformatics process management: information flow via a computational journal

    PubMed Central

    Feagan, Lance; Rohrer, Justin; Garrett, Alexander; Amthauer, Heather; Komp, Ed; Johnson, David; Hock, Adam; Clark, Terry; Lushington, Gerald; Minden, Gary; Frost, Victor

    2007-01-01

    This paper presents the Bioinformatics Computational Journal (BCJ), a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread–ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples. PMID:18053179

  2. Carving a niche: establishing bioinformatics collaborations

    PubMed Central

    Lyon, Jennifer A.; Tennant, Michele R.; Messner, Kevin R.; Osterbur, David L.

    2006-01-01

    Objectives: The paper describes collaborations and partnerships developed between library bioinformatics programs and other bioinformatics-related units at four academic institutions. Methods: A call for information on bioinformatics partnerships was made via email to librarians who have participated in the National Center for Biotechnology Information's Advanced Workshop for Bioinformatics Information Specialists. Librarians from Harvard University, the University of Florida, the University of Minnesota, and Vanderbilt University responded and expressed willingness to contribute information on their institutions, programs, services, and collaborating partners. Similarities and differences in programs and collaborations were identified. Results: The four librarians have developed partnerships with other units on their campuses that can be categorized into the following areas: knowledge management, instruction, and electronic resource support. All primarily support freely accessible electronic resources, while other campus units deal with fee-based ones. These demarcations are apparent in resource provision as well as in subsequent support and instruction. Conclusions and Recommendations: Through environmental scanning and networking with colleagues, librarians who provide bioinformatics support can develop fruitful collaborations. Visibility is key to building collaborations, as is broad-based thinking in terms of potential partners. PMID:16888668

  3. Comparative sequence-structure analysis of Aves insulin

    PubMed Central

    Islam, Md Mirazul; Aktaruzzaman, M; Mohamed, Zahurin

    2015-01-01

    Normal blood glucose level depends on the availability of insulin and its ability to bind the insulin receptor (IR), which regulates the downstream signaling pathway. Insulin sequence and blood glucose level usually vary among animals due to species specificity. The study of genetic variation of insulin, blood glucose level and the development of diabetic symptoms in Aves is interesting because their optimal blood glucose level is higher than that of mammals. Therefore, it is of interest to study its evolutionary relationship with mammals using sequence data. Hence, we compiled 32 Aves insulin sequences from GenBank to compare their sequence-structure features with phylogeny for evolutionary inference. The analysis shows long conserved motifs (about 14 residues) for functional inference. These sequences show high leucine content (20%) with a high instability index (>40). Amino acid positions 11, 14, 16 and 20 are variable and may contribute to binding to the IR. We identified functionally critical variable residues in the dataset for possible genetic implication. Structural models of these sequences were developed for surface analysis towards functional representation. These data find application in the understanding of insulin function across species. PMID:25848166
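
    Two of the sequence features mentioned above, residue content and variable alignment positions, are simple to compute. The sketch below assumes equal-length, already aligned sequences; the function names are hypothetical, and any sequences passed to it in an example should be read as toy input, not as the insulin data used in the study.

```python
def residue_fraction(seq, residue="L"):
    """Fraction of positions in seq occupied by the given residue (leucine by default)."""
    return seq.count(residue) / len(seq)


def variable_positions(aligned):
    """1-based alignment columns that contain more than one residue type."""
    return [i + 1 for i, column in enumerate(zip(*aligned))
            if len(set(column)) > 1]
```

    For instance, with the invented toy alignment ["LLAVK", "LLGVK", "LLAVR"], the leucine fraction of the first sequence is 2/5 and the variable columns are 3 and 5.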

  4. Bioinformatics Approaches to Classifying Allergens and Predicting Cross-Reactivity

    PubMed Central

    Schein, Catherine H.; Ivanciuc, Ovidiu; Braun, Werner

    2007-01-01

    The major advances in understanding why patients respond to several seemingly different stimuli have been through the isolation, sequencing and structural analysis of proteins that induce an IgE response. The most significant finding is that allergenic proteins from very different sources can have nearly identical sequences and structures, and that this similarity can account for clinically observed cross-reactivity. The increasing amount of information on the sequence, structure and IgE epitopes of allergens is now available in several databases and powerful bioinformatics search tools allow user access to relevant information. Here, we provide an overview of these databases and describe state-of-the-art bioinformatics tools to identify the common proteins that may be at the root of multiple allergy syndromes. Progress has also been made in quantitatively defining characteristics that discriminate allergens from non-allergens. Search and software tools for this purpose have been developed and implemented in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/). SDAP contains information for over 800 allergens and extensive bibliographic references in a relational database with links to other publicly available databases. SDAP is freely available on the Web to clinicians and patients, and can be used to find structural and functional relations among known allergens and to identify potentially cross-reacting antigens. Here we illustrate how these bioinformatics tools can be used to group allergens, and to detect areas that may account for common patterns of IgE binding and cross-reactivity. Such results can be used to guide treatment regimens for allergy sufferers. PMID:17276876

  5. Non-structural carbohydrates in woody plants compared among laboratories.

    PubMed

    Quentin, Audrey G; Pinkard, Elizabeth A; Ryan, Michael G; Tissue, David T; Baggett, L Scott; Adams, Henry D; Maillard, Pascale; Marchand, Jacqueline; Landhäusser, Simon M; Lacointe, André; Gibon, Yves; Anderegg, William R L; Asao, Shinichi; Atkin, Owen K; Bonhomme, Marc; Claye, Caroline; Chow, Pak S; Clément-Vidal, Anne; Davies, Noel W; Dickman, L Turin; Dumbur, Rita; Ellsworth, David S; Falk, Kristen; Galiano, Lucía; Grünzweig, José M; Hartmann, Henrik; Hoch, Günter; Hood, Sharon; Jones, Joanna E; Koike, Takayoshi; Kuhlmann, Iris; Lloret, Francisco; Maestro, Melchor; Mansfield, Shawn D; Martínez-Vilalta, Jordi; Maucourt, Mickael; McDowell, Nathan G; Moing, Annick; Muller, Bertrand; Nebauer, Sergio G; Niinemets, Ülo; Palacio, Sara; Piper, Frida; Raveh, Eran; Richter, Andreas; Rolland, Gaëlle; Rosas, Teresa; Saint Joanis, Brigitte; Sala, Anna; Smith, Renee A; Sterck, Frank; Stinziano, Joseph R; Tobias, Mari; Unda, Faride; Watanabe, Makoto; Way, Danielle A; Weerasinghe, Lasantha K; Wild, Birgit; Wiley, Erin; Woodruff, David R

    2015-11-01

    Non-structural carbohydrates (NSC) in plant tissue are frequently quantified to make inferences about plant responses to environmental conditions. Laboratories publishing estimates of NSC of woody plants use many different methods to evaluate NSC. We asked whether NSC estimates in the recent literature could be quantitatively compared among studies. We also asked whether any differences among laboratories were related to the extraction and quantification methods used to determine starch and sugar concentrations. These questions were addressed by sending sub-samples collected from five woody plant tissues, which varied in NSC content and chemical composition, to 29 laboratories. Each laboratory analyzed the samples with their laboratory-specific protocols, based on recent publications, to determine concentrations of soluble sugars, starch and their sum, total NSC. Laboratory estimates differed substantially for all samples. For example, estimates for Eucalyptus globulus leaves (EGL) varied from 23 to 116 (mean = 56) mg g⁻¹ for soluble sugars, 6-533 (mean = 94) mg g⁻¹ for starch and 53-649 (mean = 153) mg g⁻¹ for total NSC. Mixed model analysis of variance showed that much of the variability among laboratories was unrelated to the categories we used for extraction and quantification methods (method category R² = 0.05-0.12 for soluble sugars, 0.10-0.33 for starch and 0.01-0.09 for total NSC). For EGL, the difference between the highest and lowest least squares means for categories in the mixed model analysis was 33 mg g⁻¹ for total NSC, compared with the range of laboratory estimates of 596 mg g⁻¹. Laboratories were reasonably consistent in their ranks of estimates among tissues for starch (r = 0.41-0.91), but less so for total NSC (r = 0.45-0.84) and soluble sugars (r = 0.11-0.83). Our results show that NSC estimates for woody plant tissues cannot be compared among laboratories. The relative changes in NSC between treatments measured within a laboratory

  6. BioZone: Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    SciTech Connect

    Liu, L; Buttler, D; Paques, H; Pu, C; Critchlow

    2002-01-28

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.
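
    The query routing idea described above, pruning candidate sources level by level against independently maintained capability profiles, can be sketched as follows. The profile fields, the three pruning levels and all names are assumptions chosen for illustration; the abstract does not specify the actual profile structure or algorithms.

```python
from dataclasses import dataclass, field


@dataclass
class SourceProfile:
    """Hypothetical capability profile of one data source."""
    name: str
    data_types: set   # kinds of data served, e.g. {"protein_sequence"}
    operations: set   # supported operations, e.g. {"homology_search"}
    attributes: set   # fields the source can filter on


@dataclass
class Query:
    data_type: str
    operation: str
    conditions: dict = field(default_factory=dict)


def route(query, profiles):
    """Return names of sources that survive every pruning level."""
    # Level 1: keep sources that hold the requested kind of data.
    candidates = [p for p in profiles if query.data_type in p.data_types]
    # Level 2: keep sources that support the requested operation.
    candidates = [p for p in candidates if query.operation in p.operations]
    # Level 3: keep sources able to evaluate every query condition.
    candidates = [p for p in candidates
                  if set(query.conditions) <= p.attributes]
    return [p.name for p in candidates]
```

    Cheap, coarse checks run first so that the more detailed condition matching is applied only to the few sources that remain.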

  7. BioZoom: Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    SciTech Connect

    Liu, L; Buttler, D; Critchlow, T J; Han, W; Paques, H; Pu, C; Rocco, D

    2003-01-09

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.

  8. BioShaDock: a community driven bioinformatics shared Docker-based tools registry

    PubMed Central

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community. PMID:26913191

  9. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    PubMed

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community. PMID:26913191

  10. Structure, function and evolution of the gas exchangers: comparative perspectives

    PubMed Central

    Maina, JN

    2002-01-01

    Over the evolutionary continuum, animals have faced similar fundamental challenges of acquiring molecular oxygen for aerobic metabolism. Under limitations and constraints imposed by factors such as phylogeny, behaviour, body size and environment, they have responded differently in founding optimal respiratory structures. A quintessence of the aphorism that ‘necessity is the mother of invention’, gas exchangers have been inaugurated through stiff cost–benefit analyses that have evoked transaction of trade-offs and compromises. Cogent structural–functional correlations occur in constructions of gas exchangers: within and between taxa, morphological complexity and respiratory efficiency increase with metabolic capacities and oxygen needs. Highly active, small endotherms have relatively better-refined gas exchangers compared with large, inactive ectotherms. Respiratory structures have developed from the plain cell membrane of the primeval prokaryotic unicells to complex multifunctional ones of the modern Metazoa. Regarding the respiratory medium used to extract oxygen from, animal life has had only two choices – water or air – within the biological range of temperature and pressure the only naturally occurring respirable fluids. In rarer cases, certain animals have adapted to using both media. Gills (evaginated gas exchangers) are the primordial respiratory organs: they are the archetypal water breathing organs. Lungs (invaginated gas exchangers) are the model air breathing organs. Bimodal (transitional) breathers occupy the water–air interface. Presentation and exposure of external (water/air) and internal (haemolymph/blood) respiratory media, features determined by geometric arrangement of the conduits, are important features for gas exchange efficiency: counter-current, cross-current, uniform pool and infinite pool designs have variably developed. PMID:12430953

  11. Bioinformatics Approaches for Predicting Disordered Protein Motifs.

    PubMed

    Bhowmick, Pallab; Guharoy, Mainak; Tompa, Peter

    2015-01-01

    Short, linear motifs (SLiMs) in proteins are functional microdomains consisting of contiguous residue segments along the protein sequence, typically not more than 10 consecutive amino acids in length with less than 5 defined positions. Many positions are 'degenerate', thus offering flexibility in terms of the amino acid types allowed at those positions. Their short length and degenerate nature confer evolutionary plasticity, meaning that SLiMs often evolve convergently. Further, SLiMs have a propensity to occur within intrinsically unstructured protein segments and this confers versatile functionality to unstructured regions of the proteome. SLiMs mediate multiple types of protein interactions based on domain-peptide recognition and guide functions including posttranslational modifications, subcellular localization of proteins, and ligand binding. SLiMs thus behave as modular interaction units that confer versatility to protein function and SLiM-mediated interactions are increasingly being recognized as therapeutic targets. In this chapter we start with a brief description of the properties of SLiMs and their interactions and then move on to discuss algorithms and tools including several web-based methods that enable the discovery of novel SLiMs (de novo motif discovery) as well as the prediction of novel occurrences of known SLiMs. Both individual amino acid sequences as well as sets of protein sequences can be scanned using these methods to obtain statistically overrepresented sequence patterns. Lists of putatively functional SLiMs are then assembled based on parameters such as evolutionary sequence conservation, disorder scores, structural data, gene ontology terms and other contextual information that helps to assess the functional credibility or significance of these motifs. These bioinformatics methods should certainly guide experiments aimed at motif discovery. PMID:26387106
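
    Predicting new occurrences of a known SLiM often starts from a pattern with a few defined positions and several degenerate ones. The sketch below writes such a pattern as a regular expression and reports every occurrence, including overlapping ones. The function name is hypothetical, and the example pattern "P..P" (the PxxP core commonly associated with SH3-domain binding) is used only as an illustration; the tools reviewed in the chapter additionally filter hits by conservation, disorder and other context.

```python
import re


def find_motif(seq, pattern):
    """Return (1-based start, matched segment) for each occurrence of pattern.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq)]
```

    Because such patterns are short and degenerate, raw matches are frequent by chance, which is why the contextual filters listed above matter in practice.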

  12. Study on the Response Coefficient of Setback Structures Compared to Regular Moment Frame Structures

    SciTech Connect

    Mirghaderi, S. Rasoul; Khafaf, Bardia; Epackachi, Siamak

    2008-07-08

    In the design practice of many countries, seismic analysis and proportioning of structures are usually based on linear elastic analysis, with seismic forces reduced by a response coefficient, R. Setback structures are among the most popular building shapes. In setback structures, the shape and proportions of the building have a major effect on the distribution of earthquake forces as they work their way through the building. In addition, geometric configuration has a profound effect on the structural-dynamic response of a building. Therefore, when a building has irregular features, such as asymmetry in height or vertical discontinuity, the traditional assumptions used in the development of seismic criteria for regular buildings may not be applicable. The inelastic seismic behavior of these types of structures seems to be quite different from that of regular steel moment-resisting structures, in which the overall ductility is localized at beam ends. In order to investigate the seismic behavior and estimate the response coefficient of those structures, nonlinear static (pushover) analyses are used for three categories of setback structures, namely low-rise, medium-rise and high-rise buildings with different setbacks along their height. The response coefficients are calculated and compared with those of regular moment frame structures.

  13. Implementing bioinformatic workflows within the bioextract server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  14. Bioinformatics in Undergraduate Education: Practical Examples

    ERIC Educational Resources Information Center

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  15. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    EPA Science Inventory

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  16. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  17. Bioinformatics: A History of Evolution "In Silico"

    ERIC Educational Resources Information Center

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  18. "Extreme Programming" in a Bioinformatics Class

    ERIC Educational Resources Information Center

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  19. 2010 Translational bioinformatics year in review

    PubMed Central

    Miller, Katharine S

    2011-01-01

    A review of 2010 research in translational bioinformatics provides much to marvel at. We have seen notable advances in personal genomics, pharmacogenetics, and sequencing. At the same time, the infrastructure for the field has burgeoned. While acknowledging that, according to researchers, the members of this field tend to be overly optimistic, the authors predict a bright future. PMID:21672905

  20. KDE Bioscience: platform for bioinformatics analysis workflows.

    PubMed

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards combined with unfriendly interfaces make it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or 3rd-party viewers are built-in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research. PMID:16260186

  1. Applications of Support Vector Machines In Chemo And Bioinformatics

    NASA Astrophysics Data System (ADS)

    Jayaraman, V. K.; Sundararajan, V.

    2010-10-01

    Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.

  2. Navigating the changing learning landscape: perspective from bioinformatics.ca

    PubMed Central

    Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  3. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    PubMed

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  4. A novel approach to represent and compare RNA secondary structures

    PubMed Central

    Mattei, Eugenio; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela

    2014-01-01

    Structural information is crucial in ribonucleic acid (RNA) analysis and functional annotation; nevertheless, how to include such structural data is still a debated problem. Dot-bracket notation is the most common and simple representation for RNA secondary structures, but its simplicity also leads to ambiguity that requires further processing steps to resolve. Here we present BEAR (Brand nEw Alphabet for RNA), a new context-aware structural encoding represented by a string of characters. Each character in BEAR encodes a specific secondary structure element (loop, stem, bulge and internal loop) with a specific length. Furthermore, exploiting this informative and yet simple encoding in multiple alignments of related RNAs, we captured how much structural variation is tolerated in RNA families and converted it into transition rates among secondary structure elements. This allowed us to compute a substitution matrix for secondary structure elements called MBR (Matrix of BEAR-encoded RNA secondary structures), whose ability to align RNA secondary structures we tested. We propose BEAR and the MBR as powerful resources for RNA secondary structure analysis, comparison and classification, motif finding and phylogeny. PMID:24753415
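
    To make the "further processing" that dot-bracket strings need more concrete, here is a minimal Python sketch that recovers base-pair partners from a dot-bracket string with a stack. It illustrates the plain notation only and does not implement the BEAR encoding or the MBR; the function name dot_bracket_pairs and the example structures are assumptions made for this illustration.

```python
def dot_bracket_pairs(structure):
    """Return sorted 0-based (i, j) index pairs of matched brackets.

    '(' opens a base pair, ')' closes the most recently opened one,
    and '.' marks an unpaired position.
    """
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# A single hairpin: a two-base-pair stem closing a two-nucleotide loop.
print(dot_bracket_pairs("((..))"))
```

    Note that the characters themselves do not say whether an unpaired '.' belongs to a hairpin loop, a bulge or an internal loop; that only becomes clear after the pairs have been worked out, which is the kind of context the abstract says BEAR encodes directly.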

  5. Agile parallel bioinformatics workflow management using Pwrake

    PubMed Central

    2011-01-01

    Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. 
Furthermore, readability

  6. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Xiang, Fang; Ningqiu, Li; Xiaozhe, Fu; Kaibin, Li; Qiang, Lin; Lihui, Liu; Cunbin, Shi; Shuqin, Wu

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects. PMID:26351170

  7. Component-Based Approach for Educating Students in Bioinformatics

    ERIC Educational Resources Information Center

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  8. Combining Bioinformatics and Phylogenetics to Identify Large Sets of Single-Copy Orthologous Genes (COSII) for Comparative, Evolutionary and Systematic Studies: A Test Case in the Euasterid Plant Clade

    PubMed Central

    Wu, Feinan; Mueller, Lukas A.; Crouzillat, Dominique; Pétiard, Vincent; Tanksley, Steven D.

    2006-01-01

    We report herein the application of a set of algorithms to identify a large number (2869) of single-copy orthologs (COSII), which are shared by most, if not all, euasterid plant species as well as the model species Arabidopsis. Alignments of the orthologous sequences across multiple species enabled the design of “universal PCR primers,” which can be used to amplify the corresponding orthologs from a broad range of taxa, including those lacking any sequence databases. Functional annotation revealed that these conserved, single-copy orthologs encode a higher-than-expected frequency of proteins transported and utilized in organelles and a paucity of proteins associated with cell walls, protein kinases, transcription factors, and signal transduction. The enabling power of this new ortholog resource was demonstrated in phylogenetic studies, as well as in comparative mapping across the plant families tomato (family Solanaceae) and coffee (family Rubiaceae). The combined results of these studies provide compelling evidence that (1) the ancestral species that gave rise to the core euasterid families Solanaceae and Rubiaceae had a basic chromosome number of x = 11 or 12, and (2) no whole-genome duplication event (i.e., polyploidization) occurred immediately prior to or after the radiation of either Solanaceae or Rubiaceae as has been recently suggested. PMID:16951058

  9. Quantifying variances in comparative RNA secondary structure prediction

    PubMed Central

    2013-01-01

    Background With the advancement of next-generation sequencing and transcriptomics technologies, regulatory effects involving RNA, in particular RNA structural changes, are being detected. These results often rely on RNA secondary structure predictions. However, current approaches to RNA secondary structure modelling produce predictions with a high variance in predictive accuracy, and we have little quantifiable knowledge about the reasons for these variances. Results In this paper we explore a number of factors which can contribute to poor RNA secondary structure prediction quality. We establish a quantified relationship between alignment quality and loss of accuracy. Furthermore, we define two new measures to quantify uncertainty in alignment-based structure predictions. One of the measures improves on the “reliability score” reported by PPfold, and considers alignment uncertainty as well as base-pair probabilities. The other measure considers the information entropy for SCFGs over a space of input alignments. Conclusions Our predictive accuracy improves on the PPfold reliability score. We can successfully characterize many of the underlying reasons for and variances in poor prediction. However, there is still variability unaccounted for, which we therefore suggest comes from the RNA secondary structure predictive model itself. PMID:23634662
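
    For readers unfamiliar with information entropy as an uncertainty measure, the following Python sketch computes the Shannon entropy of a discrete probability distribution. It shows the general concept only and is not the SCFG-based measure defined in the paper; the function name shannon_entropy and the example distributions are assumptions made for illustration.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution.

    Zero-probability outcomes contribute nothing and are skipped.
    """
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical example: two equally likely candidate structures are
# maximally uncertain (1 bit); a single certain structure has 0 bits.
print(shannon_entropy([0.5, 0.5]))
print(shannon_entropy([1.0]))
```

    Higher entropy means the probability mass is spread over more alternatives, which is why entropy-style quantities are a natural way to express how uncertain a prediction is.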

  10. Can bioinformatics help in the identification of moonlighting proteins?

    PubMed

    Hernández, Sergio; Calvo, Alejandra; Ferragut, Gabriela; Franco, Luís; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2014-12-01

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large amounts of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and they are a good complement to the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place. PMID:25399591

  11. Bioinformatic characterization of plant networks

    SciTech Connect

    McDermott, Jason E.; Samudrala, Ram

    2008-06-30

    Cells and organisms are governed by networks of interactions, genetic, physical and metabolic. Large-scale experimental studies of interactions between components of biological systems have been performed for a variety of eukaryotic organisms. However, there is a dearth of such data for plants. Computational methods for prediction of relationships between proteins, primarily based on comparative genomics, provide a useful systems-level view of cellular functioning and can be used to extend information about other eukaryotes to plants. We have predicted networks for Arabidopsis thaliana, Oryza sativa indica and japonica and several plant pathogens using the Bioverse (http://bioverse.compbio.washington.edu) and show that they are similar to experimentally-derived interaction networks. Predicted interaction networks for plants can be used to provide novel functional annotations and predictions about plant phenotypes and aid in rational engineering of biosynthesis pathways.

  12. A toolbox for developing bioinformatics software

    PubMed Central

    Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.

    2012-01-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  13. A toolbox for developing bioinformatics software.

    PubMed

    Rother, Kristian; Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M

    2012-03-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  14. Novel bioinformatic developments for exome sequencing.

    PubMed

    Lelieveld, Stefan H; Veltman, Joris A; Gilissen, Christian

    2016-06-01

    With the widespread adoption of next generation sequencing technologies by the genetics community and the rapid decrease in costs per base, exome sequencing has become a standard within the repertoire of genetic experiments for both research and diagnostics. Although bioinformatics now offers standard solutions for the analysis of exome sequencing data, many challenges still remain; especially the increasing scale at which exome data are now being generated has given rise to novel challenges in how to efficiently store, analyze and interpret exome data of this magnitude. In this review we discuss some of the recent developments in bioinformatics for exome sequencing and the directions that this is taking us to. With these developments, exome sequencing is paving the way for the next big challenge, the application of whole genome sequencing. PMID:27075447

  15. Translational bioinformatics applications in genome medicine

    PubMed Central

    2009-01-01

    Although investigators using methodologies in bioinformatics have always been useful in genomic experimentation in analytic, engineering, and infrastructure support roles, only recently have bioinformaticians been able to have a primary scientific role in asking and answering questions on human health and disease. Here, I argue that this shift in role towards asking questions in medicine is now the next step needed for the field of bioinformatics. I outline four reasons why bioinformaticians are newly enabled to drive the questions in primary medical discovery: public availability of data, intersection of data across experiments, commoditization of methods, and streamlined validation. I also list four recommendations for bioinformaticians wishing to get more involved in translational research. PMID:19566916

  16. Bioinformatics in New Generation Flavivirus Vaccines

    PubMed Central

    Koraka, Penelope; Martina, Byron E. E.; Osterhaus, Albert D. M. E.

    2010-01-01

    Flavivirus infections are the most prevalent arthropod-borne infections worldwide, often causing severe disease, especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bioinformatics is applied to assist in the rational design and improvement of vaccines, particularly flavivirus vaccines, are presented and discussed. PMID:20467477

  17. Discovery and Classification of Bioinformatics Web Services

    SciTech Connect

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  18. Comparing two tetraalkylammonium ionic liquids. I. Liquid phase structure

    NASA Astrophysics Data System (ADS)

    Lima, Thamires A.; Paschoal, Vitor H.; Faria, Luiz F. O.; Ribeiro, Mauro C. C.; Giles, Carlos

    2016-06-01

    X-ray scattering experiments at room temperature were performed for the ionic liquids n-butyl-trimethylammonium bis(trifluoromethanesulfonyl)imide, [N1114][NTf2], and methyl-tributylammonium bis(trifluoromethanesulfonyl)imide, [N1444][NTf2]. The peak in the diffraction data characteristic of charge ordering in [N1444][NTf2] is shifted to longer distances in comparison to [N1114][NTf2], but the peak characteristic of short-range correlations is shifted in [N1444][NTf2] to shorter distances. Molecular dynamics (MD) simulations were performed for these ionic liquids using force fields available from the literature, although with new sets of partial charges for [N1114]+ and [N1444]+ proposed in this work. The shifting of charge and adjacency peaks to opposite directions in these ionic liquids was found in the static structure factor, S(k), calculated by MD simulations. Despite differences in cation sizes, the MD simulations unravel that anions are allowed as close to [N1444]+ as to [N1114]+ because anions are located in between the angle formed by the butyl chains. The more asymmetric molecular structure of the [N1114]+ cation implies differences in partial structure factors calculated for atoms belonging to polar or non-polar parts of [N1114][NTf2], whereas polar and non-polar structure factors are essentially the same in [N1444][NTf2]. Results of this work shed light on controversies in the literature on the liquid structure of tetraalkylammonium based ionic liquids.

  19. Comparing two tetraalkylammonium ionic liquids. I. Liquid phase structure.

    PubMed

    Lima, Thamires A; Paschoal, Vitor H; Faria, Luiz F O; Ribeiro, Mauro C C; Giles, Carlos

    2016-06-14

    X-ray scattering experiments at room temperature were performed for the ionic liquids n-butyl-trimethylammonium bis(trifluoromethanesulfonyl)imide, [N1114][NTf2], and methyl-tributylammonium bis(trifluoromethanesulfonyl)imide, [N1444][NTf2]. The peak in the diffraction data characteristic of charge ordering in [N1444][NTf2] is shifted to longer distances in comparison to [N1114][NTf2], but the peak characteristic of short-range correlations is shifted in [N1444][NTf2] to shorter distances. Molecular dynamics (MD) simulations were performed for these ionic liquids using force fields available from the literature, although with new sets of partial charges for [N1114](+) and [N1444](+) proposed in this work. The shifting of charge and adjacency peaks to opposite directions in these ionic liquids was found in the static structure factor, S(k), calculated by MD simulations. Despite differences in cation sizes, the MD simulations unravel that anions are allowed as close to [N1444](+) as to [N1114](+) because anions are located in between the angle formed by the butyl chains. The more asymmetric molecular structure of the [N1114](+) cation implies differences in partial structure factors calculated for atoms belonging to polar or non-polar parts of [N1114][NTf2], whereas polar and non-polar structure factors are essentially the same in [N1444][NTf2]. Results of this work shed light on controversies in the literature on the liquid structure of tetraalkylammonium based ionic liquids. PMID:27306015

  20. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. Results The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  1. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  2. Why Polyphenols have Promiscuous Actions? An Investigation by Chemical Bioinformatics.

    PubMed

    Tang, Guang-Yan

    2016-05-01

    Despite their diverse pharmacological effects, polyphenols make poor drugs, which has traditionally been ascribed to their low bioavailability. However, Baell and co-workers recently proposed that the redox potential of polyphenols also plays an important role in this, because redox reactions bring promiscuous actions on various protein targets and thus produce non-specific pharmacological effects. To investigate whether redox reactivity is a critical factor in polyphenol promiscuity, we performed a chemical bioinformatics analysis on the structure-activity relationships of twenty polyphenols. It was found that the gene expression profiles of human cell lines induced by polyphenols were not correlated with the presence or absence of redox moieties in the polyphenols, but significantly correlated with their molecular structures. Therefore, it is concluded that the promiscuous actions of polyphenols are likely to result from their inherent structural features rather than their redox potential. PMID:27319142

  3. How do disordered regions achieve comparable functions to structured domains?

    PubMed Central

    Latysheva, Natasha S; Flock, Tilman; Weatheritt, Robert J; Chavali, Sreenivas; Babu, M Madan

    2015-01-01

    The traditional structure to function paradigm conceives of a protein's function as emerging from its structure. In recent years, it has been established that unstructured, intrinsically disordered regions (IDRs) in proteins are equally crucial elements for protein function, regulation and homeostasis. In this review, we provide a brief overview of how IDRs can perform similar functions to structured proteins, focusing especially on the formation of protein complexes and assemblies and the mediation of regulated conformational changes. In addition to highlighting instances of such functional equivalence, we explain how differences in the biological and physicochemical properties of IDRs allow them to expand the functional and regulatory repertoire of proteins. We also discuss studies that provide insights into how mutations within functional regions of IDRs can lead to human diseases. PMID:25752799

  4. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    PubMed

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes, including behavior-based profiles, will contribute to the transition from disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors is needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talk among pathways in various brain regions involved in disorders such as Alzheimer's disease. PMID:22933157

  5. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing. PMID:27217183
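
    The first step named in this abstract, quality control of raw reads, can be illustrated with a minimal sketch. It is not taken from the paper or from any particular tool: it assumes 4-line FASTQ records with Phred+33 quality encoding, and the function names and the mean-quality threshold of 20 are hypothetical choices for illustration.

```python
from statistics import mean


def parse_fastq(text):
    """Yield (name, sequence, quality) from FASTQ text made of 4-line records."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header[1:], seq, qual  # drop the leading '@' from the header


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (assumes Phred+33 encoding)."""
    return mean(ord(ch) - offset for ch in quality)


def filter_reads(text, min_mean_quality=20):
    """Keep reads whose mean Phred score reaches the (hypothetical) threshold."""
    return [
        (name, seq)
        for name, seq, qual in parse_fastq(text)
        if qual and mean_phred(qual) >= min_mean_quality
    ]


# Two toy reads: 'I' encodes Phred 40, '!' encodes Phred 0.
EXAMPLE = """@read1
ACGT
+
IIII
@read2
ACGT
+
!!!!
"""

if __name__ == "__main__":
    for name, seq in filter_reads(EXAMPLE):
        print(name, seq)
```

    Real pipelines usually also trim adapters and low-quality read ends; this sketch covers only whole-read filtering.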

  6. Bioinformatics on the Cloud Computing Platform Azure

    PubMed Central

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  8. Application of Bioinformatics in Chronobiology Research

    PubMed Central

    Lopes, Robson da Silva; Resende, Nathalia Maria; Honorio-França, Adenilda Cristina; França, Eduardo Luzía

    2013-01-01

    Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research. PMID:24187519

  9. Comparative Effectiveness of Contextual and Structural Method of Teaching Vocabulary

    ERIC Educational Resources Information Center

    Behlol, Malik; Kaini, Mohammad Munir

    2011-01-01

    The study was conducted to find out the effectiveness of contextual and structural methods of teaching vocabulary in English at secondary level. It was an experimental study in which the pretest posttest design was used. The population of the study was the students of secondary classes studying in Government secondary schools of Rawalpindi District.…

  10. The Structure of Women's Employment in Comparative Perspective

    ERIC Educational Resources Information Center

    Pettit, Becky; Hook, Jennifer Lynn

    2005-01-01

    In this paper we analyze social survey data from 19 countries using multi-level modeling methods in an effort to synthesize structural and institutional accounts for variation in women's employment. Observed demographic characteristics show much consistency in their relationship to women's employment across countries, yet there is significant…

  11. Comparative structural biology of eubacterial and archaeal oligosaccharyltransferases.

    PubMed

    Maita, Nobuo; Nyirenda, James; Igura, Mayumi; Kamishikiryo, Jun; Kohda, Daisuke

    2010-02-12

    Oligosaccharyltransferase (OST) catalyzes the transfer of an oligosaccharide from a lipid donor to an asparagine residue in nascent polypeptide chains. In the bacterium Campylobacter jejuni, a single-subunit membrane protein, PglB, catalyzes N-glycosylation. We report the 2.8 Å resolution crystal structure of the C-terminal globular domain of PglB and its comparison with the previously determined structure from the archaeon Pyrococcus AglB. The two distantly related oligosaccharyltransferases share unexpected structural similarity beyond that expected from the sequence comparison. The common architecture of the putative catalytic sites revealed a new catalytic motif in PglB. Site-directed mutagenesis analyses confirmed the contribution of this motif to the catalytic function. Bacterial PglB and archaeal AglB constitute a protein family of the catalytic subunit of OST along with STT3 from eukaryotes. A structure-aided multiple sequence alignment of the STT3/PglB/AglB protein family revealed three types of OST catalytic centers. This novel classification will provide a useful framework for understanding the enzymatic properties of the OST enzymes from Eukarya, Archaea, and Bacteria. PMID:20007322

  12. Comparative static curing versus dynamic curing on tablet coating structures.

    PubMed

    Gendre, Claire; Genty, Muriel; Fayard, Barbara; Tfayli, Ali; Boiret, Mathieu; Lecoq, Olivier; Baron, Michel; Chaminade, Pierre; Péan, Jean Manuel

    2013-09-10

    Curing is generally required to stabilize film coating from aqueous polymer dispersion. This post-coating drying step is traditionally carried out in static conditions, requiring the transfer of solid dosage forms to an oven. However, a curing operation performed directly inside the coating equipment is an attractive industrial application. Recently, the use of various advanced physico-chemical characterization techniques, i.e., X-ray micro-computed tomography, vibrational spectroscopies (near infrared and Raman) and X-ray microdiffraction, allowed new insights into the film-coating structures of dynamically cured tablets. Dynamic curing end-point was efficiently determined after 4h. The aim of the present work was to elucidate the influence of curing conditions on film-coating structures. Results demonstrated that 24h of static curing and 4h of dynamic curing, both performed at 60°C and ambient relative humidity, led to similar coating layers in terms of drug release properties, porosity, water content, structural rearrangement of polymer chains and crystalline distribution. Furthermore, X-ray microdiffraction measurements pointed out different crystalline coating compositions depending on sample storage time. An aging mechanism might have occurred during storage, resulting in the crystallization and the upward migration of cetyl alcohol, coupled to the downward migration of crystalline sodium lauryl sulfate within the coating layer. Interestingly, this new study clearly provided further insight into film-coating structures after a curing step and confirmed that the curing operation could be performed in dynamic conditions. PMID:23792043

  13. Structural and Social Psychological Correlates of Prisonization: A Comparative Analysis.

    ERIC Educational Resources Information Center

    Thomas, Charles W.; And Others

    This study considers some aspects of "prisonization," or the process by which inmates adapt to confinement. Specifically, it further examines two ideas suggested by earlier studies. One is the belief that the structural characteristics of many prisons promote rather than inhibit assimilation into an inmate normative system that is opposed to the…

  14. Bioinformatics for transporter pharmacogenomics and systems biology: data integration and modeling with UML.

    PubMed

    Yan, Qing

    2010-01-01

    Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter . The methodology developed here can also be used for other biomedical fields. PMID:20419428

  15. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology

    PubMed Central

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-01-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology. PMID:26753026

  17. Quantum Bio-Informatics IV

    NASA Astrophysics Data System (ADS)

    Accardi, Luigi; Freudenberg, Wolfgang; Ohya, Masanori

    2011-01-01

    Use of cryptographic ideas to interpret biological phenomena (and vice versa) / M. Regoli -- Discrete approximation to operators in white noise analysis / Si Si -- Bogoliubov type equations via infinite-dimensional equations for measures / V. V. Kozlov and O. G. Smolyanov -- Analysis of several categorical data using measure of proportional reduction in variation / K. Yamamoto ... [et al.] -- The electron reservoir hypothesis for two-dimensional electron systems / K. Yamada ... [et al.] -- On the correspondence between Newtonian and functional mechanics / E. V. Piskovskiy and I. V. Volovich -- Quantile-quantile plots: An approach for the inter-species comparison of promoter architecture in eukaryotes / K. Feldmeier ... [et al.] -- Entropy type complexities in quantum dynamical processes / N. Watanabe -- A fair sampling test for Ekert protocol / G. Adenier, A. Yu. Khrennikov and N. Watanabe -- Brownian dynamics simulation of macromolecule diffusion in a protocell / T. Ando and J. Skolnick -- Signaling network of environmental sensing and adaptation in plants: Key roles of calcium ion / K. Kuchitsu and T. Kurusu -- NetzCope: A tool for displaying and analyzing complex networks / M. J. Barber, L. Streit and O. Strogan -- Study of HIV-1 evolution by coding theory and entropic chaos degree / K. Sato -- The prediction of botulinum toxin structure based on in silico and in vitro analysis / T. Suzuki and S. Miyazaki -- On the mechanism of D-wave high Tc superconductivity by the interplay of Jahn-Teller physics and Mott physics / H. Ushio, S. Matsuno and H. Kamimura.

  18. Entropyology: the application of bioinformatics and data modeling to digital virus and malware recognition

    NASA Astrophysics Data System (ADS)

    Jaenisch, Holger M.; Handley, James W.

    2010-04-01

    Malware are analogs of viruses. Viruses are comprised of large numbers of polypeptide proteins. The shape and function of the protein strands determine the functionality of the segment, similar to a subroutine in malware. The full combination of subroutines is the malware organism, in the same way that a collection of polypeptides forms protein structures that are information bearing. We propose to apply the methods of Bioinformatics to analyze malware to provide a rich feature set for creating a unique and novel detection and classification scheme that was originally applied to Bioinformatics amino acid sequencing. Our proposed methods enable real time in situ (in contrast to in vivo) detection applications.

  19. Associations between Input and Outcome Variables in an Online High School Bioinformatics Instructional Program

    NASA Astrophysics Data System (ADS)

    Lownsbery, Douglas S.

    Quantitative data from a completed year of an innovative online high school bioinformatics instructional program were analyzed as part of a descriptive research study. The online instructional program provided the opportunity for high school students to develop content understandings of molecular genetics and to use sophisticated bioinformatics tools and methodologies to conduct authentic research. Quantitative data were analyzed to identify potential associations between independent program variables including implementation setting, gender, and student educational backgrounds and dependent variables indicating success in the program including completion rates for analyzing DNA clones and performance gains from pre-to-post assessments of bioinformatics knowledge. Study results indicate that understanding associations between student educational backgrounds and level of success may be useful for structuring collaborative learning groups and enhancing scaffolding and support during the program to promote higher levels of success for participating students.

  20. Parallel algorithm research on several important open problems in bioinformatics.

    PubMed

    Niu, Bei-Fang; Lang, Xian-Yu; Lu, Zhong-Hua; Chi, Xue-Bin

    2009-09-01

    High performance computing has opened the door to using bioinformatics and systems biology to explore complex relationships among data, and created the opportunity to tackle very large and involved simulations of biological systems. Many supercomputing centers have jumped on the bandwagon because the opportunities for significant impact in this field are infinite. Development of new algorithms, especially parallel algorithms and software to mine new biological information and to assess different relationships among the members of a large biological data set, is becoming very important. This article presents our work on the design and development of parallel algorithms and software to solve some important open problems arising from bioinformatics, such as structure alignment of RNA sequences, finding new genes, alternative splicing, gene expression clustering and so on. In order to make this parallel software available to a wide audience, grid computing service interfaces to the software have been deployed in China National Grid (CNGrid). Finally, conclusions and some future research directions are presented. PMID:20640837

  1. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  2. Evolving Strategies for the Incorporation of Bioinformatics within the Undergraduate Cell Biology Curriculum

    ERIC Educational Resources Information Center

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…

  3. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    ERIC Educational Resources Information Center

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  4. Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.

    1996-01-01

    Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Centre, a project was initiated to assess the performance of eight different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with Sequential Unconstrained Minimizations Technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.
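
    The general idea behind sequential unconstrained minimization, one of the approaches this abstract names, can be sketched on a toy problem. This is an illustration under stated assumptions, not the CometBoards code or any of the eight optimizers: it swaps in a quadratic exterior penalty (classic SUMT formulations often use an interior barrier instead), the problem "minimize x^2 subject to x >= 1" is invented for the example, and all function names are hypothetical.

```python
def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


def penalized(x, r):
    """Toy objective x^2 plus a quadratic penalty for violating x >= 1."""
    violation = max(0.0, 1.0 - x)
    return x * x + r * violation * violation


def sequential_penalty(r0=1.0, growth=10.0, rounds=6):
    """Solve a sequence of unconstrained problems with a growing penalty weight."""
    history = []
    r = r0
    for _ in range(rounds):
        x = golden_section_min(lambda t: penalized(t, r), -5.0, 5.0)
        history.append((r, x))
        r *= growth
    return history


if __name__ == "__main__":
    for r, x in sequential_penalty():
        print(f"penalty weight {r:g}: minimizer {x:.6f}")
```

    For this toy problem each unconstrained minimizer is r / (1 + r), so the iterates approach the constrained optimum x = 1 from the infeasible side as the penalty weight grows.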

  5. Bioinformatics-Driven New Immune Target Discovery in Disease.

    PubMed

    Yang, C; Chen, P; Zhang, W; Du, H

    2016-08-01

    Biomolecular network analysis has been widely applied in the discovery of cancer driver genes and molecular mechanism anatomization of many diseases on the genetic level. However, the application of such approach in the potential antigen discovery of autoimmune diseases remains largely unexplored. Here, we describe a previously uncharacterized region, with disease-associated autoantigens, to build antigen networks with three bioinformatics tools, namely NetworkAnalyst, GeneMANIA and ToppGene. First, we identified histone H2AX as an antigen of systemic lupus erythematosus by comparing highly ranked genes from all the built network-derived gene lists, and then a new potential biomarker for Behcet's disease, heat shock protein HSP 90-alpha (HSP90AA1), was further screened out. Moreover, 130 confirmed patients were enrolled and a corresponding enzyme-linked immunosorbent assay, mass spectrum analysis and immunoprecipitation were performed to further confirm the bioinformatics results with real-world clinical samples in succession. Our findings demonstrate that the combination of multiple molecular network approaches is a promising tool to discover new immune targets in diseases. PMID:27226232

  6. Shared bioinformatics databases within the Unipro UGENE platform.

    PubMed

    Protsyuk, Ivan V; Grekhov, German A; Tiunov, Alexey V; Fursov, Mikhail Y

    2015-01-01

    Unipro UGENE is an open-source bioinformatics toolkit that integrates popular tools along with original instruments for molecular biologists within a unified user interface. Nowadays, most bioinformatics desktop applications, including UGENE, make use of a local data model while processing different types of data. Such an approach is inconvenient for scientists working cooperatively and relying on the same data, because multiple copies of certain files must be made for every workplace and kept synchronized whenever they are modified. Therefore, we focused on bringing collaborative work into the UGENE user experience. Currently, several UGENE installations can be connected to a designated shared database and users can interact with it simultaneously. Such databases can be created by UGENE users and be used at their discretion. Objects of each data type supported by UGENE, such as sequences, annotations, multiple alignments, etc., can now be easily imported from or exported to a remote storage. One of the main advantages of this system, compared to existing ones, is the almost simultaneous access of client applications to shared data regardless of their volume. Moreover, the system is capable of storing millions of objects. The storage itself is a regular database server so even an inexpert user is able to deploy it. Thus, UGENE may provide access to shared data for users located, for example, in the same laboratory or institution. UGENE is available at: http://ugene.net/download.html. PMID:26527191

  7. InCoB2010 - 9th International Conference on Bioinformatics at Tokyo, Japan, September 26-28, 2010

    PubMed Central

    2010-01-01

    The International Conference on Bioinformatics (InCoB), the annual conference of the Asia-Pacific Bioinformatics Network (APBioNet), is hosted in one of the countries of the Asia-Pacific region. The 2010 conference was awarded to Japan and has attracted more than one hundred high-quality research paper submissions. Thorough peer reviewing resulted in 47 (43.5%) accepted papers out of 108 submissions. Submissions from Japan, R.O. Korea, P.R. China, Australia, Singapore and U.S.A totaled 43.8% and contributed to 57.4% of accepted papers. Manuscripts originating from Taiwan and India added up to 42.8% of submissions and 28.3% of acceptances. The fifteen articles published in this BMC Bioinformatics supplement cover disease informatics, structural bioinformatics and drug design, biological databases and software tools, signaling pathways, gene regulatory and biochemical networks, evolution and sequence analysis. PMID:21106116

  8. Comparative structural and optical properties of different ceria nanoparticles.

    PubMed

    Nikolic, A S; Boskovic, M; Fabian, M; Bozanic, D K; Vucinic-Vasic, M; Kremenovic, A; Antic, B

    2013-10-01

    Herein a comparative study of five nanocrystalline cerium oxides (CeO(2-delta)) synthesised by different methods and calcined at 500 degrees C is reported. XRPD analysis showed that the stoichiometry parameter delta, crystallite size/strain and lattice constant were only slightly affected by the method utilized. All ceria nanoparticles are nearly spherical in shape with faceted morphology, free of defects and with a relatively uniform size distribution. The average microstrain was found to be approximately 10 times higher than that of the bulk counterpart. The absorption edge of the nanocrystalline materials was shifted towards higher wavelengths (red shift) in comparison with the bulk counterpart, and band gap values were in the range 2.7-3.24 eV (3.33 eV for the bulk counterpart). PMID:24245144
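    The reported red shift is consistent with the photon-energy relation, wavelength = hc / E: a smaller band gap corresponds to a longer absorption-edge wavelength. The following is a minimal Python sketch of that conversion, assuming the rounded constant hc ≈ 1239.84 eV·nm; the function name is illustrative and does not come from the paper.

```python
# Rounded value of Planck's constant times the speed of light, in eV*nm (assumption: 4 significant decimals suffice).
HC_EV_NM = 1239.84


def edge_wavelength_nm(band_gap_ev: float) -> float:
    """Convert a band gap in eV to the corresponding absorption-edge wavelength in nm."""
    if band_gap_ev <= 0:
        raise ValueError("band gap must be positive")
    return HC_EV_NM / band_gap_ev


if __name__ == "__main__":
    # Band gap values quoted in the abstract: bulk ceria and the nanocrystalline range.
    for label, gap in [("bulk", 3.33), ("nano, upper", 3.24), ("nano, lower", 2.7)]:
        print(f"{label}: {gap} eV -> {edge_wavelength_nm(gap):.1f} nm")
```

    Under this relation, the bulk value of 3.33 eV corresponds to roughly 372 nm, while the nanocrystalline range of 2.7-3.24 eV corresponds to roughly 459-383 nm, that is, edges at longer wavelengths than the bulk.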

  9. Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

    PubMed

    Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

    2016-01-01

    One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. PMID:27238013

  10. Is racism dead? Comparing (expressive) means and (structural equation) models.

    PubMed

    Leach, C W; Peng, T R; Volckens, J

    2000-09-01

    Much scholarship suggests that racism--belief in out-group inferiority--is unrelated to contemporary attitudes. Purportedly, a new form of racism, one which relies upon a belief in cultural difference, has become a more acceptable basis for such attitudes. The authors argue that an appropriate empirical assessment of racism (both 'old' and 'new') depends upon (1) clear conceptualization and operationalization, and (2) attention to both mean-level expression and explanatory value in structural equation models. This study assessed the endorsement of racism and belief in cultural difference as well as their association with a measure of general attitude in a secondary analysis of parallel representative surveys of attitudes toward different ethnic out-groups in France, The Netherlands, Western Germany and Britain (N = 3242; see Reif & Melich, 1991). For six of the seven out-group targets, racism was strongly related to ethnic majority attitudes, despite low mean-level endorsement. In a pattern consistent with a 'new', indirect racism, the relationship between British racism and attitudes toward Afro-Caribbeans was mediated by belief in cultural difference. PMID:11041013

  11. Translational Bioinformatics Approaches to Drug Development

    PubMed Central

    Readhead, Ben; Dudley, Joel

    2013-01-01

    Significance: A majority of therapeutic interventions occur late in the pathological process, when treatment outcome can be less predictable and effective, highlighting the need for new precise and preventive therapeutic development strategies that consider genomic and environmental context. Translational bioinformatics is well positioned to contribute to the many challenges inherent in bridging this gap between our current reactive methods of healthcare delivery and the intent of precision medicine, particularly in the areas of drug development, which forms the focus of this review. Recent Advances: A variety of powerful informatics methods for organizing and leveraging the vast wealth of molecular measurements available for a broad range of disease contexts have recently emerged. These include methods for data driven disease classification, drug repositioning, identification of disease biomarkers, and the creation of disease network models, each with significant impacts on drug development approaches. Critical Issues: An important bottleneck in the application of bioinformatics methods in translational research is the lack of investigators who are versed in both biomedical domains and informatics. Efforts to nurture both sets of competencies within individuals and to increase interfield visibility will help to accelerate the adoption and increased application of bioinformatics in translational research. Future Directions: It is possible to construct predictive, multiscale network models of disease by integrating genotype, gene expression, clinical traits, and other multiscale measures using causal network inference methods. This can enable the identification of the “key drivers” of pathology, which may represent novel therapeutic targets or biomarker candidates that play a more direct role in the etiology of disease. PMID:24527359

  12. Comparative connective tissue structure-function relationships in biologic pumps.

    PubMed

    Factor, S M; Robinson, T F

    1988-02-01

    A complex connective tissue framework exists in mammalian hearts that surrounds and interconnects individual myocytes and fascicles of cells. Recent evidence suggests that this connective tissue plays a role in maintaining shape, modulating contractile forces, and mediating elastic recoil during cavity filling and contraction. In order to analyze the involvement of connective tissue in pump contraction and recoil, we examined silver impregnated connective tissue in rat hearts which spontaneously jet through fluid ex vivo by contracting their cavities forcefully and then sucking fluid for the next cycle, and compared them to frog hearts which beat actively under the same conditions, but do not demonstrate jet propulsion. A further analysis was carried out in unrelated but analogous models: the squid and octopus. The former jets rapidly through the ocean, while the latter moves sinuously along the seabed. We observed highly interconnected myocytes in the rat heart, whereas frog myocytes are individually wrapped by connective tissue but are not interconnected. The squid mantle muscle is surrounded by a complex connective tissue grid that is tethered to each muscle cell, whereas the octopus mantle muscle cells are surrounded by connective tissue but are not tethered. These observations suggest that myocyte connective tissue tethering may be necessary for muscle cavities to generate forceful and coordinated contractions sufficient for rapid ejection and suction of fluid. PMID:2448546

  13. Hospital profitability and capital structure: a comparative analysis.

    PubMed Central

    Valvona, J; Sloan, F A

    1988-01-01

    This article compares the financial performance of hospitals by ownership type and of five publicly traded hospital companies with other industries, using such indicators as profit margins, return on equity (ROE) and total capitalization, and debt-to-equity ratios. We also examine stock returns to investors for the five hospital companies versus other industries, as well as the relative roles of debt and equity in new financing. Investor-owned hospitals had substantially greater margins and ROE than did other hospital types. In 1982, investor-owned chain hospitals had a ROE of 26 percent, 18 points above the average for all hospitals. Stock returns on the five selected hospital companies were more than twice as large as returns on other industries between 1972 and 1983. However, after 1983, returns for these companies fell dramatically in absolute terms and relative to other industries. We also found investor-owned hospitals to be much more highly levered than their government and voluntary counterparts, and more highly levered than other industries as well. PMID:3403274

  14. Microbial bioinformatics for food safety and production

    PubMed Central

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel

    2016-01-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput ‘omics’ technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety. PMID:26082168

  15. Critical Issues in Bioinformatics and Computing

    PubMed Central

    Kesh, Someswa; Raghupathi, Wullianallur

    2004-01-01

    This article provides an overview of the field of bioinformatics and its implications for the various participants. Next-generation issues facing developers (programmers), users (molecular biologists), and the general public (patients) who would benefit from the potential applications are identified. The goal is to create awareness and debate on the opportunities (such as career paths) and the challenges such as privacy that arise. A triad model of the participants' roles and responsibilities is presented along with the identification of the challenges and possible solutions. PMID:18066389

  16. Translational Bioinformatics: Past, Present, and Future

    PubMed Central

    Tenenbaum, Jessica D.

    2016-01-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field. PMID:26876718

  17. Mobyle: a new full web bioinformatics framework

    PubMed Central

    Néron, Bertrand; Ménager, Hervé; Maufrais, Corinne; Joly, Nicolas; Maupetit, Julien; Letort, Sébastien; Carrere, Sébastien; Tuffery, Pierre; Letondal, Catherine

    2009-01-01

    Motivation: For the biologist, running bioinformatics analyses involves a time-consuming management of data and tools. Users need support to organize their work, retrieve parameters and reproduce their analyses. They also need to be able to combine their analytic tools using a safe data flow software mechanism. Finally, given that scientific tools can be difficult to install, it is particularly helpful for biologists to be able to use these tools through a web user interface. However, providing a web interface for a set of tools raises the problem that a single web portal cannot offer all the existing and possible services: it is the user, again, who has to cope with data copy among a number of different services. A framework enabling portal administrators to build a network of cooperating services would therefore clearly be beneficial. Results: We have designed a system, Mobyle, to provide a flexible and usable Web environment for defining and running bioinformatics analyses. It embeds simple yet powerful data management features that allow the user to reproduce analyses and to combine tools using a hierarchical typing system. Mobyle offers invocation of services distributed over remote Mobyle servers, thus enabling a federated network of curated bioinformatics portals without the user having to learn complex concepts or to install sophisticated software. While being focused on the end user, the Mobyle system also addresses the need, for the bioinformatician, to automate remote services execution: PlayMOBY is a companion tool that automates the publication of BioMOBY web services, using Mobyle program definitions. Availability: The Mobyle system is distributed under the terms of the GNU GPLv2 on the project web site (http://bioweb2.pasteur.fr/projects/mobyle/). It is already deployed on three servers: http://mobyle.pasteur.fr, http://mobyle.rpbs.univ-paris-diderot.fr and http://lipm-bioinfo.toulouse.inra.fr/Mobyle. The PlayMOBY companion is distributed under the

  18. Bioinformatics in proteomics: application, terminology, and pitfalls.

    PubMed

    Wiemer, Jan C; Prokudin, Alexander

    2004-01-01

    Bioinformatics applies data mining, i.e., modern computer-based statistics, to biomedical data. It leverages machine learning approaches, such as artificial neural networks, decision trees and clustering algorithms, and is ideally suited for handling very large amounts of data. In this article, we review the analysis of mass spectrometry data in proteomics, starting with common pre-processing steps and using single decision trees and decision tree ensembles for classification. Special emphasis is put on the pitfall of overfitting, i.e., of generating overly complex single decision trees. Finally, we discuss the pros and cons of the two different decision tree usages. PMID:15237926

  19. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    NASA Technical Reports Server (NTRS)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, using a new sigmoid-logarithmic transfer function and based on the error backpropagation (EBP) algorithm, is invented. Our results show that the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  20. The potential of translational bioinformatics approaches for pharmacology research.

    PubMed

    Li, Lang

    2015-10-01

    The field of bioinformatics has allowed the interpretation of massive amounts of biological data, ushering in the era of 'omics' to biomedical research. Its potential impact on pharmacology research is enormous and it has shown some emerging successes. A full realization of this potential, however, requires standardized data annotation for large health record databases and molecular data resources. Improved standardization will further stimulate the development of system pharmacology models, using translational bioinformatics methods. This new translational bioinformatics paradigm is highly complementary to current pharmacological research fields, such as personalized medicine, pharmacoepidemiology and drug discovery. In this review, I illustrate the application of translational bioinformatics to research in numerous pharmacology subdisciplines. PMID:25753093

  1. OpenHelix: bioinformatics education outside of a different box.

    PubMed

    Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C

    2010-11-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181

  2. Translational Bioinformatics: Linking the Molecular World to the Clinical World

    PubMed Central

    Altman, RB

    2014-01-01

    Translational bioinformatics represents the union of translational medicine and bioinformatics. Translational medicine moves basic biological discoveries from the research bench into the patient-care setting and uses clinical observations to inform basic biology. It focuses on patient care, including the creation of new diagnostics, prognostics, prevention strategies, and therapies based on biological discoveries. Bioinformatics involves algorithms to represent, store, and analyze basic biological data, including DNA sequence, RNA expression, and protein and small-molecule abundance within cells. Translational bioinformatics spans these two fields; it involves the development of algorithms to analyze basic molecular and cellular data with an explicit goal of affecting clinical care. PMID:22549287

  3. OpenHelix: bioinformatics education outside of a different box

    PubMed Central

    Mangan, Mary E.; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C.

    2010-01-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181

  4. Tools and collaborative environments for bioinformatics research

    PubMed Central

    Giugno, Rosalba; Pulvirenti, Alfredo

    2011-01-01

    Advanced research requires intensive interaction among a multitude of actors, often possessing different expertise and usually working at a distance from each other. The field of collaborative research aims to establish suitable models and technologies to properly support these interactions. In this article, we first explain why Bioinformatics has an interest in this context and suggest some research domains that could benefit from collaborative research. We then review the principles and some of the most relevant applications of social networking, with special attention to networks supporting scientific collaboration, and highlight some critical issues, such as identification of users and standardization of formats. We then introduce some systems for collaborative document creation, including wiki systems and tools for ontology development, and review some of the most interesting biological wikis. We also review the principles of Collaborative Development Environments for software and show some examples in Bioinformatics. Finally, we present the principles and some examples of Learning Management Systems. In conclusion, we try to devise some of the goals to be achieved in the short term for the exploitation of these technologies. PMID:21984743

  5. ExPASy: SIB bioinformatics resource portal

    PubMed Central

    Artimo, Panu; Jonnalagedda, Manohar; Arnold, Konstantin; Baratin, Delphine; Csardi, Gabor; de Castro, Edouard; Duvaud, Séverine; Flegel, Volker; Fortier, Arnaud; Gasteiger, Elisabeth; Grosdidier, Aurélien; Hernandez, Céline; Ioannidis, Vassilios; Kuznetsov, Dmitry; Liechti, Robin; Moretti, Sébastien; Mostaguir, Khaled; Redaschi, Nicole; Rossier, Grégoire; Xenarios, Ioannis; Stockinger, Heinz

    2012-01-01

    ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth access seamlessly a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a ‘decentralized’ way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across ‘selected’ resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy. PMID:22661580

  6. Data Compression Concepts and Algorithms and their Applications to Bioinformatics

    PubMed Central

    Nalbantoğlu, Ö. U.; Russell, D.J.; Sayood, K.

    2009-01-01

    Data compression at its base is concerned with how information is organized in data. Understanding this organization can lead to efficient ways of representing the information and hence data compression. In this paper we review the ways in which ideas and approaches fundamental to the theory and practice of data compression have been used in the area of bioinformatics. We look at how basic theoretical ideas from data compression, such as the notions of entropy, mutual information, and complexity have been used for analyzing biological sequences in order to discover hidden patterns, infer phylogenetic relationships between organisms and study viral populations. Finally, we look at how inferred grammars for biological sequences have been used to uncover structure in biological sequences. PMID:20157640
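    The entropy and mutual information notions mentioned above can be illustrated on short sequences. The following is a minimal Python sketch, assuming simple empirical symbol frequencies and two aligned sequences of equal length; the function names are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(symbols) -> float:
    """Empirical Shannon entropy, in bits per symbol, of an iterable of hashable symbols."""
    items = list(symbols)
    if not items:
        return 0.0
    total = len(items)
    counts = Counter(items)
    # Sum of p * log2(1/p) over the observed symbols.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def mutual_information(xs: str, ys: str) -> float:
    """Empirical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y) of two aligned sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return entropy(xs) + entropy(ys) - entropy(zip(xs, ys))


if __name__ == "__main__":
    print(entropy("ACGT"))                     # four equally frequent bases
    print(entropy("AAAA"))                     # a single repeated base
    print(mutual_information("AACC", "ACAC"))  # columns vary independently
```

    A uniform four-letter sequence reaches the maximum of 2 bits per base, while a homopolymer carries none. Real analyses would typically use longer contexts (k-mers) and corrections for small samples, which this sketch omits.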

  7. Serial analysis of gene expression (SAGE): unraveling the bioinformatics tools.

    PubMed

    Tuteja, Renu; Tuteja, Narendra

    2004-08-01

    Serial analysis of gene expression (SAGE) is a powerful technique that can be used for global analysis of gene expression. Its chief advantage over other methods is that it does not require prior knowledge of the genes of interest and provides qualitative and quantitative data of potentially every transcribed sequence in a particular cell or tissue type. This is a technique of expression profiling, which permits simultaneous, comparative and quantitative analysis of gene-specific, 9- to 13-basepair sequences. These short sequences, called SAGE tags, are linked together for efficient sequencing. The sequencing data are then analyzed to identify each gene expressed in the cell and the levels at which each gene is expressed. The main benefit of SAGE includes the digital output and the identification of novel genes. In this review, we present an outline of the method, various bioinformatics methods for data analysis and general applications of this important technology. PMID:15273993
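    The "digital output" of SAGE comes from counting how often each tag is observed. The following is a minimal Python sketch of that counting step, assuming the tags have already been extracted from the sequenced concatemers; the tag sequences and function names are illustrative placeholders, not data or tools from the review.

```python
from collections import Counter


def count_tags(tags):
    """Count occurrences of each tag, ignoring letter case."""
    return Counter(tag.upper() for tag in tags)


def tags_per_million(counts):
    """Scale raw tag counts to tags per million so libraries of different size can be compared."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


if __name__ == "__main__":
    # Placeholder 10-bp tags; a real library contains many thousands of tags.
    observed = ["AAACCCGGGT", "aaacccgggt", "TTTGGGCCCA", "AAACCCGGGT"]
    counts = count_tags(observed)
    for tag, value in sorted(tags_per_million(counts).items()):
        print(tag, counts[tag], value)
```

    Mapping each tag back to a gene requires a tag-to-gene reference, which is outside the scope of this sketch.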

  8. From Jobs to Work: Scheduling the Right Bioinformatics Tools

    PubMed Central

    Ries, James E.; Patrick, Timothy B.; Springer, Gordon K.

    2002-01-01

    A great deal of effort has been expended toward scheduling computationally intensive jobs on Grids [1,2] and other collections of high performance computing resources. Bioinformatics computer jobs are of particular interest as they are often highly computationally intensive. However, the problem has not been addressed from the viewpoint of the overall work that should be done. Here, we make a distinction between jobs and work. Jobs are specifically bound computational tasks (e.g., a request to run NCBI's BLAST tool or the GCG FASTA program) versus work requests, which are more general (e.g., a request to compare a set of sequences for similarity). We contend that biology researchers often wish to accomplish work rather than run a particular job. With this idea in mind, it is possible to improve resource usage by mapping work to jobs with the goal of choosing appropriate jobs that can best be scheduled at a given time.
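    The distinction between work and jobs can be sketched as a lookup followed by a scheduling choice. The following Python sketch is hypothetical throughout: the catalogue, the resource names and the "shortest queue" rule are assumptions made for illustration and are not the authors' scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    tool: str      # concrete program bound to the job, e.g. BLAST or FASTA
    resource: str  # hypothetical name of the resource that hosts the tool


# Hypothetical catalogue: one general work request maps to several concrete jobs.
CATALOGUE = {
    "sequence_similarity": [Job("blast", "cluster-a"), Job("fasta", "cluster-b")],
}


def choose_job(work: str, queue_lengths: dict) -> Job:
    """Pick the candidate job whose resource currently has the shortest queue."""
    candidates = CATALOGUE.get(work)
    if not candidates:
        raise KeyError(f"no jobs known for work request: {work}")
    # Resources with unknown load are treated as unavailable (infinite queue).
    return min(candidates, key=lambda job: queue_lengths.get(job.resource, float("inf")))


if __name__ == "__main__":
    job = choose_job("sequence_similarity", {"cluster-a": 5, "cluster-b": 1})
    print(job.tool, "on", job.resource)
```

    The researcher asks only for the work ("compare these sequences"); which concrete tool runs is decided at scheduling time from the current load.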

  9. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN

    PubMed Central

    2010-01-01

    Background: Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. Results: VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Conclusions: Bioinformatics curation and ontological

  10. Assessment of a Bioinformatics across Life Science Curricula Initiative

    ERIC Educational Resources Information Center

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  11. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    ERIC Educational Resources Information Center

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  12. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    PubMed

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  13. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    ERIC Educational Resources Information Center

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  14. Bioinformatics Resources for MicroRNA Discovery

    PubMed Central

    Moore, Alyssa C.; Winkjer, Jonathan S.; Tseng, Tsai-Tien

    2015-01-01

    Biomarker identification is often associated with the diagnosis and evaluation of various diseases. Recently, the role of microRNA (miRNA) has been implicated in the development of diseases, particularly cancer. With the advent of next-generation sequencing, the amount of data on miRNA has increased tremendously in the last decade, requiring new bioinformatics approaches for processing and storing new information. New strategies have been developed in mining these sequencing datasets to allow better understanding toward the actions of miRNAs. As a result, many databases have also been established to disseminate these findings. This review focuses on several curated databases of miRNAs and their targets from both predicted and validated sources. PMID:26819547

  15. Survey: Translational Bioinformatics embraces Big Data

    PubMed Central

    Shah, Nigam H.

    2015-01-01

    Summary: We review the latest trends and major developments in translational bioinformatics in the year 2011–2012. Our emphasis is on highlighting the key events in the field and pointing at promising research areas for the future. The key take-home points are: (1) Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals. (2) Data-centric approaches that compute on massive amounts of data (often called “Big Data”) to discover patterns and to make clinically relevant predictions will gain adoption. (3) Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing; and is where new breakthroughs will occur. PMID:22890354

  16. Bioinformatics Analysis of Estrogen-Responsive Genes.

    PubMed

    Handel, Adam E

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  17. Biology and bioinformatics of myeloma cell.

    PubMed

    Abroun, Saeid; Saki, Najmaldin; Fakher, Rahim; Asghari, Farahnaz

    2012-12-01

    Multiple myeloma (MM) is a plasma cell disorder that occurs in about 10% of all hematologic cancers. The majority of patients (99%) are over 50 years of age when diagnosed. In the bone marrow (BM), stromal and hematopoietic stem cells (HSCs) are responsible for the production of blood cells. Therefore any destruction of, and/or changes within, the BM undesirably impact a wide range of hematopoiesis, causing diseases and influencing patient survival. In order to establish an effective therapeutic strategy, recognition of the biology and evaluation of bioinformatics models for myeloma cells are necessary to assist in determining suitable methods to cure or prevent disease complications in patients. This review presents the evaluation of molecular and cellular aspects of MM such as genetic translocation, genetic analysis, cell surface marker, transcription factors, and chemokine signaling pathways. It also briefly reviews some of the mechanisms involved in MM in order to develop a better understanding for use in future studies. PMID:23253865

  18. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    PubMed Central

    2011-01-01

    Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive
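
    The batching trade-off described in the PaPy abstract above can be illustrated with a short sketch. It does not use PaPy's API: the names `batched`, `run_pipeline`, `reverse_complement` and `gc_content` are hypothetical, and only the Python standard library is assumed. Each batch is pulled lazily from the input and mapped over a small worker pool, so a larger `batch_size` gives more parallelism at the cost of holding more items in memory.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def reverse_complement(seq):
    """First illustrative stage: reverse-complement a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def gc_content(seq):
    """Second illustrative stage: fraction of G and C bases."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_pipeline(items, stages, batch_size=2, workers=2):
    """Lazily push items through all stages, one batch at a time."""
    def apply_all(item):
        for stage in stages:
            item = stage(item)
        return item

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(items, batch_size):
            # pool.map returns results in input order
            yield from pool.map(apply_all, batch)


results = list(run_pipeline(["AACG", "GGCC", "ATAT"],
                            [reverse_complement, gc_content]))
```

    Here the pipeline is a simple linear chain; a directed acyclic graph of components, as in the abstract, would need an explicit graph structure that this sketch leaves out.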

  19. Comparing the three-dimensional structures of Dicistroviridae IGR IRES RNAs with other viral RNA structures.

    PubMed

    Kieft, Jeffrey S

    2009-02-01

    The intergenic region (IGR) internal ribosome entry site (IRES) RNAs do not require any of the canonical translation initiation factors to recruit the ribosome to the viral RNA, they eliminate the need for initiator tRNA, and they begin translation from the A-site. The function of these IRESs depends on a specific three-dimensional folded RNA structure. Thus, a complete understanding of the mechanisms of action of these IRESs requires that we understand their structure in detail. Recently, the structures of both domains of the IGR IRES RNAs were solved by X-ray crystallography, providing the first glimpse into an entire IRES RNA structure. Here, I present an analysis of these structures, emphasizing how the structures explain many aspects of IGR IRES function, discussing how these structures have similarities to motifs found in other viral RNAs, and illustrating how these structures give rise to new mechanistic hypotheses. PMID:18672012

  20. Bioinformatics approaches to single-cell analysis in developmental biology.

    PubMed

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. PMID:26358759

  1. 4273π: Bioinformatics education on low cost ARM hardware

    PubMed Central

    2013-01-01

    Background Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. Results We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012–2013. Conclusions 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost. PMID:23937194

  2. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers

    PubMed Central

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2016-01-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025

  3. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    PubMed

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025

  4. Structure of haptoglobin heavy chain and other serine protease homologs by comparative model building

    SciTech Connect

    Greer, J.

    1980-10-01

    Proteins often occur in families whose structure is closely similar, even though the proteins may come from widely different sources and have quite distinct functions. It would be useful to be able to construct the three-dimensional structure of these proteins from the known structure of one or more of them without having to solve the structure of each protein ab initio. We have been using comparative model building to derive the structure of an unusual protein of the trypsin-like serine protease family. We have recently extended this comparison to include other serine protease homologs for which a primary structure is available. To generate structures for the different members of the serine protease family, it is necessary to extract the common structural features of the molecule. Fortunately, three independently determined protein structures are available: chymotrypsin, trypsin, and elastase. These three structures were compared in detail and the structurally conserved regions in all three, mainly the β-sheet and the α-helix, were identified. The variable portions occur in the loops on the surface of the molecule. By using these structures, the primary sequences of these three proteins were aligned. From this alignment, it is clear that sequence homology between the proteins occurs mainly in the structurally conserved regions of the molecule, while the variable portions show very little sequence homology.

  5. Online Tools for Bioinformatics Analyses in Nutrition Sciences

    PubMed Central

    Malkaram, Sridhar A.; Hassan, Yousef I.; Zempleni, Janos

    2012-01-01

    Recent advances in “omics” research have resulted in the creation of large datasets that were generated by consortiums and centers, small datasets that were generated by individual investigators, and bioinformatics tools for mining these datasets. It is important for nutrition laboratories to take full advantage of the analysis tools to interrogate datasets for information relevant to genomics, epigenomics, transcriptomics, proteomics, and metabolomics. This review provides guidance regarding bioinformatics resources that are currently available in the public domain, with the intent to provide a starting point for investigators who want to take advantage of the opportunities provided by the bioinformatics field. PMID:22983844

  6. Survey of MapReduce frame operation in bioinformatics.

    PubMed

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756
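
    The MapReduce pattern mentioned in the abstract above can be sketched in a few lines. This is a single-process illustration only, not Hadoop code: the function names (`map_kmers`, `shuffle`, `reduce_counts`) and the k-mer counting task are assumptions chosen for the example. In a real cluster the map and reduce steps would run on different nodes and the shuffle would be handled by the framework.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


reads = ["ACGTAC", "GTACGT"]
mapped = chain.from_iterable(map_kmers(read) for read in reads)
counts = reduce_counts(shuffle(mapped))
```

    Because each read is mapped independently and each key is reduced independently, both steps can be distributed without changing the result.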

  7. Bioinformatics construction of the human cell surfaceome

    PubMed Central

    da Cunha, J. P. C.; Galante, P. A. F.; de Souza, J. E.; de Souza, R. F.; Carvalho, P. M.; Ohara, D. T.; Moura, R. P.; Oba-Shinja, S. M.; Marie, S. K. N.; Silva, W. A.; Perez, R. O.; Stransky, B.; Pieprzyk, M.; Moore, J.; Caballero, O.; Gama-Rodrigues, J.; Habr-Gama, A.; Kuo, W. P.; Simpson, A. J.; Camargo, A. A.; Old, Lloyd J.; de Souza, S. J.

    2009-01-01

    Cell surface proteins are excellent targets for diagnostic and therapeutic interventions. By using bioinformatics tools, we generated a catalog of 3,702 transmembrane proteins located at the surface of human cells (human cell surfaceome). We explored the genetic diversity of the human cell surfaceome at different levels, including the distribution of polymorphisms, conservation among eukaryotic species, and patterns of gene expression. By integrating expression information from a variety of sources, we were able to identify surfaceome genes with a restricted expression in normal tissues and/or differential expression in tumors, important characteristics for putative tumor targets. A high-throughput and efficient quantitative real-time PCR approach was used to validate 593 surfaceome genes selected on the basis of their expression pattern in normal and tumor samples. A number of candidates were identified as potential diagnostic and therapeutic targets for colorectal tumors and glioblastoma. Several candidate genes were also identified as coding for cell surface cancer/testis antigens. The human cell surfaceome will serve as a reference for further studies aimed at characterizing tumor targets at the surface of human cells. PMID:19805368

  8. Bioinformatic tools for microRNA dissection

    PubMed Central

    Akhtar, Most Mauluda; Micolucci, Luigina; Islam, Md Soriful; Olivieri, Fabiola; Procopio, Antonio Domenico

    2016-01-01

    Recently, microRNAs (miRNAs) have emerged as important elements of gene regulatory networks. MiRNAs are endogenous single-stranded non-coding RNAs (∼22-nt long) that regulate gene expression at the post-transcriptional level. Through pairing with mRNA, miRNAs can down-regulate gene expression by inhibiting translation or stimulating mRNA degradation. In some cases they can also up-regulate the expression of a target gene. MiRNAs influence a variety of cellular pathways that range from development to carcinogenesis. The involvement of miRNAs in several human diseases, particularly cancer, makes them potential diagnostic and prognostic biomarkers. Recent technological advances, especially high-throughput sequencing, have led to an exponential growth in the generation of miRNA-related data. A number of bioinformatic tools and databases have been devised to manage this growing body of data. We analyze 129 miRNA tools that are being used in diverse areas of miRNA research, to assist investigators in choosing the most appropriate tools for their needs. PMID:26578605

  9. Parallel evolutionary computation in bioinformatics applications.

    PubMed

    Pinho, Jorge; Sobral, João Luis; Rocha, Miguel

    2013-05-01

    A large number of optimization problems within the field of Bioinformatics require methods able to handle their inherent complexity (e.g. NP-hard problems) and also demand increased computational efforts. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of their efficient execution on a wide range of parallel architectures. The proposed approach focuses on ease of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism-related modules allows users to easily configure their environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. PMID:23127284

  10. Bacterial bioinformatics: pathogenesis and the genome.

    PubMed

    Paine, Kelly; Flower, Darren R

    2002-07-01

    As the number of completed microbial genome sequences continues to grow, there is a pressing need for the exploitation of this wealth of data through a synergistic interaction between the well-established science of bacteriology and the emergent discipline of bioinformatics. Antibiotic resistance and pathogenicity in virulent bacteria has become an increasing problem, with even the strongest drugs useless against some species, such as multi-drug resistant Enterococcus faecium and Mycobacterium tuberculosis. The global spread of Human Immunodeficiency Virus (HIV) and Acquired Immune Deficiency Syndrome (AIDS) has contributed to the re-emergence of tuberculosis and the threat from new and emergent diseases. To address these problems, bacterial pathogenicity requires redefinition as Koch's postulates become obsolete. This review discusses how the use of bacterial genomic information, and the in silico tools available at present, may aid in determining the definition of a current pathogen. The combination of both fields should provide a rapid and efficient way of assisting in the future development of antimicrobial therapies. PMID:12125816

  11. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promoter region. In this work we present the advantages and disadvantages of applying different bioinformatics programs to locate transcription factor binding sites in the upstream region of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promoter regions. The results have been compared and the possible function of chosen motifs has been described.

  12. Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization

    PubMed Central

    2011-01-01

    Background Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. Results The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Conclusion Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net. PMID:21846353

  13. Multilevel Structural Equation Models for the Analysis of Comparative Data on Educational Performance

    ERIC Educational Resources Information Center

    Goldstein, Harvey; Bonnet, Gerard; Rocher, Thierry

    2007-01-01

    The Programme for International Student Assessment comparative study of reading performance among 15-year-olds is reanalyzed using statistical procedures that allow the full complexity of the data structures to be explored. The article extends existing multilevel factor analysis and structural equation models and shows how this can extract richer…

  14. Creating Bioinformatic Workflows within the BioExtract Server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows generally require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community...

  15. Bioinformatics opportunities for identification and study of medicinal plants

    PubMed Central

    Sharma, Vivekanand

    2013-01-01

    Plants have been used as a source of medicine since historic times and several commercially important drugs are of plant-based origin. The traditional approach towards discovery of plant-based drugs often times involves significant amount of time and expenditure. These labor-intensive approaches have struggled to keep pace with the rapid development of high-throughput technologies. In the era of high volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role. This has generally been the case in the context of drug designing and discovery. However, there has been limited attention to date to the potential application of bioinformatics approaches that can leverage plant-based knowledge. Here, we review bioinformatics studies that have contributed to medicinal plants research. In particular, we highlight areas in medicinal plant research where the application of bioinformatics methodologies may result in quicker and potentially cost-effective leads toward finding plant-based remedies. PMID:22589384

  16. [An overview of feature selection algorithm in bioinformatics].

    PubMed

    Li, Xin; Ma, Li; Wang, Jinjia; Zhao, Chun

    2011-04-01

    Feature selection (FS) techniques have become an important tool in the bioinformatics field. Their core task is to select a low-dimensional set of significant features hidden in a high-dimensional data space, and thus to reveal the underlying structure of the data. Bioinformatics data typically combine high dimensionality with small sample sizes, so research on FS algorithms in this field has great promise. In this article, we make the interested reader aware of the possibilities of feature selection, describe the basic properties of feature selection techniques, and discuss their uses in sequence analysis, microarray analysis, mass spectra analysis, etc. Finally, the current problems and prospects of feature selection algorithms in bioinformatics applications are also discussed. PMID:21604512
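
    As a rough illustration of the kind of technique the abstract above surveys, the sketch below ranks features with a simple filter score. It is not taken from the article: the function name `rank_features`, the toy data, and the scoring rule (difference of class means divided by the summed class standard deviations) are assumptions made for the example, and only two classes labelled 0 and 1 are handled.

```python
from statistics import mean, pstdev


def rank_features(samples, labels, top=1):
    """Return indices of the `top` features that best separate two classes."""
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        pos = [s[j] for s, y in zip(samples, labels) if y == 1]
        neg = [s[j] for s, y in zip(samples, labels) if y == 0]
        # Small constant avoids division by zero for constant features.
        spread = pstdev(pos) + pstdev(neg) + 1e-9
        scores.append((abs(mean(pos) - mean(neg)) / spread, j))
    # Highest score first; keep only the feature indices.
    return [j for _, j in sorted(scores, reverse=True)[:top]]


# Four samples, three features; only the middle feature tracks the label.
samples = [[5.0, 1.0, 0.2], [5.1, 0.9, 0.8], [5.0, 3.0, 0.5], [4.9, 3.2, 0.4]]
labels = [0, 0, 1, 1]
selected = rank_features(samples, labels, top=1)
```

    Filter scores like this one evaluate each feature on its own, which is cheap for high-dimensional data but ignores interactions between features.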

  17. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    PubMed Central

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  18. Whale song analyses using bioinformatics sequence analysis approaches

    NASA Astrophysics Data System (ADS)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools (DNA/protein sequence alignment and alignment-free methods) are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, master's thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure, the standardized Euclidean distance, and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree qualitatively with earlier classifications derived by human observation. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission by quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.]
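
    Two of the metrics named in the abstract above can be sketched as follows, treating each theme as a string of unit symbols. This is a minimal sketch under assumptions: the scoring values (match 2, mismatch -1, gap -1), the word length `k=2` and the function names are illustrative, and the plain Smith-Waterman score is shown rather than the "repeated procedure" variant used in the study.

```python
import math
from collections import Counter


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def word_angle(a, b, k=2):
    """Angle in radians between the k-word frequency vectors of a and b."""
    fa = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    fb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    dot = sum(fa[w] * fb[w] for w in fa)
    norm_a = math.sqrt(sum(v * v for v in fa.values()))
    norm_b = math.sqrt(sum(v * v for v in fb.values()))
    # Clamp to 1.0 to guard against floating-point overshoot.
    return math.acos(min(1.0, dot / (norm_a * norm_b)))


score = smith_waterman_score("ABCABD", "XBCABY")  # shared run "BCAB"
angle = word_angle("AAAA", "BBBB")                # no shared 2-words
```

    The alignment score depends on the order of units, while the angle metric is alignment-free and depends only on which short words occur and how often.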

  19. Skate Genome Project: Cyber-Enabled Bioinformatics Collaboration

    PubMed Central

    Vincent, J.

    2011-01-01

    The Skate Genome Project, a pilot project of the North East Cyberinfrastructure Consortium (NECC), aims to produce a draft genome sequence of Leucoraja erinacea, the Little Skate. The pilot project was also designed to develop expertise in large-scale collaborations across the NECC region. An overview of the bioinformatics and infrastructure challenges faced during the first year of the project will be presented. Results to date and lessons learned from the perspective of a bioinformatics core will be highlighted.

  20. The 2015 Bioinformatics Open Source Conference (BOSC 2015)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J. A.; Lapp, Hilmar

    2016-01-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included “Data Science;” “Standards and Interoperability;” “Open Science and Reproducibility;” “Translational Bioinformatics;” “Visualization;” and “Bioinformatics Open Source Project Updates”. In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled “Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community,” that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  1. Monte Carlo modelling of photodynamic therapy treatments comparing clustered three dimensional tumour structures with homogeneous tissue structures.

    PubMed

    Campbell, C L; Wood, K; Brown, C T A; Moseley, H

    2016-07-01

    We explore the effects of three dimensional (3D) tumour structures on depth dependent fluence rates, photodynamic doses (PDD) and fluorescence images through Monte Carlo radiation transfer modelling of photodynamic therapy. The aim with this work was to compare the commonly used uniform tumour densities with non-uniform densities to determine the importance of including 3D models in theoretical investigations. It was found that fractal 3D models resulted in deeper penetration on average of therapeutic radiation and higher PDD. An increase in effective treatment depth of 1 mm was observed for one of the investigated fractal structures, when comparing to the equivalent smooth model. Wide field fluorescence images were simulated, revealing information about the relationship between tumour structure and the appearance of the fluorescence intensity. Our models indicate that the 3D tumour structure strongly affects the spatial distribution of therapeutic light, the PDD and the wide field appearance of surface fluorescence images. PMID:27273196

  2. Monte Carlo modelling of photodynamic therapy treatments comparing clustered three dimensional tumour structures with homogeneous tissue structures

    NASA Astrophysics Data System (ADS)

    Campbell, C. L.; Wood, K.; Brown, C. T. A.; Moseley, H.

    2016-07-01

    We explore the effects of three dimensional (3D) tumour structures on depth dependent fluence rates, photodynamic doses (PDD) and fluorescence images through Monte Carlo radiation transfer modelling of photodynamic therapy. The aim with this work was to compare the commonly used uniform tumour densities with non-uniform densities to determine the importance of including 3D models in theoretical investigations. It was found that fractal 3D models resulted in deeper penetration on average of therapeutic radiation and higher PDD. An increase in effective treatment depth of 1 mm was observed for one of the investigated fractal structures, when comparing to the equivalent smooth model. Wide field fluorescence images were simulated, revealing information about the relationship between tumour structure and the appearance of the fluorescence intensity. Our models indicate that the 3D tumour structure strongly affects the spatial distribution of therapeutic light, the PDD and the wide field appearance of surface fluorescence images.

  3. Systems Biology, Bioinformatics, and Biomarkers in Neuropsychiatry

    PubMed Central

    Alawieh, Ali; Zaraket, Fadi A.; Li, Jian-Liang; Mondello, Stefania; Nokkari, Amaly; Razafsha, Mahdi; Fadlallah, Bilal; Boustany, Rose-Mary; Kobeissy, Firas H.

    2012-01-01

    Although neuropsychiatric (NP) disorders are among the top causes of disability worldwide with enormous financial costs, they can still be viewed as part of the most complex disorders that are of unknown etiology and incomprehensible pathophysiology. The complexity of NP disorders arises from their etiologic heterogeneity and the concurrent influence of environmental and genetic factors. In addition, the absence of rigid boundaries between the normal and diseased state, the remarkable overlap of symptoms among conditions, the high inter-individual and inter-population variations, and the absence of discriminative molecular and/or imaging biomarkers for these diseases make an accurate diagnosis difficult. Along with the complexity of NP disorders, the practice of psychiatry suffers from a “top-down” method that relies on symptom checklists. Although checklist diagnoses cost less in terms of time and money, they are less accurate than a comprehensive assessment. Thus, reliable and objective diagnostic tools such as biomarkers are needed that can detect and discriminate among NP disorders. The real promise in understanding the pathophysiology of NP disorders lies in bringing back psychiatry to its biological basis in a systemic approach which is needed given the NP disorders’ complexity to understand their normal functioning and response to perturbation. This approach is implemented in the systems biology discipline that enables the discovery of disease-specific NP biomarkers for diagnosis and therapeutics. Systems biology involves the use of sophisticated computer software “omics”-based discovery tools and advanced performance computational techniques in order to understand the behavior of biological systems and identify diagnostic and prognostic biomarkers specific for NP disorders together with new targets of therapeutics. In this review, we try to shed light on the need of systems biology, bioinformatics, and biomarkers in neuropsychiatry, and

  4. Expanding our Understanding of Sequence-Function Relationships of Type II Polyketide Biosynthetic Gene Clusters: Bioinformatics-Guided Identification of Frankiamicin A from Frankia sp. EAN1pec

    PubMed Central

    Ogasawara, Yasushi; Yackley, Benjamin J.; Greenberg, Jacob A.; Rogelj, Snezna; Melançon, Charles E.

    2015-01-01

    A large and rapidly increasing number of unstudied “orphan” natural product biosynthetic gene clusters are being uncovered in sequenced microbial genomes. An important goal of modern natural products research is to be able to accurately predict natural product structures and biosynthetic pathways from these gene cluster sequences. This requires both development of bioinformatic methods for global analysis of these gene clusters and experimental characterization of select products produced by gene clusters with divergent sequence characteristics. Here, we conduct global bioinformatic analysis of all available type II polyketide gene cluster sequences and identify a conserved set of gene clusters with unique ketosynthase α/β sequence characteristics in the genomes of Frankia species, a group of Actinobacteria with underexploited natural product biosynthetic potential. Through LC-MS profiling of extracts from several Frankia species grown under various conditions, we identified Frankia sp. EAN1pec as producing a compound with spectral characteristics consistent with the type II polyketide produced by this gene cluster. We isolated the compound, a pentangular polyketide which we named frankiamicin A, and elucidated its structure by NMR and labeled precursor feeding. We also propose biosynthetic and regulatory pathways for frankiamicin A based on comparative genomic analysis and literature precedent, and conduct bioactivity assays of the compound. Our findings provide new information linking this set of Frankia gene clusters with the compound they produce, and our approach has implications for accurate functional prediction of the many other type II polyketide clusters present in bacterial genomes. PMID:25837682

  5. Expanding our understanding of sequence-function relationships of type II polyketide biosynthetic gene clusters: bioinformatics-guided identification of Frankiamicin A from Frankia sp. EAN1pec.

    PubMed

    Ogasawara, Yasushi; Yackley, Benjamin J; Greenberg, Jacob A; Rogelj, Snezna; Melançon, Charles E

    2015-01-01

    A large and rapidly increasing number of unstudied "orphan" natural product biosynthetic gene clusters are being uncovered in sequenced microbial genomes. An important goal of modern natural products research is to be able to accurately predict natural product structures and biosynthetic pathways from these gene cluster sequences. This requires both development of bioinformatic methods for global analysis of these gene clusters and experimental characterization of select products produced by gene clusters with divergent sequence characteristics. Here, we conduct global bioinformatic analysis of all available type II polyketide gene cluster sequences and identify a conserved set of gene clusters with unique ketosynthase α/β sequence characteristics in the genomes of Frankia species, a group of Actinobacteria with underexploited natural product biosynthetic potential. Through LC-MS profiling of extracts from several Frankia species grown under various conditions, we identified Frankia sp. EAN1pec as producing a compound with spectral characteristics consistent with the type II polyketide produced by this gene cluster. We isolated the compound, a pentangular polyketide which we named frankiamicin A, and elucidated its structure by NMR and labeled precursor feeding. We also propose biosynthetic and regulatory pathways for frankiamicin A based on comparative genomic analysis and literature precedent, and conduct bioactivity assays of the compound. Our findings provide new information linking this set of Frankia gene clusters with the compound they produce, and our approach has implications for accurate functional prediction of the many other type II polyketide clusters present in bacterial genomes. PMID:25837682

  6. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo-and Bioinformatics

    PubMed Central

    Steinbeck, Christoph; Han, Yongquan; Kuhn, Stefan; Horlacher, Oliver; Luttmann, Edgar; Willighagen, Egon

    2003-01-01

    The Chemistry Development Kit (CDK) is a freely available open-source Java library for Structural Chemo-and Bioinformatics. Its architecture and capabilities as well as the development as an open-source project by a team of international collaborators from academic and industrial institutions is described. The CDK provides methods for many common tasks in molecular informatics, including 2D and 3D rendering of chemical structures, I/O routines, SMILES parsing and generation, ring searches, isomorphism checking, structure diagram generation, etc. Application scenarios as well as access information for interested users and potential contributors are given. PMID:12653513

  7. Comparative structural analysis of the caspase family with other clan CD cysteine peptidases

    PubMed Central

    McLuskey, Karen; Mottram, Jeremy C.

    2015-01-01

    Clan CD forms a structural group of cysteine peptidases, containing seven individual families and two subfamilies of structurally related enzymes. Historically, it is most notable for containing the mammalian caspases, on which the structures of the clan were founded. Interestingly, the caspase family is split into two subfamilies: the caspases, and a second subfamily containing both the paracaspases and the metacaspases. Structural data are now available for both the paracaspases and the metacaspases, allowing a comprehensive structural analysis of the entire caspase family. In addition, a relative plethora of structural data has recently become available for many of the other families in the clan, allowing both the structures and the structure–function relationships of clan CD to be fully explored. The present review compares the enzymes in the caspase subfamilies with each other, together with a comprehensive comparison of all the structural families in clan CD. This reveals a diverse group of structures with highly conserved structural elements that provide the peptidases with a variety of substrate specificities and activation mechanisms. It also reveals conserved structural elements involved in substrate binding, and potential autoinhibitory functions, throughout the clan, and confirms that the metacaspases are structurally diverse from the caspases (and paracaspases), suggesting that they should form a distinct family of clan CD peptidases. PMID:25697094

  8. Atlas – a data warehouse for integrative bioinformatics

    PubMed Central

    Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire MS; Ling, John; Ouellette, BF Francis

    2005-01-01

    Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First, Atlas stores data of
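
    The loader and retrieval pattern described in this abstract (relational models, SQL calls wrapped in an API) can be sketched in a few lines. This is a minimal illustration using Python's built-in `sqlite3` module rather than the C++, Java or Perl APIs the abstract mentions. The table layout, function names and accessions (`P1`, `P2`, `P3`) are hypothetical and are not taken from the Atlas schema.

```python
import sqlite3


def create_schema(conn):
    """Create two hypothetical tables: proteins and pairwise interactions."""
    conn.executescript(
        """
        CREATE TABLE protein (
            accession TEXT PRIMARY KEY,
            taxon_id  INTEGER,
            sequence  TEXT
        );
        CREATE TABLE interaction (
            source_db TEXT,
            protein_a TEXT REFERENCES protein(accession),
            protein_b TEXT REFERENCES protein(accession)
        );
        """
    )


def load_proteins(conn, records):
    """Loader step: insert (accession, taxon_id, sequence) tuples."""
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", records)


def load_interactions(conn, records):
    """Loader step: insert (source_db, protein_a, protein_b) tuples."""
    conn.executemany("INSERT INTO interaction VALUES (?, ?, ?)", records)


def partners(conn, accession):
    """Retrieval step: all interaction partners of one protein, sorted."""
    rows = conn.execute(
        """
        SELECT protein_b FROM interaction WHERE protein_a = ?
        UNION
        SELECT protein_a FROM interaction WHERE protein_b = ?
        """,
        (accession, accession),
    )
    return sorted(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
create_schema(conn)
load_proteins(conn, [("P1", 9606, "MKT"), ("P2", 9606, "MAS"), ("P3", 10090, "MGL")])
load_interactions(conn, [("DB_X", "P1", "P2"), ("DB_Y", "P3", "P1")])
print(partners(conn, "P1"))
```

    Keeping the SQL inside small functions means calling code never builds queries itself, which is the role the abstract assigns to the API layer.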

  9. Proteomic and bioinformatic analyses of spinal cord injury‑induced skeletal muscle atrophy in rats.

    PubMed

    Wei, Zhi-Jian; Zhou, Xian-Hu; Fan, Bao-You; Lin, Wei; Ren, Yi-Ming; Feng, Shi-Qing

    2016-07-01

    Spinal cord injury (SCI) may result in skeletal muscle atrophy. Identifying diagnostic biomarkers and effective targets for treatment is an important challenge in clinical work. The aim of the present study is to elucidate potential biomarkers and therapeutic targets for SCI‑induced muscle atrophy (SIMA) using proteomic and bioinformatic analyses. The protein samples from rat soleus muscle were collected at different time points following SCI injury and separated by two‑dimensional gel electrophoresis and compared with the sham group. The identities of these protein spots were analyzed by mass spectrometry (MS). MS demonstrated that 20 proteins associated with muscle atrophy were differentially expressed. Bioinformatic analyses indicated that SIMA changed the expression of proteins associated with cellular, developmental, immune system and metabolic processes, biological adhesion and localization. The results of the present study may be beneficial in understanding the molecular mechanisms of SIMA and elucidating potential biomarkers and targets for the treatment of muscle atrophy. PMID:27177391

  10. Proteomic and bioinformatic analyses of spinal cord injury-induced skeletal muscle atrophy in rats

    PubMed Central

    WEI, ZHI-JIAN; ZHOU, XIAN-HU; FAN, BAO-YOU; LIN, WEI; REN, YI-MING; FENG, SHI-QING

    2016-01-01

    Spinal cord injury (SCI) may result in skeletal muscle atrophy. Identifying diagnostic biomarkers and effective targets for treatment is an important challenge in clinical work. The aim of the present study is to elucidate potential biomarkers and therapeutic targets for SCI-induced muscle atrophy (SIMA) using proteomic and bioinformatic analyses. The protein samples from rat soleus muscle were collected at different time points following SCI injury and separated by two-dimensional gel electrophoresis and compared with the sham group. The identities of these protein spots were analyzed by mass spectrometry (MS). MS demonstrated that 20 proteins associated with muscle atrophy were differentially expressed. Bioinformatic analyses indicated that SIMA changed the expression of proteins associated with cellular, developmental, immune system and metabolic processes, biological adhesion and localization. The results of the present study may be beneficial in understanding the molecular mechanisms of SIMA and elucidating potential biomarkers and targets for the treatment of muscle atrophy. PMID:27177391

  11. Bioinformatic Characterization of Glycyl Radical Enzyme-Associated Bacterial Microcompartments

    PubMed Central

    Zarzycki, Jan; Erbilgin, Onur

    2015-01-01

    Bacterial microcompartments (BMCs) are proteinaceous organelles encapsulating enzymes that catalyze sequential reactions of metabolic pathways. BMCs are phylogenetically widespread; however, only a few BMCs have been experimentally characterized. Among them are the carboxysomes and the propanediol- and ethanolamine-utilizing microcompartments, which play diverse metabolic and ecological roles. The substrate of a BMC is defined by its signature enzyme. In catabolic BMCs, this enzyme typically generates an aldehyde. Recently, it was shown that the most prevalent signature enzymes encoded by BMC loci are glycyl radical enzymes, yet little is known about the function of these BMCs. Here we characterize the glycyl radical enzyme-associated microcompartment (GRM) loci using a combination of bioinformatic analyses and active-site and structural modeling to show that the GRMs comprise five subtypes. We predict distinct functions for the GRMs, including the degradation of choline, propanediol, and fuculose phosphate. This is the first family of BMCs for which identification of the signature enzyme is insufficient for predicting function. The distinct GRM functions are also reflected in differences in shell composition and apparently different assembly pathways. The GRMs are the counterparts of the vitamin B12-dependent propanediol- and ethanolamine-utilizing BMCs, which are frequently associated with virulence. This study provides a comprehensive foundation for experimental investigations of the diverse roles of GRMs. Understanding this plasticity of function within a single BMC family, including characterization of differences in permeability and assembly, can inform approaches to BMC bioengineering and the design of therapeutics. PMID:26407889

  12. Bioinformatic characterization of glycyl radical enzyme-associated bacterial microcompartments.

    PubMed

    Zarzycki, Jan; Erbilgin, Onur; Kerfeld, Cheryl A

    2015-12-01

    Bacterial microcompartments (BMCs) are proteinaceous organelles encapsulating enzymes that catalyze sequential reactions of metabolic pathways. BMCs are phylogenetically widespread; however, only a few BMCs have been experimentally characterized. Among them are the carboxysomes and the propanediol- and ethanolamine-utilizing microcompartments, which play diverse metabolic and ecological roles. The substrate of a BMC is defined by its signature enzyme. In catabolic BMCs, this enzyme typically generates an aldehyde. Recently, it was shown that the most prevalent signature enzymes encoded by BMC loci are glycyl radical enzymes, yet little is known about the function of these BMCs. Here we characterize the glycyl radical enzyme-associated microcompartment (GRM) loci using a combination of bioinformatic analyses and active-site and structural modeling to show that the GRMs comprise five subtypes. We predict distinct functions for the GRMs, including the degradation of choline, propanediol, and fuculose phosphate. This is the first family of BMCs for which identification of the signature enzyme is insufficient for predicting function. The distinct GRM functions are also reflected in differences in shell composition and apparently different assembly pathways. The GRMs are the counterparts of the vitamin B12-dependent propanediol- and ethanolamine-utilizing BMCs, which are frequently associated with virulence. This study provides a comprehensive foundation for experimental investigations of the diverse roles of GRMs. Understanding this plasticity of function within a single BMC family, including characterization of differences in permeability and assembly, can inform approaches to BMC bioengineering and the design of therapeutics. PMID:26407889

  13. New insight into RNase P RNA structure from comparative analysis of the archaeal RNA.

    PubMed Central

    Harris, J K; Haas, E S; Williams, D; Frank, D N; Brown, J W

    2001-01-01

    A detailed comparative analysis of archaeal RNase P RNA structure and a comparison of the resulting structural information with that of the bacterial RNA reveals that the archaeal RNase P RNAs are strikingly similar to those of Bacteria. The differences between the secondary structure models of archaeal and bacterial RNase P RNA have largely disappeared, and even variation in the sequence and structure of the RNAs are similar in extent and type. The structure of the cruciform (P7-11) has been reevaluated on the basis of a total of 321 bacterial and archaeal sequences, leading to a model for the structure of this region of the RNA that includes an extension to P11 that consistently organizes the cruciform and adjacent highly-conserved sequences. PMID:11233979

  14. Integrating bioinformatics into senior high school: design principles and implications.

    PubMed

    Machluf, Yossy; Yarden, Anat

    2013-09-01

    Bioinformatics is an integral part of modern life sciences. It has revolutionized and redefined how research is carried out and has had an enormous impact on biotechnology, medicine, agriculture and related areas. Yet, it is only rarely integrated into high school teaching and learning programs, playing almost no role in preparing the next generation of information-oriented citizens. Here, we describe the design principles of bioinformatics learning environments, including our own, that are aimed at introducing bioinformatics into senior high school curricula through engaging learners in scientifically authentic inquiry activities. We discuss the bioinformatics-related benefits and challenges that high school teachers and students face in the course of the implementation process, in light of previous studies and our own experience. Based on these lessons, we present a new approach for characterizing the questions embedded in bioinformatics teaching and learning units, based on three criteria: the type of domain-specific knowledge required to answer each question (declarative knowledge, procedural knowledge, strategic knowledge, situational knowledge), the scientific approach from which each question stems (biological, bioinformatics, a combination of the two) and the associated cognitive process dimension (remember, understand, apply, analyze, evaluate, create). We demonstrate the feasibility of this approach using a learning environment, which we developed for the high school level, and suggest some of its implications. This review sheds light on unique and critical characteristics related to broader integration of bioinformatics in secondary education, which are also relevant to the undergraduate level, and especially on curriculum design, development of suitable learning environments and teaching and learning processes. PMID:23665511

  15. Life comparative analysis of energy consumption and CO₂ emissions of different building structural frame types.

    PubMed

    Kim, Sangyong; Moon, Joon-Ho; Shin, Yoonseok; Kim, Gwang-Hee; Seo, Deok-Seok

    2013-01-01

    The objective of this research is to quantitatively measure and compare the environmental load and construction cost of different structural frame types. Construction cost also accounts for the costs of CO₂ emissions of input materials. The choice of structural frame type is a major consideration in construction, as this element represents about 33% of total building construction costs. In this research, four constructed buildings were analyzed, with these having either reinforced concrete (RC) or steel (S) structures. An input-output framework analysis was used to measure energy consumption and CO₂ emissions of input materials for each structural frame type. In addition, the CO₂ emissions cost was measured using the trading price of CO₂ emissions on the International Commodity Exchange. This research revealed that both energy consumption and CO₂ emissions were, on average, 26% lower with the RC structure than with the S structure, and the construction costs (including the CO₂ emissions cost) of the RC structure were about 9.8% lower, compared to the S structure. This research provides insights through which the construction industry will be able to respond to the carbon market, which is expected to continue to grow in the future. PMID:24227998

  16. Comparative analysis of seismic response characteristics of pile-soil-structure interaction system

    NASA Astrophysics Data System (ADS)

    Kong, Desen; Luan, Maotian; Wang, Weiming

    2006-01-01

    The study on the earthquake-resistant performance of a pile-soil-structure interaction system is a relatively complicated and primarily important issue in civil engineering practice. In this paper, a computational model and computation procedures for pile-supported structures, which can duly consider the pile-soil interaction effect, are established by the finite element method. Numerical implementation is made in the time domain. A simplified approximation for the seismic response analysis of pile-soil-structure systems is briefly presented. Then a comparative study is performed for an engineering example with numerical results computed respectively by the finite element method and the simplified method. Through comparative analysis, it is shown that the results obtained by the simplified method agree well with those achieved by the finite element method. The numerical results and findings will offer instructive guidelines for earthquake-resistant analysis and design of pile-supported structures.

  17. A comparative overview of modal testing and system identification for control of structures

    NASA Technical Reports Server (NTRS)

    Juang, J.-N.; Pappa, R. S.

    1988-01-01

    A comparative overview is presented of the disciplines of modal testing used in structural engineering and system identification used in control theory. A list of representative references from both areas is given, and the basic methods are described briefly. Recent progress on the interaction of modal testing and control disciplines is discussed. It is concluded that combined efforts of researchers in both disciplines are required for unification of modal testing and system identification methods for control of flexible structures.

  18. Molecular docking of Glycine max and Medicago truncatula ureases with urea; bioinformatics approaches.

    PubMed

    Filiz, Ertugrul; Vatansever, Recep; Ozyigit, Ibrahim Ilker

    2016-03-01

    Urease (EC 3.5.1.5) is a nickel-dependent metalloenzyme catalyzing the hydrolysis of urea into ammonia and carbon dioxide. It is present in many bacteria, fungi, yeasts and plants. Most species, with few exceptions, use nickel metalloenzyme urease to hydrolyze urea, which is one of the commonly used nitrogen fertilizers in plant growth; thus its enzymatic hydrolysis is of vital importance in agricultural practices. Considering the essentiality and importance of urea and urease activity in most plants, this study aimed to comparatively investigate the ureases of two important legume species, Glycine max (soybean) and Medicago truncatula (barrel medic), from the Fabaceae family. With additional plant species, primary and secondary structures of 37 plant ureases were comparatively analyzed using various bioinformatics tools. A structure based phylogeny was constructed using predicted 3D models of G. max and M. truncatula, whose crystallographic structures are not available, along with three additional solved urease structures from Canavalia ensiformis (PDB: 4GY7), Bacillus pasteurii (PDB: 4UBP) and Klebsiella aerogenes (PDB: 1FWJ). In addition, urease structures of these species were docked with urea to analyze the binding affinities, interacting amino acids and atom distances in urease-urea complexes. Furthermore, mutable amino acids which could potentially affect the protein active site, stability and flexibility as well as overall protein stability were analyzed in urease structures of G. max and M. truncatula. Plant ureases demonstrated similar physico-chemical properties with 833-878 amino acid residues and 89.39-90.91 kDa molecular weight with mainly acidic (5.15-6.10 pI) nature. Four protein domains (urease gamma, urease beta, urease alpha and amidohydro 1) characterized the plant ureases. Secondary structure of plant ureases also demonstrated conserved protein architecture, with predominantly α-helix and random coil structures. In

  19. Credibility Analysis of Putative Disease-Causing Genes Using Bioinformatics

    PubMed Central

    Abel, Olubunmi; Powell, John F.; Andersen, Peter M.; Al-Chalabi, Ammar

    2013-01-01

    Background Genetic studies are challenging in many complex diseases, particularly those with limited diagnostic certainty, low prevalence or of old age. The result is that genes may be reported as disease-causing with varying levels of evidence, and in some cases, the data may be so limited as to be indistinguishable from chance findings. When there are large numbers of such genes, an objective method for ranking the evidence is useful. Using the neurodegenerative and complex disease amyotrophic lateral sclerosis (ALS) as a model, and the disease-specific database ALSoD, the objective is to develop a method using publicly available data to generate a credibility score for putative disease-causing genes. Methods Genes with at least one publication suggesting involvement in adult onset familial ALS were collated following an exhaustive literature search. SQL was used to generate a score by extracting information from the publications and combined with a pathogenicity analysis using bioinformatics tools. The resulting score allowed us to rank genes in order of credibility. To validate the method, we compared the objective ranking with a rank generated by ALS genetics experts. Spearman's Rho was used to compare rankings generated by the different methods. Results The automated method ranked ALS genes in the following order: SOD1, TARDBP, FUS, ANG, SPG11, NEFH, OPTN, ALS2, SETX, FIG4, VAPB, DCTN1, TAF15, VCP, DAO. This compared very well to the ranking of ALS genetics experts, with Spearman's Rho of 0.69 (P = 0.009). Conclusion We have presented an automated method for scoring the level of evidence for a gene being disease-causing. In developing the method we have used the model disease ALS, but it could equally be applied to any disease in which there is genotypic uncertainty. PMID:23755159
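
    The rank comparison step can be illustrated with the standard tie-free formula for Spearman's rho, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference in rank position of each item. The sketch below is not the authors' code. The `automated` list reuses the first five genes of the ranking quoted in the abstract, while the `expert` ordering is invented purely for illustration, since the abstract does not give the experts' ranking.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings of the same items.

    Each argument is a list of item names ordered from rank 1 downwards.
    """
    if len(rank_a) != len(set(rank_a)) or set(rank_a) != set(rank_b):
        raise ValueError("rankings must contain the same unique items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items to compare rankings")
    position_b = {item: i for i, item in enumerate(rank_b)}
    # Sum of squared differences in rank position.
    d_squared = sum((i - position_b[item]) ** 2 for i, item in enumerate(rank_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# First five genes of the automated ranking reported in the abstract.
automated = ["SOD1", "TARDBP", "FUS", "ANG", "SPG11"]
# Hypothetical expert ordering, made up for this example.
expert = ["SOD1", "FUS", "TARDBP", "SPG11", "ANG"]
rho = spearman_rho(automated, expert)
print(round(rho, 3))
```

    A value near 1 means the two orderings largely agree; the abstract's reported value of 0.69 was computed on the full list of fifteen genes, not on this shortened example.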

  20. An agent-based multilayer architecture for bioinformatics grids.

    PubMed

    Bartocci, Ezio; Cacciagrano, Diletta; Cannata, Nicola; Corradini, Flavio; Merelli, Emanuela; Milanesi, Luciano; Romano, Paolo

    2007-06-01

    Due to the huge volume and complexity of biological data available today, a fundamental component of biomedical research is now in silico analysis. This includes modelling and simulation of biological systems and processes, as well as automated bioinformatics analysis of high-throughput data. The quest for bioinformatics resources (including databases, tools, and knowledge) becomes therefore of extreme importance. Bioinformatics itself is in rapid evolution and dedicated Grid cyberinfrastructures already offer easier access and sharing of resources. Furthermore, the concept of the Grid is progressively interleaving with those of Web Services, semantics, and software agents. Agent-based systems can play a key role in learning, planning, interaction, and coordination. Agents constitute also a natural paradigm to engineer simulations of complex systems like the molecular ones. We present here an agent-based, multilayer architecture for bioinformatics Grids. It is intended to support both the execution of complex in silico experiments and the simulation of biological systems. In the architecture a pivotal role is assigned to an "alive" semantic index of resources, which is also expected to facilitate users' awareness of the bioinformatics domain. PMID:17695749

  1. Vignettes: diverse library staff offering diverse bioinformatics services

    PubMed Central

    Osterbur, David L.; Alpi, Kristine; Canevari, Catharine; Corley, Pamela M.; Devare, Medha; Gaedeke, Nicola; Jacobs, Donna K.; Kirlew, Peter; Ohles, Janet A.; Vaughan, K.T.L.; Wang, Lili; Wu, Yongchun; Geer, Renata C.

    2006-01-01

    Objectives: The paper gives examples of the bioinformatics services provided in a variety of different libraries by librarians with a broad range of educational background and training. Methods: Two investigators sent an email inquiry to attendees of the “National Center for Biotechnology Information's (NCBI) Introduction to Molecular Biology Information Resources” or “NCBI Advanced Workshop for Bioinformatics Information Specialists (NAWBIS)” courses. The thirty-five-item questionnaire addressed areas such as educational background, library setting, types and numbers of users served, and bioinformatics training and support services provided. Answers were compiled into program vignettes. Discussion: The bioinformatics support services addressed in the paper are based in libraries with academic and clinical settings. Services have been established through different means: in collaboration with biology faculty as part of formal courses, through teaching workshops in the library, through one-on-one consultations, and by other methods. Librarians with backgrounds from art history to doctoral degrees in genetics have worked to establish these programs. Conclusion: Successful bioinformatics support programs can be established in libraries in a variety of different settings and by staff with a variety of different backgrounds and approaches. PMID:16888664

  2. Structural Complexity of DNA Sequence

    PubMed Central

    Liou, Cheng-Yuan; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from statistical methods. Furthermore, the approach is compared with a topological entropy-based method to assess where the complexity results agree and where they differ. PMID:23662161

  3. Gender Differences in Structured Risk Assessment: Comparing the Accuracy of Five Instruments

    ERIC Educational Resources Information Center

    Coid, Jeremy; Yang, Min; Ullrich, Simone; Zhang, Tianqiang; Sizmur, Steve; Roberts, Colin; Farrington, David P.; Rogers, Robert D.

    2009-01-01

    Structured risk assessment should guide clinical risk management, but it is uncertain which instrument has the highest predictive accuracy among men and women. In the present study, the authors compared the Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991, 2003); the Historical, Clinical, Risk Management-20 (HCR-20; C. D. Webster, K. S.…

  4. How Good Are Trainers' Personal Methods Compared to Two Structured Training Strategies?

    ERIC Educational Resources Information Center

    Walls, Richard T.; And Others

    Training methods naturally employed by trainers were analyzed and compared to systematic structured training procedures. Trainers were observed teaching retarded subjects how to assemble a bicycle brake, roller skate, carburetor, and lawn mower engine. Trainers first taught using their own (personal) method, which was recorded in terms of types of…

  5. Comparing Religious Education in Canadian and Australian Catholic High Schools: Identifying Some Key Structural Issues

    ERIC Educational Resources Information Center

    Rymarz, Richard

    2013-01-01

    Religious education (RE) in Catholic high schools in Australia and Canada is compared by examining some of the underlying structural factors that shape the delivery of RE. It is argued that in Canadian Catholic schools RE is diminished by three factors that distinguish it from the Australian experience. These are: the level and history of…

  6. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    PubMed

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  7. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis

    PubMed Central

    Noar, Roslyn D.; Daub, Margaret E.

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  8. Embracing the Future: Bioinformatics for High School Women

    NASA Astrophysics Data System (ADS)

    Zales, Charlotte Rappe; Cronin, Susan J.

    Sixteen high school women participated in a 5-week residential summer program designed to encourage female and minority students to choose careers in scientific fields. Students gained expertise in bioinformatics through problem-based learning in a complex learning environment of content instruction, speakers, labs, and trips. Innovative hands-on activities filled the program. Students learned biological principles in context and sophisticated bioinformatics tools for processing data. Students additionally mastered a variety of information-searching techniques. Students completed creative individual and group projects, demonstrating the successful integration of biology, information technology, and bioinformatics. Discussions with female scientists allowed students to see themselves in similar roles. Summer residential aspects fostered an atmosphere in which students matured in interacting with others and in their views of diversity.

  9. Using bioinformatics for drug target identification from the genome.

    PubMed

    Jiang, Zhenran; Zhou, Yanhong

    2005-01-01

    Genomics and proteomics technologies have created a paradigm shift in the drug discovery process, with bioinformatics having a key role in the exploitation of genomic, transcriptomic, and proteomic data to gain insights into the molecular mechanisms that underlie disease and to identify potential drug targets. We discuss the current state of the art for some of the bioinformatic approaches to identifying drug targets, including identifying new members of successful target classes and their functions, predicting disease relevant genes, and constructing gene networks and protein interaction networks. In addition, we introduce drug target discovery using the strategy of systems biology, and discuss some of the data resources for the identification of drug targets. Although bioinformatics tools and resources can be used to identify putative drug targets, validating targets is still a process that requires an understanding of the role of the gene or protein in the disease process and is heavily dependent on laboratory-based work. PMID:16336003

  10. Bioinformatics Projects Supporting Life-Sciences Learning in High Schools

    PubMed Central

    Marques, Isabel; Almeida, Paulo; Alves, Renato; Dias, Maria João; Godinho, Ana; Pereira-Leal, José B.

    2014-01-01

    The interdisciplinary nature of bioinformatics makes it an ideal framework to develop activities enabling enquiry-based learning. We describe here the development and implementation of a pilot project to use bioinformatics-based research activities in high schools, called “Bioinformatics@school.” It includes web-based research projects that students can pursue alone or under teacher supervision and a teacher training program. The project is organized so as to enable discussion of key results between students and teachers. After successful trials in two high schools, as measured by questionnaires, interviews, and assessment of knowledge acquisition, the project is expanding by the action of the teachers involved, who are helping us develop more content and are recruiting more teachers and schools. PMID:24465192

  11. Developing expertise in bioinformatics for biomedical research in Africa

    PubMed Central

    Karikari, Thomas K.; Quansah, Emmanuel; Mohamed, Wael M.Y.

    2015-01-01

    Research in bioinformatics has a central role in helping to advance biomedical research. However, its introduction to Africa has been met with some challenges (such as inadequate infrastructure, training opportunities, research funding, human resources, biorepositories and databases) that have contributed to the slow pace of development in this field across the continent. Fortunately, recent improvements in areas such as research funding, infrastructural support and capacity building are helping to develop bioinformatics into an important discipline in Africa. These contributions are leading to the establishment of world-class research facilities, biorepositories, training programmes, scientific networks and funding schemes to improve studies into disease and health in Africa. With increased contribution from all stakeholders, these developments could be further enhanced. Here, we discuss how the recent developments are contributing to the advancement of bioinformatics in Africa. PMID:26767162

  12. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    PubMed

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes because more and more gene expression and genomic sequence data become available. Knowledge on the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example for the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any sets of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and how to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771
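
    The motif-search step of such a workflow can be sketched as follows. This is a minimal illustration, not the software package used in the chapter: it counts how many promoters in a coregulated (foreground) set contain each k-mer and compares that fraction with a background set. The function names, the pseudocount, and the toy sequences are assumptions made for this example; dedicated tools apply proper statistical tests.

```python
from collections import Counter

def kmer_presence(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        # A set ensures each sequence contributes at most once per k-mer.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

def enriched_kmers(foreground, background, k=6, pseudo=1.0):
    """Rank k-mers by the ratio of foreground to background presence.

    A pseudocount (hypothetical choice) avoids division by zero for
    k-mers that never occur in the background set.
    """
    fg = kmer_presence(foreground, k)
    bg = kmer_presence(background, k)
    scores = {}
    for kmer, n_fg in fg.items():
        fg_frac = (n_fg + pseudo) / (len(foreground) + pseudo)
        bg_frac = (bg.get(kmer, 0) + pseudo) / (len(background) + pseudo)
        scores[kmer] = fg_frac / bg_frac
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy promoter fragments (invented for illustration only).
coregulated = ["AAACACGTGTTT", "CCCACGTGGGA", "TTACACGTGTCA"]
other_genes = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GTGTGTGTGTGT", "TTTTTTTTTTTT"]
top_kmer, top_score = enriched_kmers(coregulated, other_genes, k=6)[0]
```

    Counting presence per promoter rather than total occurrences keeps a single repeat-rich promoter from dominating the ranking.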

  13. Comparative research on the transmission-mode GaAs photocathodes of exponential-doping structures

    NASA Astrophysics Data System (ADS)

    Chen, Liang; Qian, Yun-Sheng; Zhang, Yi-Jun; Chang, Ben-Kang

    2012-03-01

    Earlier research has shown that varied doping structures in the active layer of GaAs photocathodes yield a higher quantum efficiency than uniform doping structures. On the basis of our earlier research on the surface photovoltage of GaAs photocathodes, and comparative research before and after activation of reflection-mode GaAs photocathodes, we extend the comparative research to transmission-mode GaAs photocathodes. An exponential doping structure is the typical varied doping structure that can form a uniform electric field in the active layer. By solving the one-dimensional diffusion equation for non-equilibrium minority carriers of transmission-mode GaAs photocathodes of the exponential doping structure, we can obtain the equations for the surface photovoltage (SPV) curve before activation and the spectral response curve (SRC) after activation. Through experiments and fitting calculations for the designed material, the body-material parameters can be well fitted by the SPV before activation, and confirmed by the fitting calculation for the SRC after activation. Through the comparative research before and after activation, the average surface escape probability (SEP) can also be well fitted. This comparative research method can measure the body parameters and the value of SEP for the transmission-mode GaAs photocathode more exactly than the earlier method, which only measures the body parameters by SRC after activation. It can also help us to study in depth and measure exactly the parameters of the varied doping structures for transmission-mode GaAs photocathodes, and to optimize the Cs-O activation technique in the future.

  14. Statistical Power of Alternative Structural Models for Comparative Effectiveness Research: Advantages of Modeling Unreliability

    PubMed Central

    Iordache, Eugen; Dierker, Lisa; Fifield, Judith; Schensul, Jean J.; Suggs, Suzanne; Barbour, Russell

    2015-01-01

    The advantages of modeling the unreliability of outcomes when evaluating the comparative effectiveness of health interventions are illustrated. Adding an action-research intervention component to a regular summer job program for youth was expected to help in preventing risk behaviors. A series of simple two-group alternative structural equation models are compared to test the effect of the intervention on one key attitudinal outcome in terms of model fit and statistical power with Monte Carlo simulations. Some models presuming parameters equal across the intervention and comparison groups were underpowered to detect the intervention effect, yet modeling the unreliability of the outcome measure increased their statistical power and helped in the detection of the hypothesized effect. Comparative Effectiveness Research (CER) could benefit from flexible multi-group alternative structural models organized in decision trees, and modeling unreliability of measures can be of tremendous help for both the fit of statistical models to the data and their statistical power. PMID:26640421

  15. Structures, properties, and functions of the stings of honey bees and paper wasps: a comparative study

    PubMed Central

    Zhao, Zi-Long; Zhao, Hong-Ping; Ma, Guo-Jun; Wu, Cheng-Wei; Yang, Kai; Feng, Xi-Qiao

    2015-01-01

    ABSTRACT Through natural selection, many animal organs with similar functions have evolved different macroscopic morphologies and microscopic structures. Here, we comparatively investigate the structures, properties and functions of honey bee stings and paper wasp stings. Their elegant structures were systematically observed. To examine their behaviors of penetrating into different materials, we performed penetration–extraction tests and slow motion analyses of their insertion process. In comparison, the barbed stings of honey bees are relatively difficult to be withdrawn from fibrous tissues (e.g. skin), while the removal of paper wasp stings is easier due to their different structures and insertion skills. The similarities and differences of the two kinds of stings are summarized on the basis of the experiments and observations. PMID:26002929

  16. Comparative Application of Capacity Models for Seismic Vulnerability Evaluation of Existing RC Structures

    SciTech Connect

    Faella, C.; Lima, C.; Martinelli, E.; Nigro, E.

    2008-07-08

    Seismic vulnerability assessment of existing buildings is one of the most common tasks in which Structural Engineers are currently engaged. Since it is often a preliminary step to approach the issue of how to retrofit non-seismic designed and detailed structures, it plays a key role in the successful choice of the most suitable strengthening technique. In this framework, the basic information for both seismic assessment and retrofitting is related to the formulation of capacity models for structural members. Plenty of proposals, often contradictory from the quantitative standpoint, are currently available within the technical and scientific literature for defining the structural capacity in terms of force and displacements, possibly with reference to different parameters representing the seismic response. The present paper briefly reviews some of the models for capacity of RC members and compares them with reference to two case studies assumed as representative of a wide class of existing buildings.

  17. Bioinformatic scaling of allosteric interactions in biomedical isozymes

    NASA Astrophysics Data System (ADS)

    Phillips, J. C.

    2016-09-01

    Allosteric (long-range) interactions can be surprisingly strong in proteins of biomedical interest. Here we use bioinformatic scaling to connect prior results on nonsteroidal anti-inflammatory drugs to promising new drugs that inhibit cancer cell metabolism. Many parallel features are apparent, which explain how even one amino acid mutation, remote from active sites, can alter medical results. The enzyme twins involved are cyclooxygenase (aspirin) and isocitrate dehydrogenase (IDH). The IDH results are accurate to 1% and are overdetermined by adjusting a single bioinformatic scaling parameter. It appears that the final stage in optimizing protein functionality may involve leveling of the hydrophobic limits of the arms of conformational hydrophilic hinges.

  18. Bioinformatic analysis of expression data to identify effector candidates.

    PubMed

    Reid, Adam J; Jones, John T

    2014-01-01

    Pathogens produce effectors that manipulate the host to the benefit of the pathogen. These effectors are often secreted proteins that are upregulated during the early phases of infection. These properties can be used to identify candidate effectors from genomes and transcriptomes of pathogens. Here we describe commonly used bioinformatic approaches that (1) allow identification of genes encoding predicted secreted proteins within a genome and (2) allow the identification of genes encoding predicted secreted proteins that are upregulated at important stages of the life cycle. Other approaches for bioinformatic identification of effector candidates, including OrthoMCL analysis to identify expanded gene families, are also described. PMID:24643549

  19. The influence of conceptual model structure on model performance: a comparative study for 237 French catchments

    NASA Astrophysics Data System (ADS)

    van Esse, W. R.; Perrin, C.; Booij, M. J.; Augustijn, D. C. M.; Fenicia, F.; Lobligeois, F.

    2013-04-01

    In hydrological studies models with a fixed structure are commonly used. For various reasons, these models do not always perform well. As an alternative, a flexible modelling approach could be followed, where the identification of the model structure is part of the model set-up procedure. In this study, the performance of twelve different conceptual model structures from the SUPERFLEX framework with varying complexity and the fixed model structure of GR4H were compared on a large set of 237 French catchments. The results showed that in general the flexible approach performs better than the fixed approach. However, the flexible approach has a higher chance of inconsistent results when implemented on two different periods. The same holds for more complex model structures. When for practical reasons a fixed model structure is preferred, this study shows that models with parallel reservoirs and a power function to describe the reservoir outflow perform best. In general, conceptual hydrological models perform better on large or wet catchments than on small or dry catchments. The model structures performed poorly when there was a climatic difference between the calibration and validation period, for catchments with flashy flows or disturbances in low flow measurements.

  20. Comparative Structural and Functional Analysis of Staphylococcus aureus Glucokinase with other Bacterial Glucokinases

    PubMed Central

    Kumar, P. S.; Kumar, Y. N.; Prasad, U. V.; Yeswanth, S.; Swarupa, V.; Vasu, D.; Venkatesh, K.; Srikanth, L.; Rao, V. K.; Sarma, P. V. G. K.

    2014-01-01

    Glucokinase is classified in bacteria based on the presence of an ATP binding site and a ‘repressor/open reading frames of unknown function/sugar kinases’ motif. The sequence of the glucokinase gene (JN645812) of Staphylococcus aureus ATCC12600 showed the presence of both the ATP binding site and the ‘repressor/open reading frames of unknown function/sugar kinases’ motif. We have earlier observed that glucokinase of S. aureus has higher affinity towards the substrate compared to other bacterial glucokinases, and that under anaerobic conditions with increased glucose concentration S. aureus exhibited a higher rate of biofilm formation. To establish this, a 3D structure of glucokinase was built using homology modeling; PROCHECK and ProSA-Web analysis indicated that the built structure was close to the crystal structure. This structure was superimposed with different bacterial glucokinase structures, and from the root-mean-square deviation values it is concluded that S. aureus glucokinase exhibited very close homology with Enterococcus faecalis and Clostridium difficile, while with other bacteria it showed a high degree of variation in both domain and nondomain regions. Glucose docking results indicated -12.3697 kcal/mol for S. aureus glucokinase compared with other bacterial glucokinases, suggesting a higher affinity for glucose, which correlates with enzyme kinetics and the higher rate of biofilm formation. PMID:25425757
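
    The root-mean-square deviation used to compare superimposed structures can be computed as in the sketch below. It assumes the two structures have already been superimposed and that atoms are paired one-to-one in the same order; it does not perform the superposition itself, and the function name and coordinates are illustrative rather than taken from the study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Two toy atom sets offset by one unit along z (invented coordinates).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(model, reference)
```

    Lower values indicate more similar backbones; in practice the value depends on which atoms are paired and on how the superposition was done.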

  1. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    PubMed

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R; Domozych, David S; Popper, Zoë A; Showalter, Allan M

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  2. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    PubMed Central

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
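
    The two sequence signatures named in the abstract, serine followed by three to five prolines and Tyr-X-Tyr, can be searched for with simple regular expressions. The sketch below is an illustration only and is not the authors' pipeline; the function name and the test sequence are hypothetical. Hydroxylation and glycosylation are post-translational, so the scan works on the unmodified one-letter sequence.

```python
import re

# Ser followed by three to five Pro residues (greedy: takes up to five).
SP_REPEAT = re.compile(r"SP{3,5}")
# Tyr-X-Tyr, where X is any residue; the lookahead reports overlapping hits.
YXY_MOTIF = re.compile(r"(?=(Y[A-Z]Y))")

def scan_ext_motifs(protein_seq):
    """Return (position, match) lists for SP(3-5) repeats and Y-X-Y motifs."""
    seq = protein_seq.upper()
    sp_hits = [(m.start(), m.group()) for m in SP_REPEAT.finditer(seq)]
    yxy_hits = [(m.start(), m.group(1)) for m in YXY_MOTIF.finditer(seq)]
    return sp_hits, yxy_hits

# Invented peptide used only to exercise the patterns.
sp_hits, yxy_hits = scan_ext_motifs("MASPPPPKYVYSPPPAYKY")
```

    A real classification would also need thresholds on the number of repeats and checks for signal peptides or other domains, none of which are attempted here.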

  3. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    SciTech Connect

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics into

  4. The influence of conceptual model structure on model performance: a comparative study for 237 French catchments

    NASA Astrophysics Data System (ADS)

    van Esse, W. R.; Perrin, C.; Booij, M. J.; Augustijn, D. C. M.; Fenicia, F.; Kavetski, D.; Lobligeois, F.

    2013-10-01

    Models with a fixed structure are widely used in hydrological studies and operational applications. For various reasons, these models do not always perform well. As an alternative, flexible modelling approaches allow the identification and refinement of the model structure as part of the modelling process. In this study, twelve different conceptual model structures from the SUPERFLEX framework are compared with the fixed model structure GR4H, using a large set of 237 French catchments and discharge-based performance metrics. The results show that, in general, the flexible approach performs better than the fixed approach. However, the flexible approach has a higher chance of inconsistent results when calibrated on two different periods. When analysing the subset of 116 catchments where the two approaches produce consistent performance over multiple time periods, their average performance relative to each other is almost equivalent. From the point of view of developing a well-performing fixed model structure, the findings favour models with parallel reservoirs and a power function to describe the reservoir outflow. In general, conceptual hydrological models perform better on larger and/or wetter catchments than on smaller and/or drier catchments. The model structures performed poorly when there were large climatic differences between the calibration and validation periods, in catchments with flashy flows, and in catchments with unexplained variations in low flow measurements.
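
    The structure favoured in the abstract, parallel reservoirs with a power function for reservoir outflow, can be illustrated with a toy simulation. The sketch below is not the SUPERFLEX or GR4H code; the split fraction, the rate constants, the exponent, and the explicit daily time step are all assumptions chosen for illustration.

```python
def simulate_parallel_reservoirs(rain, split=0.5, k_fast=0.5, k_slow=0.05, alpha=1.5):
    """Route rainfall through two parallel stores and return (flows, final storage).

    The fast store drains with a power law, Q = k_fast * S**alpha; the slow
    store drains linearly. All parameter values are hypothetical.
    """
    s_fast = 0.0
    s_slow = 0.0
    flows = []
    for p in rain:
        # Split the input between the two parallel stores.
        s_fast += split * p
        s_slow += (1.0 - split) * p
        # Outflow can never exceed what is currently stored.
        q_fast = min(s_fast, k_fast * s_fast ** alpha)
        q_slow = min(s_slow, k_slow * s_slow)
        s_fast -= q_fast
        s_slow -= q_slow
        flows.append(q_fast + q_slow)
    return flows, s_fast + s_slow

# Invented rainfall series (arbitrary depth units per time step).
flows, storage_left = simulate_parallel_reservoirs([10.0, 0.0, 0.0, 5.0, 0.0])
```

    With an exponent above one, the fast store empties disproportionately quickly when it is full, which gives a sharper peak than a linear store would.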

  5. Comparative Analysis of the Macroscale Structural Connectivity in the Macaque and Human Brain

    PubMed Central

    Bezgin, Gleb; Uylings, Harry B. M.; Roebroeck, Alard; Stiers, Peter

    2014-01-01

    The macaque brain serves as a model for the human brain, but its suitability is challenged by unique human features, including connectivity reconfigurations, which emerged during primate evolution. We perform a quantitative comparative analysis of the whole brain macroscale structural connectivity of the two species. Our findings suggest that the human and macaque brain as a whole are similarly wired. A region-wise analysis reveals many interspecies similarities of connectivity patterns, but also lack thereof, primarily involving cingulate regions. We unravel a common structural backbone in both species involving a highly overlapping set of regions. This structural backbone, important for mediating information across the brain, seems to constitute a feature of the primate brain persevering evolution. Our findings illustrate novel evolutionary aspects at the macroscale connectivity level and offer a quantitative translational bridge between macaque and human research. PMID:24676052

  6. Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations

    PubMed Central

    Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W.

    2016-01-01

    In recent years, clock rates of modern processors stagnated while the demand for computing power continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a certain technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures. PMID:26904094
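
    The k-mer use case named in this abstract can be illustrated with a minimal sketch. It is not the paper's code (the paper evaluates compiler-level tools such as OpenMP and OpenACC); it only shows, in Python, how a serial k-mer count can be decomposed into independent chunks, which is the kind of structure a parallelizer relies on. The function names and the `n_chunks` parameter are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_kmers(seq, k):
    """Serial baseline: count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def count_kmers_chunked(seq, k, n_chunks=4):
    """Count k-mers chunk by chunk and merge the partial counts.

    Chunks overlap by k - 1 characters so that each k-mer start
    position belongs to exactly one chunk (no loss, no double count).
    """
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return Counter()
    step = -(-n_kmers // n_chunks)  # ceiling division
    chunks = [seq[start:start + step + k - 1]
              for start in range(0, n_kmers, step)]
    # Threads are used only to show the decomposition; under CPython's
    # global interpreter lock they do not speed up this pure-Python loop.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda chunk: count_kmers(chunk, k), chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    print(count_kmers_chunked("ACGTACGT", 3))
```

    The chunked result should equal the serial one for any input; whether the split pays off in practice depends on problem size and overhead, as the abstract notes for GPU data migration.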

  7. Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations.

    PubMed

    Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W

    2016-01-01

    In recent years, clock rates of modern processors stagnated while the demand for computing power continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a certain technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures. PMID:26904094

  8. Bioinformatics analysis and expression study of fumarate hydratase in lung cancer

    PubMed Central

    Ming, Zongjuan; Jiang, Meihua; Li, Wei; Fan, Na; Deng, Wenjing; Zhong, Yujie; Zhang, Yuping; Zhang, Qiuhong; Yang, Shuanying

    2014-01-01

    Background: As its etiology and pathogenesis are obscure, illustrating the molecular mechanism of lung cancer has become a serious and urgent task. Studies have shown that fumarate hydratase (FH) is a tumor suppressor related to tumorigenesis, development, and invasion. Our aim was to analyze the biological information of FH, and detect the messenger ribonucleic acid (mRNA) and protein expression of FH in lung cancer cells to explore its role in tumorigenesis and in the development of lung cancer. Method: We analyzed the biological characteristics of FH, then utilized reverse transcription-polymerase chain reaction (RT-PCR) to study FH mRNA expression in A549 and 16 human bronchial epithelial (HBE) cell lines. The protein expression of FH was detected in 57 cases of human lung cancer tissues and 19 cases of normal lung tissues by immunohistochemistry. Results: 1. Bioinformatic analysis: FH mainly exists in the mitochondria; the common structural elements of FH are mainly α-helix, random coil, β-turn, and extended strand; there are five possible transmembrane domains in the entire polypeptide chain; FH is a hydrophilic and soluble protein. 2. RT-PCR result: FH mRNA expression was downregulated in A549 cells compared with 16HBE cells. 3. Immunohistochemistry: FH protein expression was significantly lower in lung cancer cells than in normal lung tissues (P < 0.05), but was not correlated with the patients' age, gender, tumor size, pathological type, or lymph node, distant, or tumor node metastasis stage. Conclusion: FH was under-expressed in lung cancer, suggesting that it may be an indicator of tumorigenesis and could be a potential target for therapies against lung cancer in the future. PMID:26767050

  9. Bioinformatics education--perspectives and challenges out of Africa.

    PubMed

    Tastan Bishop, Özlem; Adebiyi, Ezekiel F; Alzohairy, Ahmed M; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J; Panji, Sumir; Patterton, Hugh-G

    2015-03-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  10. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    ERIC Educational Resources Information Center

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  11. An International Bioinformatics Infrastructure to Underpin the Arabidopsis Community

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering C...

  12. Bioinformatics Education—Perspectives and Challenges out of Africa

    PubMed Central

    Adebiyi, Ezekiel F.; Alzohairy, Ahmed M.; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J.; Panji, Sumir; Patterton, Hugh-G.

    2015-01-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  13. Bioinformatic-driven search for metabolic biomarkers in disease

    PubMed Central

    2011-01-01

    The search and validation of novel disease biomarkers requires the complementary power of professional study planning and execution, modern profiling technologies and related bioinformatics tools for data analysis and interpretation. Biomarkers have considerable impact on the care of patients and are urgently needed for advancing diagnostics, prognostics and treatment of disease. This survey article highlights emerging bioinformatics methods for biomarker discovery in clinical metabolomics, focusing on the problem of data preprocessing and consolidation, the data-driven search, verification, prioritization and biological interpretation of putative metabolic candidate biomarkers in disease. In particular, data mining tools suitable for the application to omic data gathered from most frequently-used type of experimental designs, such as case-control or longitudinal biomarker cohort studies, are reviewed and case examples of selected discovery steps are delineated in more detail. This review demonstrates that clinical bioinformatics has evolved into an essential element of biomarker discovery, translating new innovations and successes in profiling technologies and bioinformatics to clinical application. PMID:21884622

  14. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    EPA Science Inventory

    A Bioinformatic Strategy to Rapidly Characterize cDNA Libraries

    G. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.
    1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  15. Incorporation of Bioinformatics Exercises into the Undergraduate Biochemistry Curriculum

    ERIC Educational Resources Information Center

    Feig, Andrew L.; Jabri, Evelyn

    2002-01-01

    The field of bioinformatics is developing faster than most biochemistry textbooks can adapt. Supplementing the undergraduate biochemistry curriculum with data-mining exercises is an ideal way to expose the students to the common databases and tools that take advantage of this vast repository of biochemical information. An integrated collection of…

  16. Omics-bioinformatics in the context of clinical data.

    PubMed

    Mayer, Gert; Heinze, Georg; Mischak, Harald; Hellemons, Merel E; Heerspink, Hiddo J Lambers; Bakker, Stephan J L; de Zeeuw, Dick; Haiduk, Martin; Rossing, Peter; Oberbauer, Rainer

    2011-01-01

    The Omics revolution has provided the researcher with tools and methodologies for qualitative and quantitative assessment of a wide spectrum of molecular players spanning from the genome to the metabolome level. As a consequence, explorative analysis (in contrast to purely hypothesis driven research procedures) has become applicable. However, numerous issues have to be considered for deriving meaningful results from Omics, and bioinformatics has to respect these in data analysis and interpretation. Aspects include sample type and quality, concise definition of the (clinical) question, and selection of samples ideally coming from thoroughly defined sample and data repositories. Omics suffers from a principal shortcoming, namely unbalanced sample-to-feature matrix denoted as "curse of dimensionality", where a feature refers to a specific gene or protein among the many thousands assayed in parallel in an Omics experiment. This setting makes the identification of relevant features with respect to a phenotype under analysis error prone from a statistical perspective. From this, sample size calculation for screening studies and for verification of results from Omics bioinformatics is essential. Here we present key elements to be considered for embedding Omics bioinformatics in a quality controlled workflow for Omics screening, feature identification, and validation. Relevant items include sample and clinical data management, minimum sample quality requirements, sample size estimates, and statistical procedures for computing the significance of findings from Omics bioinformatics in validation studies. PMID:21370098
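
    The "curse of dimensionality" described in this abstract can be made concrete with a small arithmetic sketch. It assumes independent tests and uses the Bonferroni correction as one common choice; the abstract does not say which statistical procedures the authors recommend, and the feature count of 20,000 is a hypothetical round number.

```python
def expected_false_positives(n_features, alpha):
    """Expected number of features passing an uncorrected threshold
    when no feature is truly associated with the phenotype."""
    return n_features * alpha


def bonferroni_threshold(n_features, alpha):
    """Per-feature significance threshold that keeps the family-wise
    error rate at or below alpha (Bonferroni correction)."""
    return alpha / n_features


if __name__ == "__main__":
    n_features = 20000  # hypothetical number of genes assayed in parallel
    alpha = 0.05
    print(expected_false_positives(n_features, alpha))
    print(bonferroni_threshold(n_features, alpha))
```

    With thousands of features and an uncorrected threshold, chance alone is expected to produce many apparent hits, which is why the per-feature threshold has to shrink and the required sample size grows.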

  17. Bioinformatics for Undergraduates: Steps toward a Quantitative Bioscience Curriculum

    ERIC Educational Resources Information Center

    Chapman, Barbara S.; Christmann, James L.; Thatcher, Eileen F.

    2006-01-01

    We describe an innovative bioinformatics course developed under grants from the National Science Foundation and the California State University Program in Research and Education in Biotechnology for undergraduate biology students. The project has been part of a continuing effort to offer students classroom experiences focused on principles and…

  18. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    ERIC Educational Resources Information Center

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  19. Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics.

    PubMed

    Verheggen, Kenneth; Maddelein, Davy; Hulstaert, Niels; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2016-03-01

    The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries. PMID:26510693

  20. Population genetic structure of economically important Tortricidae (Lepidoptera) in South Africa: a comparative analysis.

    PubMed

    Timm, A E; Geertsema, H; Warnich, L

    2010-08-01

    Comparative studies of the population genetic structures of agricultural pests can elucidate the factors by which their population levels are affected, which is useful for designing pest management programs. This approach was used to provide insight into the six Tortricidae of major economic importance in South Africa. The population genetic structure of the carnation worm E. acerbella and the false codling moth T. leucotreta, analyzed using amplified fragment length polymorphism (AFLP) analysis, is presented here for the first time. These results were compared with those obtained previously for the codling moth Cydia pomonella, the oriental fruit moth Grapholita molesta, the litchi moth Cryptophlebia peltastica and the macadamia nut borer T. batrachopa. Locally adapted populations were detected over local geographic areas for all species. No significant differences were found among population genetic structures as a result of population history (whether native or introduced) although host range (whether oligophagous or polyphagous) had a small but significant effect. It is concluded that factors such as dispersal ability and agricultural practices have the most important effects on genetically structuring populations of the economically important Tortricidae in South Africa. PMID:19941674

  1. Comparative 3D Genome Structure Analysis of the Fission and the Budding Yeast

    PubMed Central

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species. PMID:25799503

  2. SiteBinder: an improved approach for comparing multiple protein structural motifs.

    PubMed

    Sehnal, David; Vařeková, Radka Svobodová; Huber, Heinrich J; Geidl, Stanislav; Ionescu, Crina-Maria; Wimmerová, Michaela; Koča, Jaroslav

    2012-02-27

    There is a paramount need to develop new techniques and tools that will extract as much information as possible from the ever growing repository of protein 3D structures. We report here on the development of a software tool for the multiple superimposition of large sets of protein structural motifs. Our superimposition methodology performs a systematic search for the atom pairing that provides the best fit. During this search, the RMSD values for all chemically relevant pairings are calculated by quaternion algebra. The number of evaluated pairings is markedly decreased by using PDB annotations for atoms. This approach guarantees that the best fit will be found and can be applied even when sequence similarity is low or does not exist at all. We have implemented this methodology in the Web application SiteBinder, which is able to process up to thousands of protein structural motifs in a very short time, and which provides an intuitive and user-friendly interface. Our benchmarking analysis has shown the robustness, efficiency, and versatility of our methodology and its implementation by the successful superimposition of 1000 experimentally determined structures for each of 32 eukaryotic linear motifs. We also demonstrate the applicability of SiteBinder using three case studies. We first compared the structures of 61 PA-IIL sugar binding sites containing nine different sugars, and we found that the sugar binding sites of PA-IIL and its mutants have a conserved structure despite their binding different sugars. We then superimposed over 300 zinc finger central motifs and revealed that the molecular structure in the vicinity of the Zn atom is highly conserved. Finally, we superimposed 12 BH3 domains from pro-apoptotic proteins. Our findings come to support the hypothesis that there is a structural basis for the functional segregation of BH3-only proteins into activators and enablers. PMID:22296449
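
    The pairing search described in this abstract can be sketched in simplified form. The sketch below is not SiteBinder's algorithm: it tries every atom pairing that matches element labels and scores each by RMSD after aligning centroids only, whereas the abstract describes a full superimposition computed by quaternion algebra. The rotational fit is deliberately omitted here, and all names and coordinates are hypothetical.

```python
from itertools import permutations
from math import sqrt


def centered(points):
    """Translate points so that their centroid sits at the origin."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    return [tuple(p[i] - centroid[i] for i in range(3)) for p in points]


def rmsd(a, b):
    """Root-mean-square deviation between two equally long point lists."""
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return sqrt(squared / len(a))


def best_pairing(motif_a, motif_b):
    """Exhaustively search pairings of two equally sized motifs.

    Each motif is a list of (element, (x, y, z)). Only pairings that
    match element labels are scored. Returns (rmsd, order), where
    order[i] is the index in motif_b paired with atom i of motif_a,
    or None if no label-compatible pairing exists.
    """
    coords_a = centered([xyz for _, xyz in motif_a])
    coords_b = centered([xyz for _, xyz in motif_b])
    best = None
    for order in permutations(range(len(motif_b))):
        if any(motif_a[i][0] != motif_b[j][0] for i, j in enumerate(order)):
            continue  # skip chemically incompatible pairings
        score = rmsd(coords_a, [coords_b[j] for j in order])
        if best is None or score < best[0]:
            best = (score, order)
    return best


if __name__ == "__main__":
    a = [("Zn", (0.0, 0.0, 0.0)), ("S", (1.0, 0.0, 0.0)), ("S", (0.0, 2.0, 0.0))]
    b = [("S", (5.0, 7.0, 5.0)), ("Zn", (5.0, 5.0, 5.0)), ("S", (6.0, 5.0, 5.0))]
    print(best_pairing(a, b))
```

    Filtering by element label is a stand-in for the abstract's use of PDB annotations to cut down the number of evaluated pairings; the factorial growth of the unfiltered search is the reason such pruning matters.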

  3. Effect of vegetation structure on subcanopy solar radiation: a comparative study

    NASA Astrophysics Data System (ADS)

    Anand, A.; Dubayah, R.; Hofton, M. A.

    2012-12-01

    Vertical structure of vegetation canopy influences spatial variability of radiation regime under forest canopies. A comparison of transmittance profiles and subcanopy radiation regime for two structurally different forest sites is done based on ray tracing and principles of radiative transfer using Lidar data. Medium footprint waveform Lidar data from Laser Vegetation Imaging Sensor (LVIS) was collected from the sites in Sierra National Forest (SNF), California and Smithsonian Environmental Research Center (SERC), Maryland in 2008 and 2003 respectively. Sites in both forest areas have varying vegetation structure with SNF sites representing mixed conifers whereas the sites in SERC represent eastern broadleaf trees. The Lidar waveform is processed to derive canopy gap probability as a function of height which is used to derive transmittance profiles and solar radiation as a function of canopy height using a 3-D light transmittance model. Geostatistics is applied to compare how the vertical and horizontal distribution of solar radiation under sub-canopy surface varies with varying vertical canopy structures such as foliage density, canopy cover and canopy height. This comparison is expected to increase knowledge of the effects of vegetation structure on the radiation regime under forest canopies.

  4. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    ERIC Educational Resources Information Center

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  5. Comparative evaluation of structured oil systems: Shellac oleogel, HPMC oleogel, and HIPE gel

    PubMed Central

    Patel, Ashok R; Dewettinck, Koen

    2015-01-01

    In lipid-based food products, fat crystals are used as building blocks for creating a crystalline network that can trap liquid oil into a 3D gel-like structure which in turn is responsible for the desirable mouth feel and texture properties of the food products. However, the recent ban on the use of trans-fat in the US, coupled with the increasing concerns about the negative health effects of saturated fat consumption, has resulted in an increased interest in the area of identifying alternative ways of structuring edible oils using non-fat-based building blocks. In this paper, we give a brief account of three alternative approaches where oil structuring was carried out using wax crystals (shellac), polymer strands (hydrophilic cellulose derivative), and emulsion droplets as structurants. These building blocks resulted in three different types of oleogels that showed distinct rheological properties and temperature functionalities. The three approaches are compared in terms of the preparation process (ease of processing), properties of the formed systems (microstructure, rheological gel strength, temperature response, effect of water incorporation, and thixotropic recovery), functionality, and associated limitations of the structured systems. The comparative evaluation is made such that the new researchers starting their work in the area of oil structuring can use this discussion as a general guideline. Practical applications Various aspects of oil binding for three different building blocks were studied in this work. The practical significance of this study includes (i) information on the preparation process and the concentrations of structuring agents required for efficient gelation and (ii) information on the behavior of oleogels to temperature, applied shear, and presence of water. This information can be very useful for selecting the type of structuring agents keeping the final applications in mind. For detailed information on the actual edible applications

  6. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    ERIC Educational Resources Information Center

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  7. Design and Implementation of an Interdepartmental Bioinformatics Program across Life Science Curricula

    ERIC Educational Resources Information Center

    Miskowski, Jennifer A.; Howard, David R.; Abler, Michael L.; Grunwald, Sandra K.

    2007-01-01

    Over the past 10 years, there has been a technical revolution in the life sciences leading to the emergence of a new discipline called bioinformatics. In response, bioinformatics-related topics have been incorporated into various undergraduate courses along with the development of new courses solely focused on bioinformatics. This report describes…

  8. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    ERIC Educational Resources Information Center

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  9. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    ERIC Educational Resources Information Center

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  10. Comparability of a Three-Dimensional Structure in Biopharmaceuticals Using Spectroscopic Methods

    PubMed Central

    Abad-Javier, Mario E.; Romero-Díaz, Alexis J.; Villaseñor-Ortega, Francisco; Pérez, Néstor O.; Flores-Ortiz, Luis F.

    2014-01-01

    Protein structure depends on weak interactions and covalent bonds, like disulfide bridges, established according to the environmental conditions. Here, we present the validation of two spectroscopic methodologies for the measurement of free and unoxidized thiols, as an attribute of structural integrity, using 5,5′-dithionitrobenzoic acid (DTNB) and DyLight Maleimide (DLM) as derivatizing agents. These methods were used to compare Rituximab and Etanercept products from different manufacturers. Physicochemical comparability was demonstrated for Rituximab products as DTNB showed no statistical differences under native, denaturing, and denaturing-reducing conditions, with Student's t-test P values of 0.6233, 0.4022, and 0.1475, respectively. While for Etanercept products no statistical differences were observed under native (P = 0.0758) and denaturing conditions (P = 0.2450), denaturing-reducing conditions revealed cysteine contents of 98% and 101%, towards the theoretical value of 58, for the evaluated products from different Etanercept manufacturers. DLM supported equality between Rituximab products under native (P = 0.7499) and denaturing conditions (P = 0.8027), but showed statistical differences among Etanercept products under native conditions (P < 0.001). DLM suggested that Infinitam has fewer exposed thiols than Enbrel, although DTNB method, circular dichroism (CD), fluorescence (TCSPC), and activity (TNFα neutralization) showed no differences. Overall, this data revealed the capabilities and drawbacks of each thiol quantification technique and their correlation with protein structure. PMID:24963443
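
    The Student's t-test comparisons reported in this abstract rest on a statistic that can be sketched in a few lines. The sketch assumes the classic two-sample form with pooled variance (the abstract does not state which variant was used) and returns only the t statistic and degrees of freedom; converting these to a P value needs a t-distribution table or a statistics library. The measurement values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes both samples have at
    least two values and roughly equal variances.
    """
    na, nb = len(sample_a), len(sample_b)
    dof = na + nb - 2
    pooled = ((na - 1) * variance(sample_a)
              + (nb - 1) * variance(sample_b)) / dof
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, dof


if __name__ == "__main__":
    # Hypothetical free-thiol readings for two products (arbitrary units)
    product_1 = [1.0, 2.0, 3.0]
    product_2 = [2.0, 3.0, 4.0]
    print(student_t(product_1, product_2))
```

    A t statistic close to zero relative to its degrees of freedom corresponds to a large P value, which is how the abstract's "no statistical differences" results should be read.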

  11. Comparability of a three-dimensional structure in biopharmaceuticals using spectroscopic methods.

    PubMed

    Pérez Medina Martínez, Víctor; Abad-Javier, Mario E; Romero-Díaz, Alexis J; Villaseñor-Ortega, Francisco; Pérez, Néstor O; Flores-Ortiz, Luis F; Medina-Rivero, Emilio

    2014-01-01

    Protein structure depends on weak interactions and covalent bonds, like disulfide bridges, established according to the environmental conditions. Here, we present the validation of two spectroscopic methodologies for the measurement of free and unoxidized thiols, as an attribute of structural integrity, using 5,5'-dithionitrobenzoic acid (DTNB) and DyLight Maleimide (DLM) as derivatizing agents. These methods were used to compare Rituximab and Etanercept products from different manufacturers. Physicochemical comparability was demonstrated for Rituximab products as DTNB showed no statistical differences under native, denaturing, and denaturing-reducing conditions, with Student's t-test P values of 0.6233, 0.4022, and 0.1475, respectively. While for Etanercept products no statistical differences were observed under native (P = 0.0758) and denaturing conditions (P = 0.2450), denaturing-reducing conditions revealed cysteine contents of 98% and 101%, towards the theoretical value of 58, for the evaluated products from different Etanercept manufacturers. DLM supported equality between Rituximab products under native (P = 0.7499) and denaturing conditions (P = 0.8027), but showed statistical differences among Etanercept products under native conditions (P < 0.001). DLM suggested that Infinitam has fewer exposed thiols than Enbrel, although DTNB method, circular dichroism (CD), fluorescence (TCSPC), and activity (TNF α neutralization) showed no differences. Overall, this data revealed the capabilities and drawbacks of each thiol quantification technique and their correlation with protein structure. PMID:24963443

  12. Should we have blind faith in bioinformatics software? Illustrations from the SNAP web-based tool.

    PubMed

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any 'false positive' SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen's Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. PMID:25742008
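    As a rough check on the HapMap 3 figures quoted in this abstract, the sketch below rebuilds the 2×2 agreement table from the reported counts and computes Cohen's kappa, which measures agreement beyond what chance alone would give. The abstract does not state the kappa values themselves, so the result here is a reconstruction under the assumption that every SNP reported as available by SNAP is on the chip (the abstract reports no false positives). The function and variable names are illustrative only.

```python
# Counts taken from the abstract (HapMap 3 release 2 reference).
TOTAL_SNPS = 415
SNAP_MISSING = 201      # SNPs that SNAP reported as absent from the chip
FALSE_NEGATIVES = 152   # of those, SNPs actually present in the Illumina file
FALSE_POSITIVES = 0     # the abstract reports none


def cohen_kappa(both_yes, snap_yes_only, chip_yes_only, both_no):
    """Cohen's kappa for two binary raters (here: SNAP vs. the Illumina file)."""
    n = both_yes + snap_yes_only + chip_yes_only + both_no
    observed = (both_yes + both_no) / n
    snap_yes = (both_yes + snap_yes_only) / n
    chip_yes = (both_yes + chip_yes_only) / n
    # Agreement expected by chance if the two sources were independent.
    expected = snap_yes * chip_yes + (1 - snap_yes) * (1 - chip_yes)
    return (observed - expected) / (1 - expected)


# Assumption: SNPs not flagged as missing by SNAP are truly on the chip.
both_yes = TOTAL_SNPS - SNAP_MISSING - FALSE_POSITIVES
both_no = SNAP_MISSING - FALSE_NEGATIVES

kappa = cohen_kappa(both_yes, FALSE_POSITIVES, FALSE_NEGATIVES, both_no)
reported_rate = 100 * FALSE_NEGATIVES / TOTAL_SNPS  # share of all 415 SNPs

print(f"false negatives as a share of all SNPs: {reported_rate:.1f}%")
print(f"Cohen's kappa: {kappa:.3f}")
```

    Under these assumptions the 36.6% figure corresponds to 152 of all 415 SNPs, and kappa comes out near 0.25, which falls in the "fair" band (0.21 to 0.40) of the commonly used Landis and Koch scale.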

  13. Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool

    PubMed Central

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N.; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. PMID:25742008

  14. Structure-reactivity relationship of Amadori rearrangement products compared to related ketoses.

    PubMed

    Kaufmann, Martin; Meissner, Philipp M; Pelke, Daniel; Mügge, Clemens; Kroh, Lothar W

    2016-06-16

    Structure-reactivity relationships of Amadori rearrangement products compared to their related ketoses were derived from multiple NMR spectroscopic techniques. Besides structure elucidation of six Amadori rearrangement products derived from d-glucose and d-galactose with l-alanine, l-phenylalanine and l-proline, especially quantitative (13)C selective saturation transfer NMR spectroscopy was applied to deduce information on isomeric systems. It could be shown exemplarily that the Amadori compound N-(1-deoxy-d-fructos-1-yl)-l-proline exhibits much higher isomerisation rates than d-fructose, which can be explained by C-1 substituent mediated intramolecular catalysis. In combination with a reduced carbonyl activity of Amadori compounds compared to their related ketoses, which results in an increased acyclic keto isomer concentration, the results on isomerisation dynamics lead to a highly significant increased reactivity of Amadori compounds. This can be clearly seen by comparing the approximated carbohydrate milieu stability time constants (ACuSTiC), which are 1 s for N-(1-deoxy-d-fructos-1-yl)-l-proline and 10 s for d-fructose at pD 4.20 ± 0.05 at 350 K. In addition, first NMR spectroscopic data are provided, which prove that the α-pyranose of (amino acid substituted) d-fructose adopts both the (2)C5 and the (5)C2 conformation. PMID:27152632

  15. Structural violence in long-term, residential care for older people: Comparing Canada and Scandinavia

    PubMed Central

    Banerjee, Albert; Daly, Tamara; Armstrong, Pat; Szebehely, Marta; Armstrong, Hugh; LaFrance, Stirling

    2014-01-01

    Canadian frontline careworkers are six times more likely to experience daily physical violence than their Scandinavian counterparts. This paper draws on a comparative survey of residential careworkers serving older people across three Canadian provinces (Manitoba, Nova Scotia, Ontario) and four countries that follow a Scandinavian model of social care (Denmark, Finland, Norway, Sweden) conducted between 2005 and 2006. Ninety percent of Canadian frontline careworkers experienced physical violence from residents or their relatives and 43 percent reported physical violence on a daily basis. Canadian focus groups conducted in 2007 reveal violence was often normalized as an inevitable part of elder-care. We use the concept of “structural violence” (Galtung, 1969) to raise questions about the role that systemic and organizational factors play in setting the context for violence. Structural violence refers to indirect forms of violence that are built into social structures and that prevent people from meeting their basic needs or fulfilling their potential. We applied the concept to long-term residential care and found that the poor quality of the working conditions and inadequate levels of support experienced by Canadian careworkers constitute a form of structural violence. Working conditions are detrimental to careworkers’ physical and mental health, and prevent careworkers from providing the quality of care they are capable of providing and understand to be part of their job. These conditions may also contribute to the violence workers experience, and further investigation is warranted. PMID:22204839

  16. Comparative metabolomics and structural characterizations illuminate colibactin pathway-dependent small molecules.

    PubMed

    Vizcaino, Maria I; Engel, Philipp; Trautman, Eric; Crawford, Jason M

    2014-07-01

    The gene cluster responsible for synthesis of the unknown molecule "colibactin" has been identified in mutualistic and pathogenic Escherichia coli. The pathway endows its producer with a long-term persistence phenotype in the human bowel, a probiotic activity used in the treatment of ulcerative colitis, and a carcinogenic activity under host inflammatory conditions. To date, functional small molecules from this pathway have not been reported. Here we implemented a comparative metabolomics and targeted structural network analyses approach to identify a catalog of small molecules dependent on the colibactin pathway from the meningitis isolate E. coli IHE3034 and the probiotic E. coli Nissle 1917. The structures of 10 pathway-dependent small molecules are proposed based on structural characterizations and network relationships. The network will provide a roadmap for the structural and functional elucidation of a variety of other small molecules encoded by the pathway. From the characterized small molecule set, in vitro bacterial growth inhibitory and mammalian CNS receptor antagonist activities are presented. PMID:24932672

  17. Sub Angstrom imaging of dislocation core structures: How well are experiments comparable with theory?

    SciTech Connect

    Kisielowski, C.; Freitag, B.; Xu, X.; Beckman, S.P.; Chrzan, D.C.

    2005-12-16

    During the past 50 years Transmission Electron Microscopy (TEM) has evolved from an imaging tool to a quantitative method that approaches the ultimate goal of understanding the atomic structure of materials atom by atom in three dimensions both experimentally and theoretically. Today's TEM abilities are tested in the special case of a Ga terminated 30 degree partial dislocation in GaAs:Be where it is shown that a combination of high-resolution phase contrast imaging, Scanning TEM, and local Electron Energy Loss Spectroscopy allows for a complete analysis of dislocation cores and associated stacking faults. We find that it is already possible to locate atom column positions with picometer precision in directly interpretable images of the projected crystal structure and that chemically different elements can already be identified together with their local electronic structure. In terms of theory, the experimental results can be quantitatively compared with ab initio electronic structure total energy calculations. By combining elasticity theory methods with atomic theory an equivalent crystal volume can be addressed. Therefore, it is already feasible to merge experiments and theory on a picometer length scale. While current experiments require the utilization of different, specialized instruments it is foreseeable that the rapid improvement of electron optical elements will soon generate a next generation of microscopes with the ability to image and analyze single atoms in one instrument with deep sub Angstrom spatial resolution and an energy resolution better than 100 meV.

  18. Comparative analysis of the friction stir welded aluminum-magnesium alloy joint grain structure

    NASA Astrophysics Data System (ADS)

    Zaikina, A. A.; Sizova, O. V.; Novitskaya, O. S.

    2015-10-01

    A comparative test of the friction stir welded aluminum-magnesium alloy joint microstructure for plates of different thickness was carried out. By finding the structuring regularities in the weld nugget zone, which is the strongest zone of the weld, the effects of temperature-deformational conditions on the promotion of a metal structure refinement mechanism under friction stir welding can be determined. In this research, friction stir welded rolled plates of an AMg5M alloy, 5 and 8 mm thick, were investigated. Images of the material fine structure in the nugget zone were used to identify and measure subgrains and to define the location of a second phase. Optical microscopy showed that a fine-grained structure developed in the nugget zone. The grain size was 5 μm regardless of the thickness of the plates. In the 5.0 mm thick sample the grains were coaxial, while in the 8.0 mm thick sample the grains were elongated at a certain angle to the tool travel direction.

  19. A comparative structure-function analysis of active-site inhibitors of Vibrio cholerae cholix toxin.

    PubMed

    Lugo, Miguel R; Merrill, A Rod

    2015-09-01

    Cholix toxin from Vibrio cholerae is a novel mono-ADP-ribosyltransferase (mART) toxin that shares structural and functional properties with Pseudomonas aeruginosa exotoxin A and Corynebacterium diphtheriae diphtheria toxin. Herein, we have used the high-resolution X-ray structure of full-length cholix toxin in the apo form, NAD(+) bound, and 10 structures of the cholix catalytic domain (C-domain) complexed with several strong inhibitors of toxin enzyme activity (NAP, PJ34, and the P-series) to study the binding mode of the ligands. A pharmacophore model based on the active pose of NAD(+) was compared with the active conformation of the inhibitors, which revealed a cationic feature in the side chain of the inhibitors that may determine the active pose. Moreover, a conformational search was conducted for the missing coordinates of one of the main active-site loops (R-loop). The resulting structural models were used to evaluate the interaction energies and for 3D-QSAR modeling. Implications for a rational drug design approach for mART toxins were derived. PMID:25756608

  20. Comparing posttraumatic stress disorder's symptom structure between deployed and nondeployed veterans.

    PubMed

    Engdahl, Ryan M; Elhai, Jon D; Richardson, J Don; Frueh, B Christopher

    2011-03-01

    We tested two empirically validated 4-factor models of posttraumatic stress disorder (PTSD) symptoms using the PTSD Checklist: King, Leskin, King, and Weathers' (1998) model including reexperiencing, avoidance, emotional numbing, and hyperarousal factors, and Simms, Watson, and Doebbeling's (2002) model including reexperiencing, avoidance, dysphoria, and hyperarousal. Our aim was to determine which fit better in two groups of military veterans: peacekeepers previously deployed to a war zone (deployed group) and those trained for peacekeeping operations who were not deployed (nondeployed group). We compared the groups using multigroup confirmatory factor analysis. Adequate model fit was demonstrated among the nondeployed group, with no significant difference between King et al.'s (1998) model (separating avoidance and numbing) and Simms et al.'s (2002) similar model involving a dysphoria factor. A better fitting factor structure consistent with Simms et al.'s (2002) model was found in the deployed group. Comprehensive measurement invariance testing demonstrated significant differences between the deployed and nondeployed groups on all structural parameters, except observed variable intercepts (thus indicating similarities only in PTSD item severity). These findings add to researchers' understanding of PTSD's factor structure, given the revision of PTSD that will appear in the forthcoming 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 2010)--namely, that the factor structure may be quite different between groups with and without exposure to major traumatic events. PMID:21171785

  1. Comparative study of two structures of shunt active filter suppressing particular harmonics

    NASA Astrophysics Data System (ADS)

    Benchaita, L.; Salem Nia, A.; Saadate, S.

    1998-07-01

    This paper deals with the study of shunt active filters used for suppressing particular harmonics generated by nonlinear loads in utility distribution power systems. Both structures of shunt active filter, voltage source active filter (VSAF) and current source active filter (CSAF), are considered. The analytical study of specific harmonics identification in a given spectrum is first presented. For simulation as well as experimentation the nonlinear load is a conventional three phase thyristor rectifier and harmonics 5 and 7 are selected to be eliminated by active filter. The whole system consisting of the ac power supply network, the SCR rectifier and the shunt active filter (VSAF/CSAF) is then simulated. The simulation results are discussed and the efficiency of the two kinds of active filter are compared. Finally, for the first structure, VSAF, the simulation results are confirmed by experimental test realized by means of a fully digital control active power filter developed in our laboratory.
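    To illustrate why harmonics 5 and 7 are natural targets for such a filter, the sketch below computes the harmonic content of an idealized six-pulse rectifier line current (a 120° quasi-square wave with perfectly smooth DC current and no commutation overlap). This textbook waveform is an assumption made for illustration; it is not the identification method or the load model used by the authors, and all function names are hypothetical.

```python
import math


def rectifier_current(theta):
    """Idealized six-pulse rectifier line current (per unit).

    Assumption: +1 for 120 degrees, 0 for 60, -1 for 120, 0 for 60.
    """
    t = theta % (2 * math.pi)
    if math.pi / 6 < t < 5 * math.pi / 6:
        return 1.0
    if 7 * math.pi / 6 < t < 11 * math.pi / 6:
        return -1.0
    return 0.0


def harmonic_amplitude(signal, order, samples=3600):
    """Amplitude of one harmonic, by projecting onto cos and sin of that order."""
    a = b = 0.0
    for k in range(samples):
        theta = 2 * math.pi * (k + 0.5) / samples  # midpoint sampling
        value = signal(theta)
        a += value * math.cos(order * theta)
        b += value * math.sin(order * theta)
    return 2 * math.hypot(a, b) / samples


fundamental = harmonic_amplitude(rectifier_current, 1)
spectrum = {
    h: harmonic_amplitude(rectifier_current, h) / fundamental
    for h in (3, 5, 7, 9, 11, 13)
}

for order, ratio in spectrum.items():
    print(f"harmonic {order:2d}: {100 * ratio:5.1f}% of fundamental")
```

    For this idealized waveform only the orders 6k ± 1 survive, each with amplitude 1/h of the fundamental, so the 5th and 7th are the two largest harmonic components.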

  2. [Casting faults and structural studies on bonded alloys comparing centrifugal castings and vacuum pressure castings].

    PubMed

    Fuchs, P; Küfmann, W

    1978-07-01

    The casting processes in use today, such as centrifugal casting and vacuum pressure casting, were compared with one another. An effort was made to answer the question of whether the occurrence of shrink cavities and the mean grain diameter of the alloy depend on the method of casting. 80 crowns were made by both processes from the baked alloys Degudent Universal, Degudent N and the trial alloy 4437 of the firm Degussa. Slice sections were examined for macro- and micro-porosity and the structural appearance was evaluated by linear analysis. Statistical analysis showed that casting faults and casting structure are independent of the method used and that their causes must be found in the conditions of casting and the composition of the alloy. PMID:352670

  3. A comparative investigation for the nondestructive testing of honeycomb structures by holographic interferometry and infrared thermography

    NASA Astrophysics Data System (ADS)

    Sfarra, S.; Ibarra-Castanedo, C.; Avdelidis, N. P.; Genest, M.; Bouchagier, L.; Kourousis, D.; Tsimogiannis, A.; Anastassopoulous, A.; Bendada, A.; Maldague, X.; Ambrosini, D.; Paoletti, D.

    2010-03-01

    The nondestructive testing (NDT) of honeycomb sandwich structures has been the subject of several studies. Classical techniques such as ultrasound testing and x-rays are commonly used to inspect these structures. Holographic interferometry (HI) and infrared thermography (IT) have been shown to be interesting alternatives. Holography has been successfully used to detect debonding between the skin and the honeycomb core on honeycomb panels under a controlled environment. Active thermography has proven to effectively identify the most common types of defects (water ingress, debonding, crushed core, surface impacts) normally present in aeronautical honeycomb parts while inspecting large surfaces in a fast manner. This is very attractive both for inspection during the manufacturing process and for regular in situ NDT assessment. A comparative experimental investigation is discussed herein to evaluate the performance of HI and IT for the NDT of a honeycomb panel with fabricated defects. The main advantages and limitations of both techniques are enumerated and discussed.

  4. A comparative study of the structure and cytotoxicity of polytetrafluoroethylene after ion etching and ion implantation

    NASA Astrophysics Data System (ADS)

    Shtansky, D. V.; Glushankova, N. A.; Kiryukhantsev-Korneev, F. V.; Sheveiko, A. N.; Sigarev, A. A.

    2011-03-01

    Ion-plasma treatment has been widely used for modifying the surface structure of polymers in order to improve their properties, but it can lead to destruction of the surface and, as a consequence, to an increase in their toxicity. A comparative study of the structure and cytotoxicity of polytetrafluoroethylene (PTFE) after ion etching (IE) and ion implantation (II) for 10 min with energy densities of 363 and 226 J/cm², respectively, has been performed. It has been shown that, unlike ion implantation, ion etching results in the destruction of the polymer and in the appearance of cytotoxicity. The factors responsible for this effect, which are associated with the bulk and surface treatment, as well as with the influence of the temperature, have been discussed.

  5. Comparative structure analysis of non-polar organic ferrofluids stabilized by saturated mono-carboxylic acids.

    PubMed

    Avdeev, M V; Bica, D; Vékás, L; Aksenov, V L; Feoktystov, A V; Marinica, O; Rosta, L; Garamus, V M; Willumeit, R

    2009-06-01

    The structure of ferrofluids (magnetite in decahydronaphthalene) stabilized with saturated mono-carboxylic acids of different chain lengths (lauric, myristic, palmitic and stearic acids) is studied by means of magnetization analysis and small-angle neutron scattering. It is shown that in the case of saturated acid surfactants, magnetite nanoparticles are dispersed in the carrier with approximately the same size distribution, whose mean value and width are significantly smaller than for the classical stabilization with non-saturated oleic acid. The thickness found for the surfactant shell around magnetite is analyzed with respect to the stabilizing properties of mono-carboxylic acids. PMID:19376524

  6. A comparative study of the inner ear structures of artiodactyls and early cetaceans

    SciTech Connect

    Klingshirn, M.A.; Luo, Z.

    1994-12-31

    It has been suggested that the order Cetacea (whales and porpoises) is closely related to artiodactyls, even-hoofed ungulate mammals such as the pig and cow. Paleontological and molecular data strongly support this concept of phylogenetic relationships. In a study of DNA sequences of two mitochondrial ribosomal gene segments of cetaceans, the artiodactyls were found to be the closest relatives of cetaceans. These well-accepted studies on the phylogenetic affinities of artiodactyls and cetaceans led us to conduct a comparative study of the bony structure of the inner ear of these two taxa.

  7. Comparing two iteration algorithms of Broyden electron density mixing through an atomic electronic structure computation

    NASA Astrophysics Data System (ADS)

    Man-Hong, Zhang

    2016-05-01

    By performing the electronic structure computation of a Si atom, we compare two iteration algorithms of Broyden electron density mixing in the literature. One was proposed by Johnson and implemented in the well-known VASP code. The other was given by Eyert. We solve the Kohn-Sham equation by using a conventional outward/inward integration of the differential equation and then connecting the two parts of the solution at the classical turning points, which is different from the matrix eigenvalue solution method used in the VASP code. Compared to Johnson’s algorithm, the one proposed by Eyert needs fewer total iterations. Project supported by the National Natural Science Foundation of China (Grant No. 61176080).

  8. Comparative analysis of the structure and function of adenovirus virus-associated RNAs.

    PubMed Central

    Ma, Y; Mathews, M B

    1993-01-01

    The protein kinase DAI is an important component of the interferon-induced cellular defense mechanism. In cells infected by adenovirus type 2 (Ad2), activation of the kinase is prevented by the synthesis of a small, highly ordered virus-associated (VA) RNA, VA RNAI. The inhibitory function of this RNA depends on its structure, which has been partially elucidated by a combination of mutagenesis and RNase sensitivity analysis. To gain further insight into the structure and function of this regulatory RNA, we have compared the primary sequences, secondary structures, and functions of seven VA RNA species from five human and animal adenoviruses. The sequences exhibit variable degrees of homology, with a particularly close relationship between the VA RNAII species of Ad2 and Ad7 and notably divergent sequence for the avian (CELO) virus VA RNA. Apart from two pairs of mutually complementary tetranucleotides which are highly conserved, homologies are limited to transcription signals located within the RNA sequence and at its termini. Secondary structure analysis indicated that all seven RNAs conform to the model in which VA RNA possesses three main structural regions, a terminal stem, an apical stem-loop, and a central domain, although these elements vary in size and other details. The apical stem is implicated in binding to DAI, and the central domain is essential for inhibition of DAI activation. One of the pairs of conserved tetranucleotides (CCGG:C/UCGG) provides further evidence for the existence of the apical stem, but the other conserved pair (GGGU:ACCC) strongly suggests a revised structure for the central domain. In two functional assays conducted in vivo, the VA RNAI species of Ad2 and Ad7 were the most active, their corresponding VA RNAII species displayed little activity, and the single VA RNAs of Ad12 and simian adenovirus type 7 exhibited intermediate activity. Correlation of the structural and functional data suggests that the VA RNAII species adopt a

  9. Structural investigation of the alpha-1-antichymotrypsin: prostate-specific antigen complex by comparative model building.

    PubMed Central

    Villoutreix, B. O.; Lilja, H.; Pettersson, K.; Lövgren, T.; Teleman, O.

    1996-01-01

    Prostate-specific antigen (PSA), produced by prostate cells, provides an excellent serum marker for prostate cancer. It belongs to the human kallikrein family of enzymes, a second prostate-derived member of which is human glandular kallikrein-1 (hK2). Active PSA and hK2 are both 237-residue kallikrein-like proteases, based on sequence homology. An hK2 model structure based on the serine protease fold is presented and compared to PSA and six other serine proteases in order to analyze in depth the role of the surface-accessible loops surrounding the active site. The results show that PSA and hK2 share extensive structural similarity and that most amino acid replacements are centered on the loops surrounding the active site. Furthermore, the electrostatic potential surfaces are very similar for PSA and hK2. PSA interacts with at least two serine protease inhibitors (serpins): alpha-1-antichymotrypsin (ACT) and protein C inhibitor (PCI). Three-dimensional model structures of the uncleaved ACT molecule were developed based upon the recent X-ray structure of uncleaved antithrombin. The serpin was docked both to PSA and hK2. Amino acid replacements and electrostatic complementarities indicate that the overall orientation of the proteins in these complexes is reasonable. In order to investigate PSA's heparin interaction sites, electrostatic computations were carried out on PSA, hK2, protein C, ACT, and PCI. Two heparin binding sites are suggested on the PSA surface and could explain the enhanced complex formation between PSA and PCI, while inhibiting the formation of the ACT-PSA complex. PSA, hK2, and their preliminary complexes with ACT should facilitate the understanding and prediction of structural and functional properties for these important proteins also with respect to prostate diseases. PMID:8732755

  10. Comparative Analysis of Data Structures for Storing Massive TINs in a DBMS

    NASA Astrophysics Data System (ADS)

    Kumar, K.; Ledoux, H.; Stoter, J.

    2016-06-01

    Point cloud data are an important source for 3D geoinformation. Modern day 3D data acquisition and processing techniques such as airborne laser scanning and multi-beam echosounding generate billions of 3D points for an area of just a few square kilometers. With the size of the point clouds exceeding the billion mark for even a small area, there is a need for their efficient storage and management. These point clouds are sometimes associated with attributes and constraints as well. Storing billions of 3D points is currently possible, as confirmed by the initial implementations in Oracle Spatial SDO_PC and the PostgreSQL Point Cloud extension. But to be able to analyse and extract useful information from point clouds, we need more than just points, i.e., we require the surface defined by these points in space. There are different ways to represent surfaces in GIS including grids, TINs, boundary representations, etc. In this study, we investigate the database solutions for the storage and management of massive TINs. The classical (face and edge based) and compact (star based) data structures are discussed at length with reference to their structure, advantages and limitations in handling massive triangulations and are compared with the current solution of PostGIS Simple Feature. The main test dataset is the TIN generated from the third national elevation model of the Netherlands (AHN3) with a point density of over 10 points/m². PostgreSQL/PostGIS DBMS is used for storing the generated TIN. The data structures are tested with the generated TIN models to account for their geometry, topology, storage, indexing, and loading time in a database. Our study is useful in identifying the limitations of the existing data structures for storing massive TINs and what is required to optimise these structures for managing massive triangulations in a database.

  11. Comparing the performance of different model structures with respect to different hydrological signatures

    NASA Astrophysics Data System (ADS)

    Euser, T.; Winsemius, H. C.; Hrachowitz, M.; Fenicia, F.; Savenije, H. H. G.

    2012-04-01

    Correctly representing the dominant flow generation processes in conceptual rainfall-runoff models is crucial for ensuring adequate predictive power of the models. Recent work showed that, on the small scale, uniqueness of place requires different model structures for different catchments and that different calibration strategies frequently result in a wide range of model parameter sets. In this study we investigate the following research questions: (1) What is the effect of different calibration objective functions on the model performance? (2) Can the difference in performance of specific objective functions be related to hydrological signatures and physical catchment characteristics? Data from four experimental (approx. 1000 km²) sub-catchments (Alzette, Kyll, Orne and Seille) of the Moselle were used in this study. Eleven conceptual model structures (HBV, GR4J and 9 SUPERFLEX (flexible) model structures) of varying levels of complexity are applied to each of the four study catchments. Besides classical objective functions (e.g. Nash-Sutcliffe efficiency), additional objective functions are defined based on several hydrological signatures, such as the flow duration curve, rising limb density and auto-correlation. A multi-objective optimization is performed on all the objective functions for each catchment and each model structure considered. The results of the multi-objective optimization are then compared using Principal Component Analysis in order to identify the causes for differences in performance in the objective functions and relate these to physical catchment characteristics such as elevation, shape of the catchment and the height distribution above the nearest drain within a catchment. If such relationships are found, they can help to identify a priori suitable model structures and hydrological signatures in a catchment, given its spatial scale and physical characteristics.
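    The Nash-Sutcliffe efficiency named above as a classical objective function can be sketched as follows. This is the standard textbook definition, one minus the ratio of the squared simulation error to the variance of the observations about their mean; the function name and the discharge values are hypothetical and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1 - residual / variance


# Hypothetical discharge series, for illustration only.
observed = [1.0, 3.0, 5.0, 3.0, 1.0]
mean_flow = sum(observed) / len(observed)

perfect = nash_sutcliffe(observed, observed)
mean_model = nash_sutcliffe(observed, [mean_flow] * len(observed))
biased = nash_sutcliffe(observed, [q + 1.0 for q in observed])

print(perfect, mean_model, biased)
```

    A model that always predicts the observed mean scores 0, so negative values indicate a simulation that performs worse than that trivial baseline. Signature-based objective functions such as those built on the flow duration curve would be defined separately.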

  12. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics

    PubMed Central

    2013-01-01

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users’ requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, ‘canvas sort’ card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience. PMID:23514033

  13. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics.

    PubMed

    de Matos, Paula; Cham, Jennifer A; Cao, Hong; Alcántara, Rafael; Rowland, Francis; Lopez, Rodrigo; Steinbeck, Christoph

    2013-01-01

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users' requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, 'canvas sort' card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience. PMID:23514033

  14. Biclustering Bioinformatics Data Sets: A Possibilistic Approach

    NASA Astrophysics Data System (ADS)

    Masulli, Francesco

    2007-12-01

    The analysis of genomic data from DNA microarrays can produce valuable information on the biological relevance of genes and on correlations among them. In the last few years some biclustering techniques have been proposed and applied to this analysis. Biclustering is a learning task for finding clusters of samples possessing similar characteristics together with features creating these similarities. When applied to genomic data it can allow us to identify genes with similar behavior with respect to different conditions. In this paper a new approach to the biclustering problem will be introduced extending the Possibilistic Clustering paradigm. The proposed Possibilistic Biclustering algorithm finds one bicluster at a time, assigning a membership to the bicluster for each gene and for each condition. Some results on oligonucleotide microarray data sets will be presented and compared with those obtained using other biclustering methods.

  15. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    NASA Astrophysics Data System (ADS)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  16. Physcomitrella HMGA-type proteins display structural differences compared to their higher plant counterparts

    SciTech Connect

    Lyngaard, Carina; Stemmer, Christian; Stensballe, Allan; Graf, Manuela; Gorr, Gilbert; Decker, Eva; Grasser, Klaus D.

    2008-10-03

    High mobility group (HMG) proteins of the HMGA family are chromatin-associated proteins that act as architectural factors in nucleoprotein structures involved in gene transcription. To date, HMGA-type proteins have been studied in various higher plant species, but not in lower plants. We have identified two HMGA-type proteins, HMGA1 and HMGA2, encoded in the genome of the moss model Physcomitrella patens. Compared to higher plant HMGA proteins, the two Physcomitrella proteins display some structural differences. Thus, the moss HMGA proteins have six (rather than four) AT-hook DNA-binding motifs and their N-terminal domain lacks similarity to linker histone H1. HMGA2 is expressed in moss protonema and it localises to the cell nucleus. Typical of HMGA proteins, HMGA2 interacts preferentially with A/T-rich DNA, when compared with G/C-rich DNA. In cotransformation assays in Physcomitrella protoplasts, HMGA2 stimulated reporter gene expression. In summary, our data show that functional HMGA-type proteins occur in Physcomitrella.

  17. Comparative study of local structure of two cyanobiphenyl liquid crystals by molecular dynamics method

    SciTech Connect

    Gerts, Egor D.; Komolkin, Andrei V.; Burmistrov, Vladimir A.; Alexandriysky, Victor V.; Dvinskikh, Sergey V.

    2014-08-21

    Fully-atomistic molecular dynamics simulations were carried out on two similar cyanobiphenyl nematogens, HO-6OCB and 7OCB, in order to study effects of hydrogen bonds on local structure of liquid crystals. Comparable length of these two molecules provides more evident results on the effects of hydrogen bonding. The analysis of radial and cylindrical distribution functions clearly shows the differences in local structure of two mesogens. The simulations showed that anti-parallel alignment is preferable for the HO-6OCB. Hydrogen bonds between OH-groups are observed for 51% of HO-6OCB molecules, while hydrogen bonding between CN- and OH-groups occurs only for 16% of molecules. The lifetimes of H-bonds differ due to different mobility of molecular fragments (50 ps for N⋅⋅⋅H–O and 41 ps for O⋅⋅⋅H–O). Although the standard Optimized Potentials for Liquid Simulations - All-Atom force field cannot reproduce some experimental parameters quantitatively (order parameters are overestimated, diffusion coefficients are not reproduced well), the comparison of relative simulated results for the pair of mesogens is nevertheless consistent with the same relative experimental parameters. Thus, the comparative study of simulated and experimental results for the pair of similar liquid crystals still can be assumed plausible.

  18. The leader peptide of mutacin 1140 has distinct structural components compared to related class I lantibiotics

    PubMed Central

    Escano, Jerome; Stauffer, Byron; Brennan, Jacob; Bullock, Monica; Smith, Leif

    2014-01-01

    Lantibiotics are ribosomally synthesized peptide antibiotics composed of an N-terminal leader peptide that promotes the core peptide's interaction with the post translational modification (PTM) enzymes. Following PTMs, mutacin 1140 is transported out of the cell and the leader peptide is cleaved to yield the antibacterial peptide. Mutacin 1140 leader peptide is structurally unique compared to other class I lantibiotic leader peptides. Herein, we further our understanding of the structural differences of mutacin 1140 leader peptide with regard to other class I leader peptides. We have determined that the length of the leader peptide is important for the biosynthesis of mutacin 1140. We have also determined that mutacin 1140 leader peptide contains a novel four amino acid motif compared to related lantibiotics. PTM enzyme recognition of the leader peptide appears to be evolutionarily distinct from related class I lantibiotics. Our study on mutacin 1140 leader peptide provides a basis for future studies aimed at understanding its interaction with the PTM enzymes. PMID:25400246

  19. A graph-theoretic algorithm for comparative modeling of protein structure.

    PubMed

    Samudrala, R; Moult, J

    1998-05-29

    The interconnected nature of interactions in protein structures appears to be the major hurdle in preventing the construction of accurate comparative models. We present an algorithm that uses graph theory to handle this problem. Each possible conformation of a residue in an amino acid sequence is represented using the notion of a node in a graph. Each node is given a weight based on the degree of the interaction between its side-chain atoms and the local main-chain atoms. Edges are then drawn between pairs of residue conformations/nodes that are consistent with each other (i.e. clash-free and satisfying geometrical constraints). The edges are weighted based on the interactions between the atoms of the two nodes. Once the entire graph is constructed, all the maximal sets of completely connected nodes (cliques) are found using a clique-finding algorithm. The cliques with the best weights represent the optimal combinations of the various main-chain and side-chain possibilities, taking the respective environments into account. The algorithm is used in a comparative modeling scenario to build side-chains, regions of main chain, and mix and match between different homologs in a context-sensitive manner. The predictive power of this method is assessed by applying it to cases where the experimental structure is not known in advance. PMID:9636717
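
    The clique-based selection described in this abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not name the clique-finding algorithm, so basic Bron-Kerbosch is swapped in here, and the node labels, weights and the "higher score is better" convention are all hypothetical. Nodes stand for residue conformations (for example "A1" is conformation 1 of residue A), and an edge exists only between mutually consistent conformations of different residues.

```python
from itertools import combinations

# Hypothetical toy data: node weights (conformation vs. local main chain)
# and edge weights (interaction between two consistent conformations).
NODE_WEIGHT = {"A1": 1.0, "A2": 0.5, "B1": 1.0, "B2": 0.5, "C1": 1.0}
EDGE_WEIGHT = {
    frozenset(("A1", "B1")): 1.0,
    frozenset(("A1", "C1")): 0.5,
    frozenset(("B1", "C1")): 0.5,
    frozenset(("A2", "B2")): 2.0,
    frozenset(("A2", "C1")): 0.2,
}


def build_adjacency(node_weight, edge_weight):
    """Turn the weighted edge list into a node -> neighbours mapping."""
    adjacency = {node: set() for node in node_weight}
    for pair in edge_weight:
        a, b = tuple(pair)
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def maximal_cliques(adjacency):
    """Yield every maximal clique using basic Bron-Kerbosch (no pivoting)."""

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield frozenset(clique)
            return
        for node in list(candidates):
            neighbours = adjacency[node]
            yield from expand(clique | {node},
                              candidates & neighbours,
                              excluded & neighbours)
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(adjacency), set())


def clique_score(clique, node_weight, edge_weight):
    """Sum the node weights and all pairwise edge weights inside a clique."""
    nodes = sum(node_weight[n] for n in clique)
    edges = sum(edge_weight[frozenset(pair)]
                for pair in combinations(sorted(clique), 2))
    return nodes + edges


def best_clique(node_weight, edge_weight):
    """Return the maximal clique with the highest combined weight."""
    adjacency = build_adjacency(node_weight, edge_weight)
    return max(maximal_cliques(adjacency),
               key=lambda c: clique_score(c, node_weight, edge_weight))


if __name__ == "__main__":
    winner = best_clique(NODE_WEIGHT, EDGE_WEIGHT)
    print(sorted(winner), clique_score(winner, NODE_WEIGHT, EDGE_WEIGHT))
```

    Because two conformations of the same residue are never joined by an edge, every clique contains at most one conformation per residue, which is what makes a clique a self-consistent combination of choices.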

  20. Meeting Review: 2002 O'Reilly Bioinformatics Technology Conference

    PubMed Central

    2002-01-01

    At the end of January I travelled to the States to speak at and attend the first O’Reilly Bioinformatics Technology Conference [14]. It was a large, well-organized and diverse meeting with an interesting history. Although the meeting was not a typical academic conference, its style will, I am sure, become more typical of meetings in both biological and computational sciences. Speakers at the event included prominent bioinformatics researchers such as Ewan Birney, Terry Gaasterland and Lincoln Stein; authors and leaders in the open source programming community like Damian Conway and Nat Torkington; and representatives from several publishing companies including the Nature Publishing Group, Current Science Group and the President of O’Reilly himself, Tim O’Reilly. There were presentations, tutorials, debates, quizzes and even a ‘jam session’ for musical bioinformaticists. PMID:18628852

  1. A review of estimation of distribution algorithms in bioinformatics

    PubMed Central

    Armañanzas, Rubén; Inza, Iñaki; Santana, Roberto; Saeys, Yvan; Flores, Jose Luis; Lozano, Jose Antonio; Peer, Yves Van de; Blanco, Rosa; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro

    2008-01-01

    Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. PMID:18822112
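
    The core EDA loop described in this abstract (sample from a probabilistic model, select promising solutions, re-learn the model) can be sketched with one of the simplest variants, the univariate marginal distribution algorithm (UMDA). The Python example below is an illustrative assumption, not code from the review: the OneMax objective (count of 1-bits), the parameter values and the function name are all chosen only for demonstration.

```python
import random


def umda_onemax(n_bits=20, pop_size=60, n_select=20, generations=40, seed=0):
    """Maximise the number of 1-bits with a univariate EDA (UMDA)."""
    rng = random.Random(seed)
    # Probabilistic model: one independent Bernoulli parameter per bit.
    probs = [0.5] * n_bits
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # 2. Keep the most promising solutions (truncation selection).
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(population[0]) > sum(best):
            best = population[0]
        # 3. Re-estimate the marginals from the selected solutions,
        #    clamped so that no bit value becomes impossible to sample.
        probs = [
            min(max(sum(ind[i] for ind in selected) / n_select, 0.05), 0.95)
            for i in range(n_bits)
        ]
    return best


if __name__ == "__main__":
    solution = umda_onemax()
    print(solution, sum(solution))
```

    Unlike a genetic algorithm, there is no crossover or mutation operator here: all variation comes from sampling the learnt model. More elaborate EDAs replace the independent marginals with models that capture dependencies between variables.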

  2. Rise and Demise of Bioinformatics? Promise and Progress

    PubMed Central

    Ouzounis, Christos A.

    2012-01-01

    The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself. PMID:22570600

  3. A survey on evolutionary algorithm based hybrid intelligence in bioinformatics.

    PubMed

    Li, Shan; Kang, Liying; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a major challenge for bioinformaticians to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, hybrid intelligent methods, which integrate several standard intelligent approaches, have become increasingly popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we introduce the applications of hybrid intelligent methods in bioinformatics, especially those based on evolutionary algorithms. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks. PMID:24729969

  4. Personalized medicine: challenges and opportunities for translational bioinformatics

    PubMed Central

    Overby, Casey Lynnette; Tarczy-Hornoch, Peter

    2013-01-01

    Personalized medicine can be defined broadly as a model of healthcare that is predictive, personalized, preventive and participatory. Two US President’s Council of Advisors on Science and Technology reports illustrate challenges in personalized medicine (in a 2008 report) and in use of health information technology (in a 2010 report). Translational bioinformatics is a field that can help address these challenges and is defined by the American Medical Informatics Association as “the development of storage, analytic and interpretive methods to optimize the transformation of increasing voluminous biomedical data into proactive, predictive, preventative and participatory health.” This article discusses barriers to implementing genomics applications and current progress toward overcoming barriers, describes lessons learned from early experiences of institutions engaged in personalized medicine and provides example areas for translational bioinformatics research inquiry. PMID:24039624

  5. Bioinformatics tools for small genomes, such as hepatitis B virus.

    PubMed

    Bell, Trevor G; Kramvis, Anna

    2015-02-01

    DNA sequence analysis is undertaken in many biological research laboratories. The workflow consists of several steps involving the bioinformatic processing of biological data. We have developed a suite of web-based online bioinformatic tools to assist with processing, analysis and curation of DNA sequence data. Most of these tools are genome-agnostic, with two tools specifically designed for hepatitis B virus sequence data. Tools in the suite are able to process sequence data from Sanger sequencing, ultra-deep amplicon resequencing (pyrosequencing) and chromatograph (trace files), as appropriate. The tools are available online at no cost and are aimed at researchers without specialist technical computer knowledge. The tools can be accessed at http://hvdr.bioinf.wits.ac.za/SmallGenomeTools, and the source code is available online at https://github.com/DrTrevorBell/SmallGenomeTools. PMID:25690798

  6. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    PubMed Central

    Li, Shan; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a major challenge for bioinformaticians to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, hybrid intelligent methods, which integrate several standard intelligent approaches, have become increasingly popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we introduce the applications of hybrid intelligent methods in bioinformatics, especially those based on evolutionary algorithms. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks. PMID:24729969

  7. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, and on-demand anytime and anywhere access to resources and applications, and thus it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patients' data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patients' data. PMID:25863787

  8. Comprehensive bioinformatics analysis of cell-free protein synthesis: identification of multiple protein properties that correlate with successful expression.

    PubMed

    Kurotani, Atsushi; Takagi, Tetsuo; Toyama, Mitsutoshi; Shirouzu, Mikako; Yokoyama, Shigeyuki; Fukami, Yasuo; Tokmakov, Alexander A

    2010-04-01

    High-throughput cell-free protein synthesis is being used increasingly in structural/functional genomics projects. However, the factors determining expression success are poorly understood. Here, we evaluated the expression of 3066 human proteins and their domains in a bacterial cell-free system and analyzed the correlation of protein expression with 39 physicochemical and structural properties of proteins. As a result of the bioinformatics analysis performed, we determined the 18 most influential features that affect protein amenability to cell-free expression. They include protein length; hydrophobicity; pI; content of charged, nonpolar, and aromatic residues; cysteine content; solvent accessibility; presence of coiled coil; content of intrinsically disordered and structured (alpha-helix and beta-sheet) sequence; number of disulfide bonds and functional domains; presence of transmembrane regions; PEST motifs; and signaling sequences. This study represents the first comprehensive bioinformatics analysis of heterologous protein synthesis in a cell-free system. The rules and correlations revealed here provide a plethora of important insights into rationalization of cell-free protein production and can be of practical use for protein engineering with the aim of increasing expression success. PMID:19940260

  9. Bioinformatics for precision medicine in oncology: principles and application to the SHIVA clinical trial

    PubMed Central

    Servant, Nicolas; Roméjon, Julien; Gestraud, Pierre; La Rosa, Philippe; Lucotte, Georges; Lair, Séverine; Bernard, Virginie; Zeitouni, Bruno; Coffin, Fanny; Jules-Clément, Gérôme; Yvon, Florent; Lermine, Alban; Poullet, Patrick; Liva, Stéphane; Pook, Stuart; Popova, Tatiana; Barette, Camille; Prud’homme, François; Dick, Jean-Gabriel; Kamal, Maud; Le Tourneau, Christophe; Barillot, Emmanuel; Hupé, Philippe

    2014-01-01

    Precision medicine (PM) requires the delivery of individually adapted medical care based on the genetic characteristics of each patient and his/her tumor. The last decade witnessed the development of high-throughput technologies such as microarrays and next-generation sequencing which paved the way to PM in the field of oncology. While the cost of these technologies decreases, we are facing an exponential increase in the amount of data produced. Our ability to use this information in daily practice relies strongly on the availability of an efficient bioinformatics system that assists in the translation of knowledge from the bench towards molecular targeting and diagnosis. Clinical trials and routine diagnoses constitute different approaches, both requiring a strong bioinformatics environment capable of (i) warranting the integration and the traceability of data, (ii) ensuring the correct processing and analyses of genomic data, and (iii) applying well-defined and reproducible procedures for workflow management and decision-making. To address the issues, a seamless information system was developed at Institut Curie which facilitates the data integration and tracks in real-time the processing of individual samples. Moreover, computational pipelines were developed to identify reliably genomic alterations and mutations from the molecular profiles of each patient. After a rigorous quality control, a meaningful report is delivered to the clinicians and biologists for the therapeutic decision. The complete bioinformatics environment and the key points of its implementation are presented in the context of the SHIVA clinical trial, a multicentric randomized phase II trial comparing targeted therapy based on tumor molecular profiling versus conventional therapy in patients with refractory cancer. The numerous challenges faced in practice during the setting up and the conduct of this trial are discussed as an illustration of PM application. PMID:24910641

  10. Comparative Evaluation for Brain Structural Connectivity Approaches: Towards Integrative Neuroinformatics Tool for Epilepsy Clinical Research.

    PubMed

    Yang, Sheng; Tatsuoka, Curtis; Ghosh, Kaushik; Lacuey-Lecumberri, Nuria; Lhatoo, Samden D; Sahoo, Satya S

    2016-01-01

    Recent advances in brain fiber tractography algorithms and diffusion Magnetic Resonance Imaging (MRI) data collection techniques are providing new approaches to study brain white matter connectivity, which plays an important role in complex neurological disorders such as epilepsy. Epilepsy affects approximately 50 million persons worldwide and it is often described as a disorder of the cortical network organization. There is growing recognition of the need to better understand the role of brain structural networks in the onset and propagation of seizures in epilepsy using high resolution non-invasive imaging technologies. In this paper, we perform a comparative evaluation of two techniques to compute structural connectivity, namely probabilistic fiber tractography and statistics derived from fractional anisotropy (FA), using diffusion MRI data from a patient with a rare case of medically intractable insular epilepsy. The results of our evaluation demonstrate that probabilistic fiber tractography provides a more accurate map of structural connectivity and may help address inherent complexities of neural fiber layout in the brain, such as fiber crossings. This work provides an initial result towards building an integrative informatics tool for neuroscience that can be used to accurately characterize the role of fiber tract connectivity in neurological disorders such as epilepsy. PMID:27570685
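
    Fractional anisotropy, one of the two quantities compared in this abstract, is conventionally computed per voxel from the three eigenvalues of the diffusion tensor as FA = sqrt(3/2) * sqrt(sum((l_i - mean)^2)) / sqrt(sum(l_i^2)). The short Python sketch below illustrates that standard formula only; the function name and example eigenvalues are assumptions, and the abstract does not describe how the authors derived their FA statistics.

```python
import math


def fractional_anisotropy(l1: float, l2: float, l3: float) -> float:
    """Return FA in [0, 1] from the three diffusion tensor eigenvalues."""
    mean = (l1 + l2 + l3) / 3.0
    # Squared deviation of the eigenvalues from their mean (anisotropic part).
    deviation = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    # Squared magnitude of the eigenvalues (overall diffusivity).
    magnitude = l1 ** 2 + l2 ** 2 + l3 ** 2
    if magnitude == 0:
        return 0.0  # no diffusion signal; treat the voxel as isotropic
    return math.sqrt(1.5 * deviation / magnitude)


if __name__ == "__main__":
    print(fractional_anisotropy(1.0, 1.0, 1.0))  # isotropic diffusion
    print(fractional_anisotropy(1.0, 0.0, 0.0))  # diffusion along one axis only
```

    FA is 0 for perfectly isotropic diffusion and approaches 1 when diffusion is confined to a single direction, which is why it is used as a scalar proxy for fiber coherence.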

  11. Comparative Evaluation for Brain Structural Connectivity Approaches: Towards Integrative Neuroinformatics Tool for Epilepsy Clinical Research

    PubMed Central

    Yang, Sheng; Tatsuoka, Curtis; Ghosh, Kaushik; Lacuey-Lecumberri, Nuria; Lhatoo, Samden D.; Sahoo, Satya S.

    2016-01-01

    Recent advances in brain fiber tractography algorithms and diffusion Magnetic Resonance Imaging (MRI) data collection techniques are providing new approaches to study brain white matter connectivity, which plays an important role in complex neurological disorders such as epilepsy. Epilepsy affects approximately 50 million persons worldwide and it is often described as a disorder of the cortical network organization. There is growing recognition of the need to better understand the role of brain structural networks in the onset and propagation of seizures in epilepsy using high resolution non-invasive imaging technologies. In this paper, we perform a comparative evaluation of two techniques to compute structural connectivity, namely probabilistic fiber tractography and statistics derived from fractional anisotropy (FA), using diffusion MRI data from a patient with a rare case of medically intractable insular epilepsy. The results of our evaluation demonstrate that probabilistic fiber tractography provides a more accurate map of structural connectivity and may help address inherent complexities of neural fiber layout in the brain, such as fiber crossings. This work provides an initial result towards building an integrative informatics tool for neuroscience that can be used to accurately characterize the role of fiber tract connectivity in neurological disorders such as epilepsy. PMID:27570685

  12. Comparative Structural and Functional Analysis of Bunyavirus and Arenavirus Cap-Snatching Endonucleases

    PubMed Central

    Reguera, Juan; Gerlach, Piotr; Rosenthal, Maria; Gaudon, Stephanie; Coscia, Francesca; Günther, Stephan; Cusack, Stephen

    2016-01-01

    Segmented negative strand RNA viruses of the arena-, bunya- and orthomyxovirus families uniquely carry out viral mRNA transcription by the cap-snatching mechanism. This involves cleavage of host mRNAs close to their capped 5′ end by an endonuclease (EN) domain located in the N-terminal region of the viral polymerase. We present the structure of the cap-snatching EN of Hantaan virus, a bunyavirus belonging to the hantavirus genus. Hantaan EN has an active site configuration, including a metal co-ordinating histidine, and nuclease activity similar to the previously reported La Crosse virus and influenza virus ENs (orthobunyavirus and orthomyxovirus respectively), but is more active in cleaving a double stranded RNA substrate. In contrast, Lassa arenavirus EN has only acidic metal co-ordinating residues. We present three high resolution structures of Lassa virus EN with different bound ion configurations and show in comparative biophysical and biochemical experiments with Hantaan, La Crosse and influenza ENs that the isolated Lassa EN is essentially inactive. The results are discussed in the light of EN activation mechanisms revealed by recent structures of full-length influenza virus polymerase. PMID:27304209

  13. Comparative Structural and Functional Analysis of Bunyavirus and Arenavirus Cap-Snatching Endonucleases.

    PubMed

    Reguera, Juan; Gerlach, Piotr; Rosenthal, Maria; Gaudon, Stephanie; Coscia, Francesca; Günther, Stephan; Cusack, Stephen

    2016-06-01

    Segmented negative strand RNA viruses of the arena-, bunya- and orthomyxovirus families uniquely carry out viral mRNA transcription by the cap-snatching mechanism. This involves cleavage of host mRNAs close to their capped 5' end by an endonuclease (EN) domain located in the N-terminal region of the viral polymerase. We present the structure of the cap-snatching EN of Hantaan virus, a bunyavirus belonging to the hantavirus genus. Hantaan EN has an active site configuration, including a metal co-ordinating histidine, and nuclease activity similar to the previously reported La Crosse virus and influenza virus ENs (orthobunyavirus and orthomyxovirus respectively), but is more active in cleaving a double stranded RNA substrate. In contrast, Lassa arenavirus EN has only acidic metal co-ordinating residues. We present three high resolution structures of Lassa virus EN with different bound ion configurations and show in comparative biophysical and biochemical experiments with Hantaan, La Crosse and influenza ENs that the isolated Lassa EN is essentially inactive. The results are discussed in the light of EN activation mechanisms revealed by recent structures of full-length influenza virus polymerase. PMID:27304209

  14. Functional silicene and stanene nanoribbons compared to graphene: electronic structure and transport

    NASA Astrophysics Data System (ADS)

    van den Broek, B.; Houssa, M.; Iordanidou, K.; Pourtois, G.; Afanas'ev, V. V.; Stesmans, A.

    2016-03-01

    Since the advent of graphene, other 2D materials have garnered interest; notably the single element materials silicene, germanene, and stanene. We investigate the ballistic current-voltage (I-V) characteristics of armchair silicene and stanene nanoribbons (AXNRs with X = Si, Sn) using a combination of density functional theory and non-equilibrium Green’s functions. The impact of out-of-plane electric field and in-plane uniaxial strain on the ribbon geometries, electronic structure, and I-V characteristics is considered and contrasted with graphene. Since silicene and stanene are sp²/sp³ buckled layers, the electronic structure can be tuned by an electric field that breaks the sublattice symmetry, an effect absent in graphene. This decreases the current by ~50% for Sn, since it has the largest buckling. Uniaxial straining of the ballistic channel affects the AXNR electronic structure in multiple ways: it changes the bandgap and associated effective carrier mass, and creates a local buckling distortion at the lead-channel interface which induces an interface dipole. Due to the increasing sp³ hybridization character with increasing element mass, large reconstructions rectify the strained systems, an effect absent in sp² bonded graphene. This results in a smaller strain effect on the current: a decrease of 20% for Sn at 15% tensile strain compared to a ~75% decrease for C.

  15. A Quick Guide for Building a Successful Bioinformatics Community

    PubMed Central

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-01-01

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  16. A quick guide for building a successful bioinformatics community.

    PubMed

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-02-01

    "Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  17. Integration of bioinformatics into an undergraduate biology curriculum and the impact on development of mathematical skills.

    PubMed

    Wightman, Bruce; Hark, Amy T

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative. PMID:22987552

  18. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    PubMed Central

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  19. Bioinformatics: Current practice and future challenges for life science education.

    PubMed

    Hack, Catherine; Kendall, Gary

    2005-03-01

    It is widely predicted that the application of high-throughput technologies to the quantification and identification of biological molecules will cause a paradigm shift in the life sciences. However, if the biosciences are to evolve from a predominantly descriptive discipline to an information science, practitioners will require enhanced skills in mathematics, computing, and statistical analysis. Universities have responded to the widely perceived skills gap primarily by developing masters programs in bioinformatics, resulting in a rapid expansion in the provision of postgraduate bioinformatics education. There is, however, a clear need to improve the quantitative and analytical skills of life science undergraduates. This article reviews the response of academia in the United Kingdom and proposes the learning outcomes that graduates should achieve to cope with the new biology. While the analysis discussed here uses the development of bioinformatics education in the United Kingdom as an illustrative example, it is hoped that the issues raised will resonate with all those involved in curriculum development in the life sciences. PMID:21638550

  20. Using Grid technology for computationally intensive applied bioinformatics analyses.

    PubMed

    Andrade, Jorge; Berglund, Lisa; Uhlén, Mathias; Odeberg, Jacob

    2006-01-01

    For several applications and algorithms used in applied bioinformatics, a bottleneck in terms of computational time may arise when they are scaled up to facilitate analyses of large datasets and databases. Re-codification, algorithm modification or sacrifices in sensitivity and accuracy may be necessary to accommodate the limited computational capacity of single workstations. Grid computing offers an alternative model for solving massive computational problems by parallel execution of existing algorithms and software implementations. We present the implementation of a Grid-aware model for solving computationally intensive bioinformatic analyses exemplified by a blastp sliding window algorithm for whole proteome sequence similarity analysis, and evaluate the performance in comparison with a local cluster and a single workstation. Our strategy involves temporary installations of the BLAST executable and databases on remote nodes at submission, accommodating dynamic Grid environments as it avoids the need for predefined runtime environments (preinstalled software and databases at specific Grid-nodes). Importantly, the implementation is generic: the BLAST executable can be replaced by other software tools to facilitate analyses suitable for parallelisation. This model should be of general interest in applied bioinformatics. Scripts and procedures are freely available from the authors. PMID:17518760
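
    As a rough illustration of the sliding-window idea mentioned in this abstract, the sketch below cuts a protein sequence into overlapping fragments and distributes them round-robin over a fixed number of independent jobs. It is not the authors' implementation: the function names, the window and step sizes, the example sequence, and the number of jobs are all assumptions chosen for illustration, and no BLAST call is made.

```python
def sliding_windows(sequence, window, step):
    """Yield (start, fragment) pairs for overlapping windows over a sequence.

    Hypothetical helper: 'window' and 'step' are illustrative parameters.
    Residues after the last full window are not covered if the step does
    not line up with the end of the sequence.
    """
    last_start = max(len(sequence) - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, sequence[start:start + window]


def split_into_jobs(items, n_jobs):
    """Distribute items round-robin over n_jobs lists (one list per job)."""
    jobs = [[] for _ in range(n_jobs)]
    for i, item in enumerate(items):
        jobs[i % n_jobs].append(item)
    return jobs


# Made-up 22-residue sequence, used only to exercise the helpers.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
windows = list(sliding_windows(protein, window=10, step=4))
jobs = split_into_jobs(windows, n_jobs=2)

for job_id, fragments in enumerate(jobs):
    print(job_id, [start for start, _ in fragments])
```

    Each job list could then be handed to a separate worker or node, each running its own similarity search on its fragments; because the fragments are independent, this kind of workload parallelises without changes to the underlying search tool.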

  1. Best practices in bioinformatics training for life scientists.

    PubMed

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  2. SNPTrack™: an integrated bioinformatics system for genetic association studies.

    PubMed

    Xu, Joshua; Kelly, Reagan; Zhou, Guangxu; Turner, Steven A; Ding, Don; Harris, Stephen C; Hong, Huixiao; Fang, Hong; Tong, Weida

    2012-01-01

    A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm. PMID:23245293

  3. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    PubMed

    Attwood, Teresa K; Atwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  4. A primer to frequent itemset mining for bioinformatics.

    PubMed

    Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart; Laukens, Kris

    2015-03-01

    Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences. PMID:24162173
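
    The shopping-basket example in this abstract can be made concrete with a small sketch. The code below counts every itemset up to a given size by brute-force enumeration and keeps those that reach a minimum support; it is a deliberately naive stand-in, not one of the optimised algorithms the primer surveys, and the baskets, function names, and thresholds are invented for illustration. The confidence of an association rule A -> B is computed as support(A and B together) / support(A).

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Brute-force enumeration of all subsets up to max_size; itemsets are
    represented as sorted tuples so they can be used as dictionary keys.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def confidence(itemsets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from support counts."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    return itemsets[both] / itemsets[tuple(sorted(antecedent))]


# Invented supermarket baskets, used only to exercise the functions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
result = frequent_itemsets(baskets, min_support=2)
rule_conf = confidence(result, ("bread",), ("milk",))
print(result)
print(rule_conf)
```

    In a biological setting the "baskets" might instead be, for example, sets of annotations attached to each gene, with frequent itemsets pointing at annotations that tend to co-occur; that mapping is an assumption made here for illustration.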

  5. Best practices in bioinformatics training for life scientists

    PubMed Central

    Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D.; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L.; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C.; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K.

    2013-01-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  6. A primer to frequent itemset mining for bioinformatics

    PubMed Central

    Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart

    2015-01-01

    Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences. PMID:24162173

  7. Cloning, Nucleotide Sequencing and Bioinformatics Study of NcSRS2 Gene, an Immunogen from Iranian Isolate of Neospora caninum

    PubMed Central

    Soltani, M; Sadrebazzaz, A; Nassiri, M; Tahmoorespoor, M

    2013-01-01

    Background: Neosporosis is caused by the obligate intracellular parasitic protozoan Neospora caninum, which infects a variety of hosts. NcSRS2 is an immunodominant antigen of N. caninum that is considered one of the most promising targets for a recombinant or DNA vaccine against neosporosis. As no study had been carried out to identify the molecular structure of N. caninum in Iran, as a first step we prepared a scheme to identify this gene in the parasite in Iran. Methods: Tachyzoite total RNA was extracted, cDNA was synthesized, and the NcSRS2 gene was amplified using the cDNA as template. The PCR product was then cloned into the pTZ57R/T vector and transformed into E. coli (DH5α strain). Finally, the recombinant plasmid was extracted from the transformed E. coli and sequenced. Bioinformatics analysis was also carried out. Results: The PCR product of the NcSRS2 gene was sequenced and recorded in GenBank. The deduced amino acid sequence of NcSRS2 in the current study was compared with other N. caninum NcSRS2 sequences and showed some identities and differences. Conclusion: The NcSRS2 gene of N. caninum was successfully cloned in pTZ57R/T. The recombinant plasmid was confirmed by sequencing, colony PCR and enzymatic digestion. It is ready to express recombinant protein for further studies. PMID:23682269

  8. The Natural Product Domain Seeker NaPDoS: A Phylogeny Based Bioinformatic Tool to Classify Secondary Metabolite Gene Diversity

    PubMed Central

    Ziemert, Nadine; Podell, Sheila; Penn, Kevin; Badger, Jonathan H.; Allen, Eric; Jensen, Paul R.

    2012-01-01

    New bioinformatic tools are needed to analyze the growing volume of DNA sequence data. This is especially true in the case of secondary metabolite biosynthesis, where the highly repetitive nature of the associated genes creates major challenges for accurate sequence assembly and analysis. Here we introduce the web tool Natural Product Domain Seeker (NaPDoS), which provides an automated method to assess the secondary metabolite biosynthetic gene diversity and novelty of strains or environments. NaPDoS analyses are based on the phylogenetic relationships of sequence tags derived from polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) genes, respectively. The sequence tags correspond to PKS-derived ketosynthase domains and NRPS-derived condensation domains and are compared to an internal database of experimentally characterized biosynthetic genes. NaPDoS provides a rapid mechanism to extract and classify ketosynthase and condensation domains from PCR products, genomes, and metagenomic datasets. Close database matches provide a mechanism to infer the generalized structures of secondary metabolites while new phylogenetic lineages provide targets for the discovery of new enzyme architectures or mechanisms of secondary metabolite assembly. Here we outline the main features of NaPDoS and test it on four draft genome sequences and two metagenomic datasets. The results provide a rapid method to assess secondary metabolite biosynthetic gene diversity and richness in organisms or environments and a mechanism to identify genes that may be associated with uncharacterized biochemistry. PMID:22479523

  9. Productivity and salinity structuring of the microplankton revealed by comparative freshwater metagenomics.

    PubMed

    Eiler, Alexander; Zaremba-Niedzwiedzka, Katarzyna; Martínez-García, Manuel; McMahon, Katherine D; Stepanauskas, Ramunas; Andersson, Siv G E; Bertilsson, Stefan

    2014-09-01

    Little is known about the diversity and structuring of freshwater microbial communities beyond the patterns revealed by tracing their distribution in the landscape with common taxonomic markers such as the ribosomal RNA. To address this gap in knowledge, metagenomes from temperate lakes were compared to selected marine metagenomes. Taxonomic analyses of rRNA genes in these freshwater metagenomes confirm the previously reported dominance of a limited subset of uncultured lineages of freshwater bacteria, whereas Archaea were rare. Diversification into marine and freshwater microbial lineages was also reflected in phylogenies of functional genes, and there were also significant differences in functional beta-diversity. The pathways and functions that accounted for these differences are involved in osmoregulation, active transport, carbohydrate and amino acid metabolism. Moreover, predicted genes orthologous to active transporters and recalcitrant organic matter degradation were more common in microbial genomes from oligotrophic versus eutrophic lakes. This comparative metagenomic analysis allowed us to formulate a general hypothesis that oceanic microorganisms, compared with freshwater-dwelling ones, invest more in the metabolism of amino acids and that strategies of carbohydrate metabolism differ significantly between marine and freshwater microbial communities. PMID:24118837

  10. Productivity and salinity structuring of the microplankton revealed by comparative freshwater metagenomics

    PubMed Central

    Eiler, Alexander; Zaremba-Niedzwiedzka, Katarzyna; Martínez-García, Manuel; McMahon, Katherine D; Stepanauskas, Ramunas; Andersson, Siv G E; Bertilsson, Stefan

    2014-01-01

    Little is known about the diversity and structuring of freshwater microbial communities beyond the patterns revealed by tracing their distribution in the landscape with common taxonomic markers such as the ribosomal RNA. To address this gap in knowledge, metagenomes from temperate lakes were compared to selected marine metagenomes. Taxonomic analyses of rRNA genes in these freshwater metagenomes confirm the previously reported dominance of a limited subset of uncultured lineages of freshwater bacteria, whereas Archaea were rare. Diversification into marine and freshwater microbial lineages was also reflected in phylogenies of functional genes, and there were also significant differences in functional beta-diversity. The pathways and functions that accounted for these differences are involved in osmoregulation, active transport, carbohydrate and amino acid metabolism. Moreover, predicted genes orthologous to active transporters and recalcitrant organic matter degradation were more common in microbial genomes from oligotrophic versus eutrophic lakes. This comparative metagenomic analysis allowed us to formulate a general hypothesis that oceanic microorganisms, compared with freshwater-dwelling ones, invest more in the metabolism of amino acids and that strategies of carbohydrate metabolism differ significantly between marine and freshwater microbial communities. PMID:24118837

  11. Freshwater Metaviromics and Bacteriophages: A Current Assessment of the State of the Art in Relation to Bioinformatic Challenges

    PubMed Central

    Bruder, Katherine; Malki, Kema; Cooper, Alexandria; Sible, Emily; Shapiro, Jason W.; Watkins, Siobhan C.; Putonti, Catherine

    2016-01-01

    Advances in bioinformatics and sequencing technologies have allowed for the analysis of complex microbial communities at an unprecedented rate. While much focus is often placed on the cellular members of these communities, viruses play a pivotal role, particularly bacteria-infecting viruses (bacteriophages); phages mediate global biogeochemical processes and drive microbial evolution through bacterial grazing and horizontal gene transfer. Despite their importance and ubiquity in nature, very little is known about the diversity and structure of viral communities. Though the need for culture-based methods for viral identification has been somewhat circumvented through metagenomic techniques, the analysis of metaviromic data is marred by many unique issues. In this review, we examine the current bioinformatic approaches for metavirome analyses and the inherent challenges facing the field as illustrated by the ongoing efforts in the exploration of freshwater phage populations. PMID:27375355

  12. Comparing investigations on the surface structures of irghizites and pyroclastics by SEM

    NASA Astrophysics Data System (ADS)

    Heide, K.; Volksch, G.; Florenski, P. W.

    1982-03-01

    An electron microscope was used in an investigation which compared the surface structures of irghizites from the Zhamanshin crater in Kazakhstan, USSR, with those of such typical tektites as australites and pyroclastics such as obsidians and lapilli. The results indicate no unambiguous genetic relationships between irghizite morphology and tektite and pyroclastic surface features. Irghizite surfaces instead result from simultaneous or successive processes in the course of which variously-dimensioned globules melted, fused and were eaten into by corrosive gases after solidification. The assumption that the verrucose glass globule swellings, which are identical in chemical composition to the glass bulk of the irghizites, were caused by expanding glass bubbles immediately below the glass bulk surface may be discounted.

  13. Comparing of Normal Stress Distribution in Static and Dynamic Soil-Structure Interaction Analyses

    NASA Astrophysics Data System (ADS)

    Kholdebarin, Alireza; Massumi, Ali; Davoodi, Mohammad; Tabatabaiefar, Hamid Reza

    2008-07-01

    It is important to consider the vertical component of earthquake loading and inertia force in soil-structure interaction analyses. In most circumstances, design engineers are primarily concerned with analyzing the behavior of foundations subjected to earthquake-induced forces transmitted from the bedrock. In this research, a single rigid foundation with designated geometrical parameters located on sandy-clay soil has been modeled in FLAC software with the Finite Difference Method and subjected to three different vertical components of earthquake records. In these cases, it is important to evaluate the effect of the footing on the underlying soil and to consider the normal stress in the soil with and without the footing. The distribution of normal stress under the footing in static and dynamic states has been studied and compared. This comparison indicated that the increase in normal stress under the footing caused by the vertical component of ground excitations decreased the dynamic vertical settlement in comparison with the static state.

  14. Comparing of Normal Stress Distribution in Static and Dynamic Soil-Structure Interaction Analyses

    SciTech Connect

    Kholdebarin, Alireza; Massumi, Ali; Davoodi, Mohammad; Tabatabaiefar, Hamid Reza

    2008-07-08

    It is important to consider the vertical component of earthquake loading and inertia force in soil-structure interaction analyses. In most circumstances, design engineers are primarily concerned with analyzing the behavior of foundations subjected to earthquake-induced forces transmitted from the bedrock. In this research, a single rigid foundation with designated geometrical parameters located on sandy-clay soil has been modeled in FLAC software with the Finite Difference Method and subjected to three different vertical components of earthquake records. In these cases, it is important to evaluate the effect of the footing on the underlying soil and to consider the normal stress in the soil with and without the footing. The distribution of normal stress under the footing in static and dynamic states has been studied and compared. This comparison indicated that the increase in normal stress under the footing caused by the vertical component of ground excitations decreased the dynamic vertical settlement in comparison with the static state.

  15. Comparative toxicity and structure-activity in Chlorella and Tetrahymena: Monosubstituted phenols

    SciTech Connect

    Jaworska, J.S.; Schultz, T.W.

    1991-07-01

    The relative toxicity of selected monosubstituted phenols has been assessed by Kramer and Truemper in the Chlorella vulgaris assay. The authors examined population growth inhibition of this simple green alga under short-term static conditions for 33 derivatives. However, efforts to develop a strong predictive quantitative structure-activity relationship (QSAR) met with limited success because they modeled across modes of toxic action or segregated derivatives such as positional isomers (i.e., ortho-, meta-, para-). In an effort to further their understanding of the relationships of ecotoxic effects of phenols, the authors have evaluated the same derivatives reported by Kramer and Truemper in the Tetrahymena pyriformis population growth assay, compared the responses in both systems and developed QSARs for the Chlorella vulgaris data based on mechanisms of action.

  16. Comparative Study of 3-Dimensional Woven Joint Architectures for Composite Spacecraft Structures

    NASA Technical Reports Server (NTRS)

    Jones, Justin S.; Polis, Daniel L.; Rowles, Russell R.; Segal, Kenneth N.

    2011-01-01

    The National Aeronautics and Space Administration (NASA) Exploration Systems Mission Directorate initiated an Advanced Composite Technology (ACT) Project through the Exploration Technology Development Program in order to support the polymer composite needs for future heavy lift launch architectures. As an example, the large composite structural applications on Ares V inspired the evaluation of advanced joining technologies, specifically 3D woven composite joints, which could be applied to segmented barrel structures needed for autoclave cured barrel segments due to autoclave size constraints. Implementation of these 3D woven joint technologies may offer enhancements in damage tolerance without sacrificing weight. However, baseline mechanical performance data is needed to properly analyze the joint stresses and subsequently design/down-select a preform architecture. Six different configurations were designed and prepared for this study; each consisting of a different combination of warp/fill fiber volume ratio and preform interlocking method (Z-fiber, fully interlocked, or hybrid). Tensile testing was performed for this study with the enhancement of a dual camera Digital Image Correlation (DIC) system which provides the capability to measure full-field strains and three dimensional displacements of objects under load. As expected, the ratio of warp/fill fiber has a direct influence on strength and modulus, with higher values measured in the direction of higher fiber volume bias. When comparing the Z-fiber weave to a fully interlocked weave with comparable fiber bias, the Z-fiber weave demonstrated the best performance in two different comparisons. We report the measured tensile strengths and moduli for test coupons from the 6 different weave configurations under study.

  17. Bioinformatic analysis of non-VP1 capsid protein of coxsackievirus A6.

    PubMed

    Liu, Hong-Bo; Yang, Guang-Fei; Liang, Si-Jia; Lin, Jun

    2016-08-01

    This study bioinformatically analyzed the non-VP1 capsid proteins (VP2-VP4) of Coxsackievirus A6 (CVA6) in an attempt to predict their basic physicochemical properties, structural/functional features and linear B cell epitopes. The online tools SubLoc, TargetP and others from the ExPASy Bioinformatics Resource Portal, and SWISS-MODEL (an online protein structure modeling server), were utilized to analyze the amino acid (AA) sequences of the VP2-VP4 proteins of CVA6. Our results showed that the VP proteins of CVA6 were all hydrophilic, contained phosphorylation and glycosylation sites, and harbored no signal peptide sequences or acetylation sites. Except for VP3, the proteins did not have transmembrane helix structures or nuclear localization signal sequences. Random coils were the major conformation of the secondary structure of the capsid proteins. Analysis of the linear B cell epitopes with Bepipred showed that the average antigenic indices (AI) of the individual VP proteins were all greater than 0 and that the average AI of VP4 was substantially higher than that of VP2 and VP3. The VP proteins all contained a number of potential B cell epitopes, and some epitopes were located on the internal side of the viral capsid or were buried. We successfully predicted the fundamental physicochemical properties, structural/functional features and linear B cell epitopes, and found that the different VP proteins share some common features while each has its unique attributes. These findings will help us understand the pathogenicity of CVA6 and develop related vaccines and immunodiagnostic reagents. PMID:27465341

  18. Chemical proteomic and bioinformatic strategies for the identification and quantification of vascular antigens in cancer.

    PubMed

    Strassberger, Verena; Fugmann, Tim; Neri, Dario; Roesli, Christoph

    2010-09-10

    One avenue towards the development of more selective anti-cancer drugs consists in the targeted delivery of bioactive molecules to the tumor environment by means of binding molecules specific to tumor-associated markers. In this context, the targeted delivery of therapeutic agents to newly-formed blood vessels ("vascular targeting") is particularly attractive, because of the dependence of tumors on new blood vessels to sustain growth and invasion, and because of the accessibility of neo-vascular structures for therapeutic agents injected intravenously. Ligand-based vascular targeting strategies crucially rely on good-quality vascular tumor markers. Here we describe a number of established technologies for the enrichment of accessible vascular proteins based on the isolation of glycoproteins, the in vivo coating of accessible cell surfaces with colloidal silica and the in vivo perfusion with reactive ester derivatives of biotin. Label-free as well as isotopic labeling based strategies for the subsequent MS-based protein quantification are outlined. Finally, bioinformatic workflows for protein quantification are depicted aiming at assisting in the evaluation of appropriate strategies for individual projects. This review gives an overview of current chemical proteomic strategies for the enrichment and quantification of the accessible vascular proteome and helps in selecting bioinformatic strategies for data analysis and validation. PMID:20538087

  19. Using Bioinformatics Approach to Explore the Pharmacological Mechanisms of Multiple Ingredients in Shuang-Huang-Lian

    PubMed Central

    Zhang, Bai-xia; Li, Jian; Gu, Hao; Li, Qiang; Zhang, Qi; Zhang, Tian-jiao; Wang, Yun; Cai, Cheng-ke

    2015-01-01

    Owing to its proven clinical efficacy, Shuang-Huang-Lian (SHL) has been developed in a variety of dosage forms. However, in-depth research on the targets and pharmacological mechanisms of SHL preparations has been scarce. In the present study, bioinformatics approaches were adopted to integrate relevant data and biological information. As a result, a PPI network was built and its common topological parameters were characterized. The results suggested that the PPI network of SHL exhibited a scale-free property and modular architecture. The drug target network of SHL was structured with 21 functional modules. According to certain modules and the distribution of pharmacological effects, an antitumor effect and potential drug targets were predicted. A biological network containing 26 subnetworks was constructed to elucidate the antipneumonia mechanism of SHL. We also extracted the subnetwork to explicitly display the pathway through which one effective component acts on the pneumonia-related targets. In conclusion, a bioinformatics approach was established for exploring the drug targets, pharmacological activity distribution, and effective components of SHL, and its antipneumonia mechanism. Above all, we identified the effective components and disclosed the mechanism of SHL from a systems perspective. PMID:26495421

  20. Efficient feature selection and classification of protein sequence data in bioinformatics.

    PubMed

    Iqbal, Muhammad Javed; Faye, Ibrahima; Samir, Brahim Belhaouari; Said, Abas Md

    2014-01-01

    Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics are to store and manage biological data, and to develop and analyze computational tools that enhance its understanding. The size of the data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of the large number of newly discovered proteins. The existing classification results are unsatisfactory because of the very large number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
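
    As an illustration of the performance measures named in the abstract above, the following minimal sketch computes accuracy, sensitivity (recall), specificity and F-measure from the four counts of a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from the paper; the sketch assumes every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts.

    Assumes all denominators are non-zero (illustrative sketch only).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)      # also called recall or true-positive rate
    specificity = tn / (tn + fp)      # true-negative rate
    precision = tp / (tp + fp)
    # F-measure (F1): harmonic mean of precision and recall
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts, chosen only to exercise the function
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Sensitivity and recall are the same quantity, which is why the sketch reports them under a single key.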

  1. Using Bioinformatics Approach to Explore the Pharmacological Mechanisms of Multiple Ingredients in Shuang-Huang-Lian.

    PubMed

    Zhang, Bai-xia; Li, Jian; Gu, Hao; Li, Qiang; Zhang, Qi; Zhang, Tian-jiao; Wang, Yun; Cai, Cheng-ke

    2015-01-01

    Owing to its proven clinical efficacy, Shuang-Huang-Lian (SHL) has been developed in a variety of dosage forms. However, in-depth research on the targets and pharmacological mechanisms of SHL preparations has been scarce. In the present study, bioinformatics approaches were adopted to integrate relevant data and biological information. As a result, a PPI network was built and its common topological parameters were characterized. The results suggested that the PPI network of SHL exhibited a scale-free property and modular architecture. The drug target network of SHL was structured with 21 functional modules. According to certain modules and the distribution of pharmacological effects, an antitumor effect and potential drug targets were predicted. A biological network containing 26 subnetworks was constructed to elucidate the antipneumonia mechanism of SHL. We also extracted the subnetwork to explicitly display the pathway through which one effective component acts on the pneumonia-related targets. In conclusion, a bioinformatics approach was established for exploring the drug targets, pharmacological activity distribution, and effective components of SHL, and its antipneumonia mechanism. Above all, we identified the effective components and disclosed the mechanism of SHL from a systems perspective. PMID:26495421

  2. A Bioinformatics Approach to Prioritize Single Nucleotide Polymorphisms in TLRs Signaling Pathway Genes.

    PubMed

    Alipoor, Behnam; Ghaedi, Hamid; Omrani, Mir Davood; Bastami, Milad; Meshkani, Reza; Golmohammadi, Taghi

    2016-01-01

    It has been suggested that single nucleotide polymorphisms (SNPs) in genes involved in the Toll-like receptor (TLR) pathway may exhibit broad effects on the function of this network and might contribute to a range of human diseases. However, the extent to which these variations affect TLR signaling is not well understood. In this study, we adopted a bioinformatics approach to predict the consequences of SNPs in the TLR network. The consequences of non-synonymous coding SNPs (nsSNPs) were predicted by the SIFT, PolyPhen, PANTHER, SNPs&GO, I-Mutant, ConSurf and NetSurf tools. Structural visualization of wild-type and mutant proteins was performed using Project HOPE and Swiss PDB Viewer. The influence of 5'-UTR and 3'-UTR SNPs was analyzed by appropriate computational approaches. Nineteen nsSNPs in TLR pathway genes were found to have deleterious consequences as predicted by the combination of different algorithms. Moreover, our results suggested that SNPs located in the UTRs of TLR pathway genes may potentially influence the binding of transcription factors or microRNAs. By applying a pathway-based bioinformatics analysis of genetic variations, we provided a prioritized list of potentially deleterious variants. These findings may facilitate the selection of proper variants for future functional and/or association studies. PMID:27478803

  3. PyDPI: freely available python package for chemoinformatics, bioinformatics, and chemogenomics studies.

    PubMed

    Cao, Dong-Sheng; Liang, Yi-Zeng; Yan, Jun; Tan, Gui-Shan; Xu, Qing-Song; Liu, Shao

    2013-11-25

    The rapidly increasing amount of publicly available data in biology and chemistry enables researchers to revisit interaction problems by systematic integration and analysis of heterogeneous data. Herein, we developed a comprehensive python package to emphasize the integration of chemoinformatics and bioinformatics into a molecular informatics platform for drug discovery. PyDPI (drug-protein interaction with Python) is a powerful python toolkit for computing commonly used structural and physicochemical features of proteins and peptides from amino acid sequences, molecular descriptors of drug molecules from their topology, and protein-protein interaction and protein-ligand interaction descriptors. It computes 6 protein feature groups composed of 14 features that include 52 descriptor types and 9890 descriptors, 9 drug feature groups composed of 13 descriptor types that include 615 descriptors. In addition, it provides seven types of molecular fingerprint systems for drug molecules, including topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pair fingerprints, topological torsion fingerprints, and Morgan/circular fingerprints. By combining different types of descriptors from drugs and proteins in different ways, interaction descriptors representing protein-protein or drug-protein interactions could be conveniently generated. These computed descriptors can be widely used in various fields relevant to chemoinformatics, bioinformatics, and chemogenomics. PyDPI is freely available via https://sourceforge.net/projects/pydpicao/. PMID:24047419
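
    To make the idea of a sequence-derived protein descriptor concrete, the sketch below computes amino acid composition (the fraction of each of the 20 standard residues) in plain Python. It does not use or reproduce the PyDPI API; the function name and the example sequence are hypothetical and serve only as an illustration of one simple descriptor type.

```python
# The 20 standard amino acids, one-letter codes
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Residues outside the 20 standard codes still count towards the length,
    so the fractions sum to 1 only for sequences of standard residues.
    """
    seq = sequence.upper()
    length = len(seq)
    return {aa: seq.count(aa) / length for aa in AMINO_ACIDS}


# Hypothetical 10-residue peptide used only as an example
comp = amino_acid_composition("MKTAYIAKQR")
print({aa: frac for aa, frac in comp.items() if frac > 0})
```

    The result is a fixed-length vector of 20 values regardless of sequence length, which is what makes descriptors of this kind usable as machine-learning features.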

  4. A comparative fine structural and phylogenetic analysis of resting cysts in oligotrich and hypotrich Spirotrichea (Ciliophora)

    PubMed Central

    Foissner, Wilhelm; Müller, Helga; Agatha, Sabine

    2010-01-01

    So far, neither morphology nor gene sequences have provided a reliable classification of halteriid and hypotrichid spirotrichs. Thus, we performed a comparative study on the fine structure of the resting cysts in some representative species, viz., the oligotrichs Halteria grandinella and Pelagostrombidium fallax and the oxytrichid hypotrichs Laurentiella strenua, Steinia sphagnicola, and Oxytricha granulifera. Main results include: (i) there are three different, very likely non-homologous cyst surface ornamentations, viz., spines (generated by the ectocyst), thorns (generated by the mesocyst), and lepidosomes (produced in the cytoplasm); (ii) Halteria has a perilemma; (iii) Halteria, Meseres and Pelagostrombidium have fibrous lepidosomes, while those of Oxytricha are tubular; (iv) the cyst wall structure of Pelagostrombidium and Strombidium is distinctly different from that of halteriids and oxytrichids, which are rather similar in this respect; (v) cyst ornamentation does not provide a reliable phylogenetic signal in oxytrichid hypotrichs because ectocyst spines occur in both flexible and rigid genera. The new observations and literature data were used to investigate the phylogeny of the core Spirotrichea. The Hennigian argumentation scheme and computer algorithms showed that the spirotrichs are bound together by the macronuclear reorganization band, the subepiplasmic microtubule basket, and the apokinetal stomatogenesis. The Hypotrichida and Oligotrichida are united by a very strong synapomorphy, viz., the perilemma, not found in any other member of the phylum. Halteriid and oligotrichid spirotrichs form a sister group supported by as many as 13 apomorphies. Thus, the molecular data, which classify the halteriids within the core hypotrichs, need to be reconsidered. PMID:17766095

  5. Structure characterization of three polysaccharides and a comparative study of their immunomodulatory activities on chicken macrophage.

    PubMed

    Fan, Wentao; Zhang, Shijie; Hao, Pan; Zheng, Pimiao; Liu, Jianzhu; Zhao, Xiaona

    2016-11-20

    In this study, we evaluated the structural characteristics and the in vitro immunomodulatory activity on chicken macrophages of polysaccharides from Astragalus aboriginum Richardson (RAPS), Atractylodes macrocephala Koidz (RAMPS) and Rumia seseloides Hoffm (RSPS). We found that the molecular weights of RAPS and RAMPS were 122.4 and 109.4 kDa, respectively, higher than the 64.71 kDa of RSPS. Glucose accounted for 83.95% and 66.39% of RAPS and RAMPS, respectively. RSPS mainly contained glucose and galacturonic acid, which accounted for 32.35% and 29.25%, respectively. The NMR results showed that RAPS and RAMPS contained β-glucose, β-galactose, and β-galacturonic acid. The backbone was 1→6 linked glucose. RSPS showed at least six monosaccharide response signals. In the in vitro experiment, RAPS at a dosage of 15.62 μg/mL exhibited significant immunological activity on chicken macrophages compared to RAMPS and RSPS. Interestingly, costimulatory molecule levels in the RSPS group were higher than those in the RAPS group, which may be associated with the special structure of RSPS. PMID:27561535

  6. Amination of nitroazoles--a comparative study of structural and energetic properties.

    PubMed

    Zhao, Xiuxiu; Qi, Cai; Zhang, Lubo; Wang, Yuan; Li, Shenghua; Zhao, Fengqi; Pang, Siping

    2014-01-01

    In this work, 3-nitro-1H-1,2,4-triazole (1) and 3,5-dinitro-1H-pyrazole (2) were C-aminated and N-aminated using different amination agents, yielding their respective C-amino and N-amino products. All compounds were fully characterized by NMR (1H, 13C, 15N), IR spectroscopy, and differential scanning calorimetry (DSC). X-ray crystallographic measurements were performed and delivered insight into structural characteristics as well as inter- and intramolecular interactions of the products. Their impact sensitivities were measured using standard BAM fallhammer techniques and their explosive performances were computed using the EXPLO 5.05 program. A comparative study on the influence of the different amino substituents on the structural and energetic properties (such as density, stability, heat of formation, detonation performance) is presented. The results showed that the incorporation of an N-amino group into a nitroazole ring can improve nitrogen content, heat of formation and impact sensitivity, while the introduction of a C-amino group can enhance density, detonation velocity and pressure. The potential of N-amino and C-amino moieties for the design of next-generation energetic materials is explored. PMID:24424403

  7. Comparing Different Model Structures for Carbon Allocation in the Community Land Model (CLM)

    NASA Astrophysics Data System (ADS)

    Montane, F.; Fox, A. M.; Arellano, A. F.; Scaven, V. L.; Alexander, M. R.; Moore, D. J.

    2015-12-01

    Quantifying the intensity of feedback mechanisms between terrestrial ecosystems and climate is a central challenge for understanding the global carbon cycle. Part of this challenge includes understanding how climate affects not only NPP, but also C allocation in different plant tissues (leaves, stem and roots), which determines the C residence time. For instance, C could be sequestered over longer time periods if changes in climate increase allocation to long-lived plant tissue (e.g. woody components) with respect to short-lived tissues (e.g. leaves). Networks of eddy covariance towers like AmeriFlux provide the infrastructure necessary to study relationships between ecosystem processes and climate forcing. We ran the Community Land Model (CLM) for six temperate forests in North America (AmeriFlux sites) using different model structures for the C allocation module: i) the standard carbon allocation module in CLM, which allocates C to the stem and leaves as a dynamic function of NPP and with fixed coefficients for the rest of the parameters; ii) an alternative C allocation module, which allocates C to the root and stem as a dynamic function of NPP and with fixed coefficients for the rest of the parameters; and iii) an alternative C allocation module with fixed coefficients for all the parameters. We compare C allocation patterns and climate sensitivities between the different model structures and available observations for the sites. We suggest some future approaches to reduce model uncertainty in the current scheme for C allocation in CLM and its climate sensitivity.

  8. Comparative structural study of N-linked oligosaccharides of urinary and recombinant erythropoietins.

    PubMed

    Tsuda, E; Goto, M; Murakami, A; Akai, K; Ueda, M; Kawanishi, G; Takahashi, N; Sasaki, R; Chiba, H; Ishihara, H

    1988-07-26

    The structures of the N-linked oligosaccharides of the urinary erythropoietin (u-EPO) purified from urine of aplastic anemic patients were analyzed and compared with those for recombinant erythropoietin (r-EPO) prepared with baby hamster kidney (BHK) cells. Asparagine-linked neutral oligosaccharides were released from each EPO protein by N-oligosaccharide glycopeptidase (almond) digestion. The reducing ends of the oligosaccharide chains thus obtained were aminated with a fluorescent reagent, 2-aminopyridine, and the mixture of pyridylamino derivatives of the oligosaccharides was separated by high-performance liquid chromatography (HPLC) on an ODS silica column. More than 8 and 13 kinds of oligosaccharide fractions for u-EPO and r-EPO (BHK), respectively, were completely separated by the one-step HPLC procedure. The structure of each oligosaccharide thus isolated was analyzed by a combination of sequential exoglycosidase digestion and another kind of HPLC with an amide-silica column. Furthermore, high-resolution proton nuclear magnetic resonance (1H NMR) spectroscopy and methylation analyses were carried out in the case of r-EPO (BHK). (ABSTRACT TRUNCATED AT 250 WORDS) PMID:3179269

  9. Genomic Characterization of Prenatally Detected Chromosomal Structural Abnormalities Using Oligonucleotide Array Comparative Genomic Hybridization

    PubMed Central

    Li, Peining; Pomianowski, Pawel; DiMaio, Miriam S.; Florio, Joanne R.; Rossi, Michael R.; Xiang, Bixia; Xu, Fang; Yang, Hui; Geng, Qian; Xie, Jiansheng; Mahoney, Maurice J.

    2013-01-01

    Detection of chromosomal structural abnormalities using conventional cytogenetic methods poses a challenge for prenatal genetic counseling due to unpredictable clinical outcomes and risk of recurrence. Of the 1,726 prenatal cases in a 3-year period, we performed oligonucleotide array comparative genomic hybridization (aCGH) analysis on 11 cases detected with various structural chromosomal abnormalities. In nine cases, genomic aberrations and gene contents involving a 3p distal deletion, a marker chromosome from chromosome 4, a derivative chromosome 5 from a 5p/7q translocation, a de novo distal 6q deletion, a recombinant chromosome 8 comprised of an 8p duplication and an 8q deletion, an extra derivative chromosome 9 from an 8p/9q translocation, mosaicism for chromosome 12q with added material of initially unknown origin, an unbalanced 13q/15q rearrangement, and a distal 18q duplication and deletion were delineated. An absence of pathogenic copy number changes was noted in one case with a de novo 11q/14q translocation and in another with a familial insertion of 21q into a 19q. Genomic characterization of the structural abnormalities aided in the prediction of clinical outcomes. These results demonstrated the value of aCGH analysis in prenatal cases with subtle or complex chromosomal rearrangements. Furthermore, a retrospective analysis of clinical indications of our prenatal cases showed that approximately 20% of them had abnormal ultrasound findings and should be considered as high risk pregnancies for a combined chromosome and aCGH analysis. PMID:21671377

  10. A new structure for comparing surface passivation materials of GaAs solar cells

    NASA Technical Reports Server (NTRS)

    Desalvo, Gregory C.; Barnett, Allen M.

    1989-01-01

    The surface recombination velocity (S_rec) for bare GaAs is typically as high as 10^6 to 10^7 cm/sec, which dramatically lowers the efficiency of GaAs solar cells. Early attempts to circumvent this problem by making an ultra-thin junction (x_j < 0.1 micron) proved unsuccessful when compared to lowering S_rec by surface passivation. Present-day GaAs solar cells use a GaAlAs window layer to passivate the top surface. The advantages of GaAlAs in surface passivation are its high bandgap energy and lattice matching to GaAs. Although GaAlAs is successful in reducing the surface recombination velocity, it has other inherent problems of chemical instability (Al readily oxidizes) and ohmic contact formation. The search for new, more stable window layer materials requires a means to compare their surface passivation ability. Therefore, a device structure is needed to easily test the performance of different passivating candidates. Such a test device is described.

  11. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation

    PubMed Central

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-01-01

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733
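
    The notion of a frameshift in an aligned exon can be illustrated with a small sketch: a run of alignment gaps whose length is not a multiple of 3 shifts the reading frame of the sequence that follows it. The code below is a simplified illustration and is not CESAR's own method; the function name and the example alignment row are hypothetical, and the sketch ignores nearby compensating indels that restore the frame.

```python
import re


def frameshifting_gaps(aligned_exon):
    """Return (start, length) for each gap run whose length is not a multiple of 3.

    `aligned_exon` is one row of a pairwise or multiple alignment, with '-'
    marking gap positions. Gap runs of length 3, 6, ... keep the reading
    frame and are therefore not reported.
    """
    return [
        (match.start(), len(match.group()))
        for match in re.finditer(r"-+", aligned_exon)
        if len(match.group()) % 3 != 0
    ]


# Hypothetical aligned exon: one in-frame 3-base gap and one 1-base gap
shifts = frameshifting_gaps("ATG---CCA-GTT")
print(shifts)  # [(9, 1)]
```

    A realigner that is aware of reading frame would try to explain such a gap by an alternative alignment before accepting it as a genuine inactivating mutation.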

  12. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation.

    PubMed

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-06-20

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733

  13. STRUCTURES OF LOCAL GALAXIES COMPARED TO HIGH-REDSHIFT STAR-FORMING GALAXIES

    SciTech Connect

    Petty, Sara M.; De Mello, Duilia F.; Gallagher, John S.; Gardner, Jonathan P.; Lotz, Jennifer M.; Matt Mountain, C.; Smith, Linda J.

    2009-08-15

    The rest-frame far-ultraviolet morphologies of eight nearby interacting and starburst galaxies (Arp 269, M 82, Mrk 8, NGC 520, NGC 1068, NGC 3079, NGC 3310, and NGC 7673) are compared with 54 galaxies at z ~ 1.5 and 46 galaxies at z ~ 4 observed in the Great Observatories Origins Deep Survey (GOODS) taken with the Advanced Camera for Surveys onboard the Hubble Space Telescope. The nearby sample is artificially redshifted to z ~ 1.5 and 4 by applying luminosity and size scaling. We compare the simulated galaxy morphologies to real z ~ 1.5 and 4 UV-bright galaxy morphologies. We calculate the Gini coefficient (G), the second-order moment of the brightest 20% of the galaxy's flux (M_20), and the Sersic index (n). We explore the use of nonparametric methods with two-dimensional profile fitting and find the combination of M_20 with n an efficient method to classify galaxies as having merger, exponential disk, or bulge-like morphologies. When classified according to G and M_20, 20/30% of real/simulated galaxies at z ~ 1.5 and 37/12% at z ~ 4 have bulge-like morphologies. The rest have merger-like or intermediate distributions. Alternatively, when classified according to the Sersic index, 70% of the z ~ 1.5 and z ~ 4 real galaxies are exponential disks or bulge-like with n > 0.8, and ~30% of the real galaxies are classified as mergers. The artificially redshifted galaxies have n values with ~35% bulge or exponential at z ~ 1.5 and 4. Therefore, ~20%-30% of Lyman-break galaxies have structures similar to local starburst mergers, and may be driven by similar processes. We assume merger-like or clumpy star-forming galaxies in the GOODS field have morphological structure with values n < 0.8 and M_20 > -1.7. We conclude that Mrk 8, NGC 3079, and NGC 7673 have structures similar to those of merger-like and clumpy star-forming galaxies observed at z ~ 1.5 and 4.
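
    The Gini coefficient G used in the abstract above measures how unevenly a galaxy's flux is spread over its pixels: 0 for perfectly uniform flux, 1 when all flux sits in a single pixel. The sketch below uses the sorted-value form commonly applied in galaxy morphology work, G = sum_i (2i - n - 1)|X_i| / (mean|X| n (n - 1)); whether the paper used exactly this form is an assumption, and the function name and pixel values are hypothetical.

```python
def gini_coefficient(pixel_fluxes):
    """Gini coefficient of a list of pixel fluxes.

    Uses absolute flux values sorted in increasing order:
        G = sum_i (2i - n - 1) * |X_i| / (mean(|X|) * n * (n - 1)),  i = 1..n
    Requires at least two pixels and a non-zero mean flux.
    """
    values = sorted(abs(v) for v in pixel_fluxes)
    n = len(values)
    mean = sum(values) / n
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return weighted / (mean * n * (n - 1))


# Hypothetical pixel values: uniform flux versus all flux in one pixel
uniform = gini_coefficient([1.0, 1.0, 1.0, 1.0])
concentrated = gini_coefficient([0.0, 0.0, 0.0, 4.0])
print(uniform, concentrated)  # 0.0 1.0
```

    In practice the input would be the pixels assigned to the galaxy by a segmentation map, which this sketch leaves out.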

  14. Predictive value of tender joints compared to synovitis for structural damage in rheumatoid arthritis

    PubMed Central

    Cheung, Peter P; Mari, Karine; Devauchelle-Pensec, Valérie; Jousse-Joulin, Sandrine; D'Agostino, Maria Antonietta; Chalès, Gérard; Gaudin, Philippe; Mariette, Xavier; Saraux, Alain; Dougados, Maxime

    2016-01-01

    Objective: To evaluate the predictive value of tender joints compared to synovitis for structural damage in rheumatoid arthritis (RA). Methods: A post hoc analysis was performed on a prospective 2-year study of 59 patients with active RA starting on anti-tumour necrosis factor (TNF) therapy. Tenderness and synovitis were assessed clinically at baseline, followed by blinded ultrasound assessment (B-mode and power Doppler ultrasound (PDUS)) of the hands and feet (2 wrists, 10 metacarpophalangeal, 10 proximal interphalangeal and 10 metatarsophalangeal (MTP) joints). Radiographs of these joints were performed at baseline and at 2 years. The risk of radiographic progression with respect to the presence of baseline tenderness or synovitis, as well as its persistence (after 4 months of anti-TNF), was estimated by OR (95% CI). Results: Baseline tender joints were the least predictive for radiographic progression (OR=1.53 (95% CI 1.02 to 2.29) p<0.04), when compared to synovitis (clinical OR=2.08 (95% CI 1.39 to 3.11) p<0.001 or PDUS OR=1.80 (95% CI 1.20 to 2.71) p=0.005, respectively). Tender joints with the presence of synovitis were predictive of radiographic progression (OR=1.89 (95% CI 1.25 to 2.85) p=0.002), especially in the MTP joints. Non-tender joints with no synovitis were negatively predictive (OR=0.57 (95% CI 0.39 to 0.82) p=0.003). Persistence of tender joints was negatively predictive (OR=0.38 (95% CI 0.18 to 0.78) p=0.009) while persistence of synovitis was predictive (OR=2.41 (95% CI 1.24 to 4.67) p=0.01) of radiographic progression. Conclusions: Synovitis is better than tenderness at predicting subsequent structural progression. However, coexistence of tenderness and synovitis at the level of an individual joint is predictive of structural damage, particularly in the MTP joints. Trial registration number: NCT00444691. PMID:27042336
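
    The "OR (95% CI)" figures reported above can be read with the help of the textbook calculation for a 2x2 table, sketched below. The abstract does not state how its odds ratios were estimated, so this is only the standard unadjusted formula with a Wald interval on the log scale; the function name and the counts are hypothetical and are not data from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z = 1.96 gives an approximate 95% interval. All counts must be > 0.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical joint counts: progression vs. no progression,
# in joints with and without baseline synovitis
or_value, lower, upper = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR={or_value:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

    An interval that excludes 1 corresponds to a statistically significant association at roughly the 5% level, which is how the intervals in the abstract can be read.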

  15. [Comparative Study on the Molecular Structures and Spectral Properties of Ponceau 4R and Amaranth].

    PubMed

    Zhang, Yong; Chen, Guo-qing; Zhu, Chun; Hu, Yang-jun

    2015-11-01

    The absorption and emission spectra of ponceau 4R and amaranth, which are isomers of each other, were measured with an Edinburgh FLS920P steady-state/transient fluorescence spectrometer, and their spectral parameters were compared. Density functional theory (DFT) and time-dependent density functional theory (TD-DFT) were then used to optimize the geometries of ponceau 4R and amaranth in the ground and excited states, respectively, in order to compare their configurations in the two states. On the basis of these results, the absorption and emission spectra of the two isomers were calculated with TD-DFT, applying the polarized continuum model (PCM) with the 6-311++G(d,p) basis set. The fluorescence mechanism and the relationship between the properties of the fluorescence spectra and the molecular geometry were analyzed. The results show that both molecules are non-planar, that the two naphthalene rings in each molecule are not coplanar, and that there is a hydrogen bond in amaranth. In the ground state, the naphthalene ring involved in this hydrogen bond in amaranth is more planar than the corresponding part of ponceau 4R. The two isomers are nearly coplanar in the excited state. The optimized molecular structures of ponceau 4R and amaranth are basically reasonable, because the spectra obtained from the quantum chemistry calculations agree with the experiments. The naphthalene rings on the right side of ponceau 4R are less planar than those of amaranth, so the ponceau 4R molecule undergoes more vibration and rotation on going from the excited to the ground state and loses more energy, which reduces the energy available for emitting fluorescence photons. Ponceau 4R therefore has a longer fluorescence emission wavelength than amaranth. In this paper, the molecular structure information of ponceau 4R and amaranth was obtained, and the differences

  16. Comparative Genome Analyses Reveal Distinct Structure in the Saltwater Crocodile MHC

    PubMed Central

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M.; Shan, Xueyan; Peterson, Daniel G.; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M.; Isberg, Sally R.; Higgins, Damien P.; Chong, Amanda Y.; John, John St; Glenn, Travis C.; Ray, David A.; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2–6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  17. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    PubMed

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M; Shan, Xueyan; Peterson, Daniel G; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M; Isberg, Sally R; Higgins, Damien P; Chong, Amanda Y; John, John St; Glenn, Travis C; Ray, David A; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  18. Tools and data services registry: a community effort to document bioinformatics resources.

    PubMed

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C E; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  19. Tools and data services registry: a community effort to document bioinformatics resources

    PubMed Central

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C.E.; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  20. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses.

    PubMed

    Wang, Lei; Xu, Wenrong; Cao, Lei; Tian, Tian; Yang, Mifang; Li, Zhongming; Ping, Fengfeng; Fan, Weixin

    2016-01-01

    Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days). Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease. PMID:26752403

  1. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses

    PubMed Central

    Wang, Lei; Xu, Wenrong; Cao, Lei; Tian, Tian; Yang, Mifang; Li, Zhongming; Ping, Fengfeng; Fan, Weixin

    2016-01-01

    Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days). Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease. PMID:26752403

  2. Bioinformatics approaches for the classification of G-protein-coupled receptors.

    PubMed

    Gaulton, Anna; Attwood, Teresa K

    2003-04-01

    G-protein-coupled receptors are found abundantly in the human genome, and are the targets of numerous prescribed drugs. However, many receptors remain orphaned (i.e. with unknown ligand specificity), and others remain poorly characterised, with little structural information available. Consequently, there is often a gulf between sequence data and structural and functional knowledge of a receptor. Bioinformatics may offer one approach to bridging this gap. In particular, protein family databases, which distil information from multiple sequence alignments into characteristic signatures, could be used to identify the families to which orphan receptors belong, and might facilitate discovery of novel motifs associated with ligand binding and G-protein-coupling. PMID:12681231

  3. BioAfrica's HIV-1 proteomics resource: combining protein data with bioinformatics tools.

    PubMed

    Doherty, Ryan S; De Oliveira, Tulio; Seebregts, Chris; Danaviah, Sivapragashini; Gordon, Michelle; Cassol, Sharon

    2005-01-01

    Most Internet online resources for investigating HIV biology contain either bioinformatics tools, protein information or sequence data. The objective of this study was to develop a comprehensive online proteomics resource that integrates bioinformatics with the latest information on HIV-1 protein structure, gene expression, post-transcriptional/post-translational modification, functional activity, and protein-macromolecule interactions. The BioAfrica HIV-1 Proteomics Resource http://bioafrica.mrc.ac.za/proteomics/index.html is a website that contains detailed information about the HIV-1 proteome and protease cleavage sites, as well as data-mining tools that can be used to manipulate and query protein sequence data, a BLAST tool for initiating structural analyses of HIV-1 proteins, and a proteomics tools directory. The Proteome section contains extensive data on each of 19 HIV-1 proteins, including their functional properties, a sample analysis of HIV-1HXB2, structural models and links to other online resources. The HIV-1 Protease Cleavage Sites section provides information on the position, subtype variation and genetic evolution of Gag, Gag-Pol and Nef cleavage sites. The HIV-1 Protein Data-mining Tool includes a set of 27 group M (subtypes A through K) reference sequences that can be used to assess the influence of genetic variation on immunological and functional domains of the protein. The BLAST Structure Tool identifies proteins with similar, experimentally determined topologies, and the Tools Directory provides a categorized list of websites and relevant software programs. This combined database and software repository is designed to facilitate the capture, retrieval and analysis of HIV-1 protein data, and to convert it into clinically useful information relating to the pathogenesis, transmission and therapeutic response of different HIV-1 variants. The HIV-1 Proteomics Resource is readily accessible through the BioAfrica website at: http

  4. Bioinformatic prediction of epitopes in the Emy162 antigen of Echinococcus multilocularis

    PubMed Central

    LI, YANHUA; LIU, XIANFEI; ZHU, YUEJIE; ZHOU, XIAOTAO; CAO, CHUNBAO; HU, XIAOAN; MA, HAIMEI; WEN, HAO; MA, XIUMIN; DING, JIAN-BING

    2013-01-01

    The aim of the present study was to predict the secondary structure and the T- and B-cell epitopes of the Echinococcus multilocularis Emy162 antigen, in order to reveal the dominant epitopes of the antigen. The secondary structure of the protein was analyzed using the Garnier-Robson method and the improved self-optimized prediction method (SOPMA) server. The T- and B-cell epitopes of Emy162 were predicted using Immune Epitope Database (IEDB), Syfpeithi, Bcepred and ABCpred online software. The characteristics of hydrophilicity, flexibility, antigenic propensity and exposed surface area were predicted. The tertiary structure of the Emy162 protein was predicted by the 3DLigandSite server. The results demonstrated that random coils and β sheets accounted for 34.64 and 21.57% of the secondary structure of the Emy162 protein, respectively. This was indicative of the presence of potential dominant antigenic epitopes in Emy162. Following bioinformatic analysis, numerous distinct antigenic epitopes of Emy162 were identified. The high-scoring T-cell epitopes were located at positions 16–29, 36–39, 97–103, 119–125 and 128–135, whilst the likely B-cell epitopes were located at positions 8–10, 19–25, 44–50, 74–81, 87–93, 104–109 and 128–136. In conclusion, five T-cell and seven B-cell dominant epitopes of the Emy162 antigen were revealed by the bioinformatic methods, which may be of use in the development of a dominant epitope vaccine. PMID:24137185

  5. Versatility of the Burkholderia cepacia Complex for the Biosynthesis of Exopolysaccharides: A Comparative Structural Investigation

    PubMed Central

    Cuzzi, Bruno; Herasimenka, Yury; Silipo, Alba; Lanzetta, Rosa; Liut, Gianfranco; Rizzo, Roberto; Cescutti, Paola

    2014-01-01

    The Burkholderia cepacia Complex comprises at least eighteen closely related species that are ubiquitous in nature. Some isolates show beneficial potential for biocontrol, bioremediation and plant growth promotion. In contrast, other strains are pathogens of plants and of immunocompromised individuals, such as cystic fibrosis patients. In these subjects, they can cause respiratory tract infections that are sometimes fatal. Most of the Burkholderia cepacia Complex species are mucoid when grown on a mannitol-rich medium and they also form biofilms. These two characteristics are related, since polysaccharides are important components of biofilm matrices. Moreover, polysaccharides contribute to bacterial survival in a hostile environment by inhibiting both neutrophil chemotaxis and antimicrobial peptide activity, and by scavenging reactive oxygen species. The ability of these microorganisms to produce exopolysaccharides with different structures is documented by numerous articles in the literature. However, little is known about the type of polysaccharides produced in biofilms and their relationship with those obtained in non-biofilm conditions. The aim of this study was to define the type of exopolysaccharides produced by nine species of the Burkholderia cepacia Complex. Two isolates were then selected to compare the polysaccharides produced on agar plates with those formed in biofilms developed on cellulose membranes. The investigation was conducted using NMR spectroscopy, high performance size exclusion chromatography, and gas chromatography coupled to mass spectrometry. The results showed that the Complex is capable of producing a variety of exopolysaccharides, most often in mixture, and that the most common exopolysaccharide is always cepacian. In addition, two novel polysaccharide structures were determined: one composed of mannose and rhamnose and another containing galactose and glucuronic acid. Comparison of exopolysaccharides obtained from cultures on

  6. Versatility of the Burkholderia cepacia complex for the biosynthesis of exopolysaccharides: a comparative structural investigation.

    PubMed

    Cuzzi, Bruno; Herasimenka, Yury; Silipo, Alba; Lanzetta, Rosa; Liut, Gianfranco; Rizzo, Roberto; Cescutti, Paola

    2014-01-01

    The Burkholderia cepacia Complex comprises at least eighteen closely related species that are ubiquitous in nature. Some isolates show beneficial potential for biocontrol, bioremediation and plant growth promotion. In contrast, other strains are pathogens of plants and of immunocompromised individuals, such as cystic fibrosis patients. In these subjects, they can cause respiratory tract infections that are sometimes fatal. Most of the Burkholderia cepacia Complex species are mucoid when grown on a mannitol-rich medium and they also form biofilms. These two characteristics are related, since polysaccharides are important components of biofilm matrices. Moreover, polysaccharides contribute to bacterial survival in a hostile environment by inhibiting both neutrophil chemotaxis and antimicrobial peptide activity, and by scavenging reactive oxygen species. The ability of these microorganisms to produce exopolysaccharides with different structures is documented by numerous articles in the literature. However, little is known about the type of polysaccharides produced in biofilms and their relationship with those obtained in non-biofilm conditions. The aim of this study was to define the type of exopolysaccharides produced by nine species of the Burkholderia cepacia Complex. Two isolates were then selected to compare the polysaccharides produced on agar plates with those formed in biofilms developed on cellulose membranes. The investigation was conducted using NMR spectroscopy, high performance size exclusion chromatography, and gas chromatography coupled to mass spectrometry. The results showed that the Complex is capable of producing a variety of exopolysaccharides, most often in mixture, and that the most common exopolysaccharide is always cepacian. In addition, two novel polysaccharide structures were determined: one composed of mannose and rhamnose and another containing galactose and glucuronic acid. Comparison of exopolysaccharides obtained from cultures on

  7. Comparative electronic structure of a lanthanide and actinide diatomic oxide: Nd versus U

    NASA Astrophysics Data System (ADS)

    Krauss, M.; Stevens, W. J.

    2003-01-01

    Using a modified version of the Alchemy electronic structure code and relativistic pseudopotentials, the electronic structure of the ground and low lying excited states of UO, NdO, and NdO$^+$ have been calculated at the Hartree-Fock (HF) and multiconfiguration self-consistent field (MCSCF) levels of theory. Including results from an earlier study of UO$^+$ this provides the information for a comparative analysis of a lanthanide and an actinide diatomic oxide. UO and NdO are both described formally as M$^{+2}$O$^{-2}$ and the cations as M$^{+3}$O$^{-2}$, but the HF and MCSCF calculations show that these systems are considerably less ionic due to large charge back-transfer in the $\pi$ orbitals. The electronic states putatively arise from the ligand field (oxygen anion) perturbed $f^4$, $sf^3$, $df^3$, $sdf^2$, or $s^2f^2$ states of M$^{+2}$ and $f^3$, $sf^2$ or $df^2$ states of M$^{+3}$. Molecular orbital results show a substantial stabilization of the $sf^3$ or $s^2f^2$ configurations relative to the $f^4$ or $df^3$ configurations that are the even or odd parity ground states in the M$^{+2}$ free ion. The compact $f$ and $d$ orbitals are more destabilized by the anion field than the diffuse $s$ orbital. The ground states of the neutral species are dominated by orbitals arising from the M$^{+2}$ $sf^3$ term, and all the potential energy curves arising from this configuration are similar, which allows an estimate of the vibrational frequencies for UO and NdO of 862 cm$^{-1}$ and 836 cm$^{-1}$, respectively. For NdO$^+$ and UO$^+$ the excitation energies for the $\Omega$ states were calculated with a valence configuration interaction method using ab initio effective spin-orbit operators to couple the molecular orbital configurations. The results for NdO$^+$ are very comparable with the results for UO$^+$, and show the vibrational and electronic states to be interleaved.

  8. Composable languages for bioinformatics: the NYoSh experiment.

    PubMed

    Simi, Manuele; Campagne, Fabien

    2014-01-01

    Language WorkBenches (LWBs) are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell). NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment for NYoSh is

  9. Composable languages for bioinformatics: the NYoSh experiment

    PubMed Central

    Simi, Manuele; Campagne, Fabien

    2014-01-01

    Language WorkBenches (LWBs) are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell). NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment for NYoSh is

  10. SPOT--towards temporal data mining in medicine and bioinformatics.

    PubMed

    Tusch, Guenter; Bretl, Chris; O'Connor, Martin; Das, Amar

    2008-01-01

    Mining large clinical and bioinformatics databases often includes exploration of temporal data. For example, in liver transplantation, researchers might look for patients with an unusual time pattern of potential complications of the liver. In Knowledge-based Temporal Abstraction, time-stamped data points are transformed into an interval-based representation. We extended this framework by creating an open-source platform, SPOT. It supports the R statistical package and knowledge representation standards (OWL, SWRL) using the open source Semantic Web tool Protégé-OWL. PMID:18999225
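The transformation from time-stamped data points to an interval-based representation, as described in the abstract above, can be sketched in a few lines. The following is a minimal illustration under stated assumptions and is not SPOT's implementation: the `Interval` type, the `classify` and `abstract_intervals` functions, the thresholds and the `max_gap` rule are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    start: int   # time of the first point in the interval (e.g. day after transplant)
    end: int     # time of the last point in the interval
    state: str   # qualitative label shared by every point in the interval


def classify(value: float, low: float, high: float) -> str:
    """Map a numeric measurement to a qualitative state (assumed thresholds)."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def abstract_intervals(points: List[Tuple[int, float]],
                       low: float, high: float, max_gap: int) -> List[Interval]:
    """Merge consecutive points with the same state into intervals.

    Two neighbouring points are joined only if they share a state and are
    at most `max_gap` time units apart; otherwise a new interval starts.
    """
    intervals: List[Interval] = []
    for t, v in sorted(points):
        state = classify(v, low, high)
        last = intervals[-1] if intervals else None
        if last is not None and last.state == state and t - last.end <= max_gap:
            intervals[-1] = Interval(last.start, t, state)  # extend current interval
        else:
            intervals.append(Interval(t, t, state))         # open a new interval
    return intervals
```

A query such as "patients with a sustained high value in the first week" can then be phrased over intervals (state, start, end) instead of over raw measurements, which is the main appeal of the interval-based form.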

  11. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    NASA Astrophysics Data System (ADS)

    Djordjevic, Marko

    2005-11-01

    Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding the regulatory circuits that control expression of genes. We propose a novel, biophysics-based algorithm for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information-theory-based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance, and determination, of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely
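
    The role of saturation mentioned in this abstract can be sketched numerically. The code below is an illustration under stated assumptions, not the thesis algorithm: it assumes an additive, position-specific energy model with energies in units of kT, a saturating occupancy of the form 1 / (1 + exp(E - mu)), and a toy three-position matrix (`TOY_MATRIX`) invented for the example. The unsaturated weight exp(-(E - mu)) is shown for comparison.

```python
import math
from typing import Dict, List

# One dict per site position: energy penalty (in kT) for each base.
EnergyMatrix = List[Dict[str, float]]

# Hypothetical toy matrix; the consensus site "ACG" has zero energy.
TOY_MATRIX: EnergyMatrix = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 1.5, "C": 0.0, "G": 1.5, "T": 1.5},
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
]


def binding_energy(site: str, matrix: EnergyMatrix) -> float:
    """Sum the per-position energy contributions (additive model, an assumption)."""
    if len(site) != len(matrix):
        raise ValueError("site length must match matrix length")
    return sum(column[base] for base, column in zip(site, matrix))


def occupancy(energy: float, mu: float) -> float:
    """Saturating binding probability; always between 0 and 1."""
    return 1.0 / (1.0 + math.exp(energy - mu))


def boltzmann_weight(energy: float, mu: float) -> float:
    """Unsaturated approximation; exceeds 1 once energy drops below mu."""
    return math.exp(-(energy - mu))


if __name__ == "__main__":
    mu = 1.0  # hypothetical chemical potential, in kT
    for site in ("ACG", "AAG", "TTT"):
        e = binding_energy(site, TOY_MATRIX)
        print(site, round(e, 2), round(occupancy(e, mu), 4), round(boltzmann_weight(e, mu), 4))
```

    For weak sites (energy well above mu) the two expressions nearly coincide, while for strong sites the saturating form levels off below 1. In this form, mu acts as the binding threshold: a site with energy equal to mu is bound half of the time.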

  12. WU-Blast2 server at the European Bioinformatics Institute

    PubMed Central

    Lopez, Rodrigo; Silventoinen, Ville; Robinson, Stephen; Kibria, Asif; Gish, Warren

    2003-01-01

    Since 1995, the WU-BLAST programs (http://blast.wustl.edu) have provided a fast, flexible and reliable method for similarity searching of biological sequence databases. The software is in use at many locales and web sites. The European Bioinformatics Institute's WU-Blast2 (http://www.ebi.ac.uk/blast2/) server has been providing free access to these search services since 1997 and today supports many features that both enhance the usability and expand on the scope of the software. PMID:12824421

  13. A comparative analysis of the triloops in all high-resolution RNA structures reveals sequence–structure relationships

    PubMed Central

    Lisi, Véronique; Major, François

    2007-01-01

    Despite an increasing number of experimentally determined RNA structures, the gap between the number of structures and that of RNA families is still growing. To overcome this limitation, efficient and reliable RNA modeling methodologies must be developed. To reach this goal, we show here how triloop sequence–structure relationships have been inferred through a systematic analysis of all triloops found in available high-resolution structures. The structural annotation of all triloops allowed us to define discrete states of the triloop's conformational space, and therefore an explicit sequence-to-structure relation. The sequence–structure relationships inferred from this explicit relation are presented in a convenient modeling table that provides a limited set of possible three-dimensional structures given any triloop sequence. The table is indexed by the two nucleotides that form the triloop's flanking base pair, since they are shown to provide the most information about the triloop three-dimensional structures. We also report observations of important conformational variations in the X-ray crystallographic structures, which we believe might be the result of RNA dynamics. PMID:17652406

  14. Cytogenomic mapping and bioinformatic mining reveal interacting brain expressed genes for intellectual disability

    PubMed Central

    2014-01-01

    Background: Microarray analysis has been used as the first-tier genetic testing to detect chromosomal imbalances and copy number variants (CNVs) for pediatric patients with intellectual and developmental disabilities (ID/DD). To further investigate the candidate genes and underlying dosage-sensitive mechanisms related to ID, cytogenomic mapping of critical regions and bioinformatic mining of candidate brain-expressed genes (BEGs) and their functional interactions were performed. Critical regions of chromosomal imbalances and pathogenic CNVs were mapped by subtracting known benign CNVs from the Database of Genomic Variants (DGV) and extracting smallest overlap regions with cases from DatabasE of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources (DECIPHER). BEGs from these critical regions were revealed by functional annotation using Database for Annotation, Visualization, and Integrated Discovery (DAVID) and by tissue expression pattern from Uniprot. Cross-region interrelations and functional networks of the BEGs were analyzed using Gene Relationships Across Implicated Loci (GRAIL) and Ingenuity Pathway Analysis (IPA). Results: Of the 1,354 patients analyzed by oligonucleotide array comparative genomic hybridization (aCGH), pathogenic abnormalities were detected in 176 patients including genomic disorders in 66 patients (37.5%), subtelomeric rearrangements in 45 patients (25.6%), interstitial imbalances in 33 patients (18.8%), chromosomal structural rearrangements in 17 patients (9.7%) and aneuploidies in 15 patients (8.5%). Subtractive and extractive mapping defined 82 disjointed critical regions from the detected abnormalities. A total of 461 BEGs were generated from 73 disjointed critical regions. Enrichment of central nervous system specific genes in these regions was noted. The number of BEGs increased with the size of the regions. A list of 108 candidate BEGs with significant cross region interrelation was identified by GRAIL and five
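
    The subtractive and extractive mapping steps named in this abstract are, at their core, interval operations. The sketch below is only an illustration of those two operations, not the authors' pipeline: it assumes all regions lie on the same chromosome, uses half-open coordinates [start, end), and the function names and coordinates are hypothetical.

```python
from typing import List, Optional, Tuple

Region = Tuple[int, int]  # half-open interval [start, end) on a single chromosome


def subtract(region: Region, benign: List[Region]) -> List[Region]:
    """Return the parts of `region` not covered by any benign interval."""
    start, end = region
    remaining: List[Region] = []
    cursor = start  # left edge of the part not yet examined
    for b_start, b_end in sorted(benign):
        if b_end <= cursor:
            continue  # benign interval lies entirely to the left
        if b_start >= end:
            break     # benign interval lies entirely to the right
        if b_start > cursor:
            remaining.append((cursor, b_start))  # uncovered gap before this interval
        cursor = max(cursor, b_end)
        if cursor >= end:
            break
    if cursor < end:
        remaining.append((cursor, end))
    return remaining


def smallest_overlap(regions: List[Region]) -> Optional[Region]:
    """Return the segment shared by all regions, or None if there is none."""
    if not regions:
        return None
    start = max(r[0] for r in regions)
    end = min(r[1] for r in regions)
    return (start, end) if start < end else None


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(subtract((100, 500), [(50, 150), (200, 250), (480, 600)]))
    print(smallest_overlap([(100, 400), (250, 500), (200, 300)]))
```

    In practice each region would also carry a chromosome name and a copy-number direction (gain or loss), and overlaps would be computed per chromosome.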

  15. A comparative study of the proventricular structure in corbiculate apinae (Hymenoptera, Apidae).

    PubMed

    Serrão, J E

    2001-06-01

    The present study compares the proventricular structure, analyzed under a scanning electron microscope (SEM), among tribes of corbiculate Apinae. Fifty-one species of stingless bees (Meliponini), one species of honeybee (Apini), three species of bumblebees (Bombini) and seven species of orchid bees (Euglossini) were analyzed as the in-group, and one species of sphecid wasp (Sphecidae) and two species of Halictidae bees as out-groups. The proventricular bulb presents a basic morphology pattern similar to that of other Hymenoptera such as ants and wasps, being a symplesiomorphy for bees. The shape of proventricular folds constitutes a synapomorphy for Meliponini and an autapomorphy for Apini. The shape of hair-like projections of the cuticle that lines the proventriculus is a synapomorphy for Meliponini and Apini. These proventricular data corroborate the monophyly of the tribe Meliponini and the hypothesis that recognizes only one tribe for stingless bees. In addition, Meliponini+Apini constitutes a monophyletic group and Bombini+Euglossini another monophyletic group. The results confirm that internal morphology is a character that can be used in studies of insect phylogeny and that SEM is a powerful tool in these analyses. PMID:11070358

  16. Structural dynamics and inhibitor searching for Wnt-4 protein using comparative computational studies

    PubMed Central

    Hammad, Mirza A; Azam, Syed Sikander

    2015-01-01

    Wnt-4 (wingless mouse mammary tumor virus integration site-4) protein is involved in many crucial embryonic pathways regulating essential processes. Aberrant Wnt-4 activity causes various anomalies leading to gastric, colon, or breast cancer. Wnt-4 is a conserved protein in structure and sequence. All Wnt proteins contain an unusual fold comprising a thumb (or N-terminal domain) and an index finger (or C-terminal domain) bifurcated by a palm domain. The aim of this study was to identify the best inhibitors of Wnt-4 that interact not only with Wnt-4 protein but also with the covalently bound acyl group to inhibit aberrant Wnt-4 activity. A systematic computational approach was used to analyze inhibition of Wnt-4. Palmitoleic acid was docked into Wnt-4 protein, followed by ligand-based virtual screening of 209,847 compounds. Conformers were then generated for the 271 compounds that resulted from this extensive virtual screening, and comparative docking of the 10,531 conformers of these 271 unique compounds was performed through GOLD (Genetic Optimization for Ligand Docking), AutoDock-Vina, and FRED (Fast Rigid Exhaustive Docking). Linux scripts were used to handle the libraries of compounds. The best compounds were selected on the basis of having maximum interactions with the protein with bound palmitoleic acid. These represented lead inhibitors in further experiments. Palmitoleic acid is important for efficient Wnt activity, but aberrant Wnt-4 expression can be inhibited by designing inhibitors interacting with both protein and palmitoleic acid. PMID:25995617

  17. Comparative study of the hypotensive effect of a group of structural derivatives of glaucine.

    PubMed

    Todorov, S; Zamfirova, R

    1991-01-01

    A comparative study was made on the hypotensive effect of a group of dehydrogenated structural derivatives of the alkaloid glaucine. The compounds studied induced a slowly occurring marked decrease in the blood pressure. Applied intravenously, they did not manifest the initial brief and very pronounced phase of the hypotensive effect, typical of glaucine, and failed to change substantially the respiration and the cardiac activity of the experimental animals. The most marked hypotensive effect was demonstrated by 7-benzoyl-dehydroglaucine (DG4), which reduced the blood pressure by about 50 and 60% respectively, when applied in doses of 1 mg/kg and 2.5 mg/kg. Applied duodenally, the dehydrogenated glaucine derivatives also manifested a gradually occurring hypotensive effect, whereby DG4 again caused the most pronounced blood pressure drop. Depending on the DG4 and glaucine doses used, the pressor effects of noradrenaline (NA) and nicotine (NIC) were moderately to strongly suppressed or completely inhibited. In experiments on cat membrana nictitans glaucine also suppressed moderately (2.5 mg/kg) or markedly (5 mg/kg) the contractile effects of NIC and NA, while DG4 did not influence (1 mg/kg) or potentiated (2.5 mg/kg) these effects. PMID:1819921

  18. Genetic diversity at the Dhn3 locus in Turkish Hordeum spontaneum populations with comparative structural analyses

    PubMed Central

    Uçarlı, Cüneyt; McGuffin, Liam J.; Çaputlu, Süleyman; Aravena, Andres; Gürel, Filiz

    2016-01-01

    We analysed Hordeum spontaneum accessions from 21 different locations to understand the genetic diversity of HsDhn3 alleles and the effects of single base mutations on the intrinsically disordered structure of the resulting polypeptide (HsDHN3). HsDHN3 was found to be YSK2-type with a low-frequency 6-aa deletion at the beginning of Exon 1. There is relatively high diversity in the intron region of HsDhn3 compared to the two exon regions. We found that subtle differences in the K segments led to changes in the chemical properties of amino acids. Predictions for protein interaction profiles suggest the presence of a protein-binding site in HsDHN3 that coincides with the K1 segment. Comparison of DHN3 among closely related cereals showed that all of them contain a nuclear localization signal sequence flanking the K1 segment and a novel conserved region located between the S and K1 segments [E(D/T)DGMGGR]. We found that H. vulgare, H. spontaneum, and Triticum urartu DHN3s have a greater number of phosphorylation sites for protein kinase C than other cereal species, which may be related to stress adaptation. Our results show that the nature and extent of mutations in the conserved segments of K1 and K2 are likely to be key factors in protection of cells. PMID:26869072
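
    The conserved region reported in this abstract, E(D/T)DGMGGR, translates directly into a regular expression. The sketch below shows one way to scan a protein sequence for it; the function name `find_motif` is hypothetical and the example sequences are made up for illustration, not real dehydrin sequences.

```python
import re
from typing import List, Tuple

# E(D/T)DGMGGR from the abstract: the second residue may be D or T.
CONSERVED_MOTIF = re.compile(r"E[DT]DGMGGR")


def find_motif(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched text) for each motif occurrence."""
    sequence = protein.upper()
    return [(m.start() + 1, m.group()) for m in CONSERVED_MOTIF.finditer(sequence)]


if __name__ == "__main__":
    # Made-up sequences used only to exercise the pattern.
    print(find_motif("MAAEDDGMGGRKK"))
    print(find_motif("MAAEADGMGGRKK"))
```

    Positions are reported 1-based to match the usual residue numbering convention for protein sequences.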

  19. Dynamic partial reconfiguration implementation of the SVM/KNN multi-classifier on FPGA for bioinformatics application.

    PubMed

    Hussain, Hanaa M; Benkrid, Khaled; Seker, Huseyin

    2015-08-01

    Bioinformatics data tend to be highly dimensional in nature and thus impose significant computational demands. To resolve limitations of conventional computing methods, several alternative high performance computing solutions have been proposed by scientists, such as Graphical Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The latter have been shown to be efficient and high in performance. In recent years, FPGAs have been benefiting from the dynamic partial reconfiguration (DPR) feature, which adds the flexibility to alter specific regions within the chip. This work proposes combining the use of FPGAs and DPR to build a dynamic multi-classifier architecture that can be used in processing bioinformatics data. In bioinformatics, applying different classification algorithms to the same dataset is desirable in order to obtain a comparable, more reliable, consensus decision, but it can take a long time when performed on a conventional PC. The DPR implementations of two common classifiers, namely support vector machines (SVMs) and K-nearest neighbor (KNN), are combined to form a multi-classifier FPGA architecture which can use a specific region of the FPGA to work as either an SVM or a KNN classifier. This multi-classifier DPR implementation achieved a reduction of at least ~8x in reconfiguration time over the single non-DPR classifier implementation, and occupied less space and fewer hardware resources than having both classifiers. The proposed architecture can be extended to work as an ensemble classifier. PMID:26738068

  20. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    NASA Astrophysics Data System (ADS)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty with regard to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinformatics educational needs and goals as expressed by different authorities, five case studies of undergraduate bioinformatics degrees, educational implications of bioinformatics as a technoscience and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  1. Comparative Study of Structural Damage Under Irradiation in SiC Nano-structured and Conventional Ceramics

    SciTech Connect

    Leconte, Yann; Herlin-Boime, Nathalie; Reynaud, Cecile; Thome, Lionel

    2008-07-01

    In the context of research on new materials for next generation nuclear reactors, it becomes more and more interesting to know what can be the advantages of nano-structured materials for such applications. In this study, we performed irradiation experiments on micro-structured and nano-structured β-SiC samples, with 95 MeV Xe and 4 MeV Au ions. The structure of the samples was characterized before and after irradiation by grazing incidence X-ray diffraction and Raman spectroscopy. The results showed the occurrence of a synergy between electronic and nuclear energy loss in both samples with 95 MeV Xe ions, while the nano-structured pellet was found to have a better resistance to the irradiation with 4 MeV Au ions. (authors)

  2. Comparing Coarray Fortran (CAF) with MPI for several structured mesh PDE applications

    NASA Astrophysics Data System (ADS)

    Garain, Sudip; Balsara, Dinshaw S.; Reid, John

    2015-09-01

    Language-based approaches to parallelism have been incorporated into the Fortran standard. These Fortran extensions go under the name of Coarray Fortran (CAF) and full-featured compilers that support CAF have become available from Cray and Intel; the GNU implementation is expected in 2015. CAF combines elegance of expression with simplicity of implementation to yield an efficient parallel programming language. Elegance of expression results in very compact parallel code. The existence of a standard helps with portability and maintainability. CAF was designed to excel at one-sided communication, and similar functions that support one-sided communication are also available in the recent MPI-3 standard. One-sided communication is expected to be very valuable for structured mesh applications involving partial differential equations, amongst other possible applications. This paper focuses on a comparison of CAF and MPI for a few very useful application areas that are routinely used for solving partial differential equations on structured meshes. The three specific areas are Fast Fourier Techniques, Computational Fluid Dynamics, and Multigrid Methods. For each of those application areas, we have developed optimized CAF code and optimized MPI code that is based on the one-sided messaging capabilities of MPI-3. Weak scalability studies that compare CAF and MPI-3 are presented on up to 65,536 processors. Both paradigms scale well, showing that they are well-suited for Petascale-class applications. Some of the applications shown (like Fast Fourier Techniques and Computational Fluid Dynamics) require large, coarse-grained messaging. Such applications emphasize high bandwidth. Our other application (Multigrid Methods) uses pointwise smoothers which require a large amount of fine-grained messaging. In such applications, a premium is placed on low latency. Our studies show that both CAF and MPI-3 offer the twin advantages of high bandwidth and low latency for messages of all

  3. MOWServ: a web client for integration of bioinformatic resources

    PubMed Central

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  4. Web services at the European Bioinformatics Institute-2009

    PubMed Central

    Mcwilliam, Hamish; Valentin, Franck; Goujon, Mickael; Li, Weizhong; Narayanasamy, Menaka; Martin, Jenny; Miyar, Teresa; Lopez, Rodrigo

    2009-01-01

    The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees the user from web page constraints and is ideal for the analysis of large batches of data, performing text-mining tasks and the casual or systematic evaluation of mathematical models in regulatory networks. Furthermore, these services are widespread and easy to use; they require no prior knowledge of the technology and no more than basic experience in programming. In the following, we present new and updated services and briefly describe planned developments to be made available during the course of 2009–2010. PMID:19435877

  5. Translational Bioinformatics and Healthcare Informatics: Computational and Ethical Challenges

    PubMed Central

    Sethi, Prerna; Theodos, Kimberly

    2009-01-01

    Exponentially growing biological and bioinformatics data sets present a challenge and an opportunity for researchers to contribute to the understanding of the genetic basis of phenotypes. Due to breakthroughs in microarray technology, it is possible to simultaneously monitor the expressions of thousands of genes, and it is imperative that researchers have access to the clinical data to understand the genetics and proteomics of the diseased tissue. This technology could be a landmark in personalized medicine, which will provide storage for clinical and genetic data in electronic health records (EHRs). In this paper, we explore the computational and ethical challenges that emanate from the intersection of bioinformatics and healthcare informatics research. We describe the current situation of the EHR and its capabilities to store clinical and genetic data and then discuss the Genetic Information Nondiscrimination Act. Finally, we posit that the synergy obtained from the collaborative efforts between the genomics, clinical, and healthcare disciplines has potential to enhance and promote faster and more advanced breakthroughs in healthcare. PMID:20169020

  6. MOWServ: a web client for integration of bioinformatic resources.

    PubMed

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J; Claros, M Gonzalo; Trelles, Oswaldo

    2010-07-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user's tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  7. Making sense of genomes of parasitic worms: Tackling bioinformatic challenges.

    PubMed

    Korhonen, Pasi K; Young, Neil D; Gasser, Robin B

    2016-01-01

    Billions of people and animals are infected with parasitic worms (helminths). Many of these worms cause diseases that have a major socioeconomic impact worldwide, and are challenging to control because existing treatment methods are often inadequate. There is, therefore, a need to work toward developing new intervention methods, built on a sound understanding of parasitic worms at molecular level, the relationships that they have with their animal hosts and/or the diseases that they cause. Decoding the genomes and transcriptomes of these parasites brings us a step closer to this goal. The key focus of this article is to critically review and discuss bioinformatic tools used for the assembly and annotation of these genomes and transcriptomes, as well as various post-genomic analyses of transcription profiles, biological pathways, synteny, phylogeny, biogeography and the prediction and prioritisation of drug target candidates. Bioinformatic pipelines implemented and established recently provide practical and efficient tools for the assembly and annotation of genomes of parasitic worms, and will be applicable to a wide range of other parasites and eukaryotic organisms. Future research will need to assess the utility of long-read sequence data sets for enhanced genomic assemblies, and develop improved algorithms for gene prediction and post-genomic analyses, to enable comprehensive systems biology explorations of parasitic organisms. PMID:26956711

  8. Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions

    PubMed Central

    2014-01-01

    Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads. PMID:24428920

  9. Meeting Review: Bioinformatics and Medicine – From Molecules to Humans, Virtual and Real

    PubMed Central

    2002-01-01

    The Industrialization Workshop Series aims to promote and discuss integration, automation, simulation, quality, availability and standards in the high-throughput life sciences. The main issue addressed is the transformation of bioinformatics and bioinformatics-based drug design into a robust discipline in industry, the government, research institutes and academia. The latest workshop emphasized the influence of the post-genomic era on medicine and healthcare with reference to advanced biological systems modeling and simulation, protein structure research, protein-protein interactions, metabolism and physiology. Speakers included Michael Ashburner, Kenneth Buetow, Francois Cambien, Cyrus Chothia, Jean Garnier, Francois Iris, Matthias Mann, Maya Natarajan, Peter Murray-Rust, Richard Mushlin, Barry Robson, David Rubin, Kosta Steliou, John Todd, Janet Thornton, Pim van der Eijk, Michael Vieth and Richard Ward. PMID:18628854

  10. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    PubMed

    Ju, Feng; Zhang, Tong

    2015-11-01

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation. PMID:26451629

  11. The mechanical properties of various chemical vapor deposition diamond structures compared to the ideal single crystal

    NASA Astrophysics Data System (ADS)

    Hess, Peter

    2012-03-01

    The structural and electronic properties of the diamond lattice, leading to its outstanding mechanical properties, are discussed. These include the highest elastic moduli and fracture strength of any known material. Its extreme hardness is strongly connected with the extreme shear modulus, which even exceeds the large bulk modulus, revealing that diamond is more resistant to shear deformation than to volume changes. These unique features protect the ideal diamond lattice also against mechanical failure and fracture. Besides fast heat conduction, the fast vibrational movement of carbon atoms results in an extreme speed of sound and propagation of crack tips with comparable velocity. The ideal mechanical properties are compared with those of real diamond films, plates, and crystals, such as ultrananocrystalline (UNC), nanocrystalline, microcrystalline, and homo- and heteroepitaxial single-crystal chemical vapor deposition (CVD) diamond, produced by metastable synthesis using CVD. Ultrasonic methods have played and continue to play a dominant role in the determination of the linear elastic properties, such as elastic moduli of crystals or the Young's modulus of thin films with substantially varying impurity levels and morphologies. A surprising result of these extensive measurements is that even UNC diamond may approach the extreme Young's modulus of single-crystal diamond under optimized deposition conditions. The physical reasons for why the stiffness often deviates by no more than a factor of two from the ideal value are discussed, keeping in mind the large variety of diamond materials grown by various deposition conditions. Diamond is also known for its extreme hardness and fracture strength, despite its brittle nature. However, even for the best natural and synthetic diamond crystals, the measured critical fracture stress is one to two orders of magnitude smaller than the ideal value obtained by ab initio calculations for the ideal cubic lattice. Currently

  12. Comparative study of porous limestones used in heritage structures in Cyprus and in Hungary

    NASA Astrophysics Data System (ADS)

    Theodoridou, Magdalini; Ioannou, Ioannis; Rozgonyi-Boissinot, Nikoletta; Török, Ákos

    2015-04-01

    Porous limestone is widely used as a construction material in the monuments of Cyprus and Hungary. The present study compares the physical properties of a bioclastic limestone from Cyprus and an oolitic limestone from Hungary. Petra Gerolakkou is a Pliocene limestone from Cyprus that originates from the district of Nicosia, the island's capital. It has been extensively used throughout the years in construction and restoration projects, particularly in the Nicosia area. Distinctive examples of its use can be found in the majority of the most important historic monuments in Nicosia, such as the Venetian walls and fortifications, churches (e.g. the Agia Sofia Cathedral), the archbishop and presidential palaces and a large number of other traditional buildings. The studied Miocene limestone from Hungary was quarried at Sóskút (15-20 km W-SW of Budapest). The quarry provided stone for emblematic monuments of the capital of Hungary such as the Parliament building, Matthias Church, the Opera House and Citadella. In this study, mechanical parameters for both aforementioned stones, such as uniaxial compressive and tensile strengths, were tested under laboratory conditions. Their density, porosity and water absorption were also compared. The studied limestone from Cyprus exhibits porosity values within the range of 48-51%, apparent density between 1340 and 1400 kg/m³ and strength values under uniaxial compressive load between 1.2 and 2.8 MPa. This lithotype is also considered susceptible to salt decay, since an approximate mass loss of 12.5% is noted after 15 salt crystallization artificial weathering cycles. The porosity of the Hungarian limestone is in the range of 16-35%, the bulk density is 1600-1950 kg/m³, while the compressive strength is 2.5-15 MPa. Durability tests indicate that even after 10 freeze-thaw cycles the loss in strength is dramatic. Test results indicate that use of porous limestone in both countries is common and fabric strongly controls the

  13. Ensemble-based evaluation for protein structure models

    PubMed Central

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2016-01-01

    Motivation: Comparing protein tertiary structures is a fundamental procedure in structural biology and protein bioinformatics. Structure comparison is particularly important for evaluating computational protein structure models. Most model evaluation methods perform rigid-body superimposition of a structure model onto its crystal structure and measure the difference between the corresponding residue or atom positions. However, these methods neglect the intrinsic flexibility of proteins by treating the native structure as a rigid molecule. Because different parts of proteins have different levels of flexibility (for example, exposed loop regions are usually more flexible than the core region of a protein structure), disagreement of a model with the native structure needs to be evaluated differently depending on the flexibility of residues in a protein. Results: We propose a score named FlexScore for comparing protein structures that considers the flexibility of each residue in the native state of proteins. Flexibility information may be extracted from experiments such as NMR or from molecular dynamics simulation. FlexScore considers an ensemble of conformations of a protein described as a multivariate Gaussian distribution of atomic displacements and compares a query computational model with the ensemble. We compare FlexScore with other commonly used structure similarity scores over various examples. FlexScore agrees with experts’ intuitive assessment of computational models and provides information on the practical usefulness of models. Availability and implementation: https://bitbucket.org/mjamroz/flexscore Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307633
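
    The idea of scoring a model against an ensemble rather than against one rigid structure can be illustrated with a much simplified sketch. It is not the FlexScore implementation: it assumes all structures are already superimposed, uses one representative atom per residue, and replaces the multivariate Gaussian with an independent isotropic Gaussian per residue. The function names, the variance floor and the toy coordinates are hypothetical.

```python
import math


def residue_statistics(ensemble):
    """Per-residue mean position and isotropic variance over an ensemble.

    ensemble: list of conformations, each a list of (x, y, z) tuples,
    assumed to be superimposed already.
    """
    n_conf = len(ensemble)
    n_res = len(ensemble[0])
    means, variances = [], []
    for i in range(n_res):
        mean = tuple(sum(conf[i][k] for conf in ensemble) / n_conf for k in range(3))
        # Squared displacement from the mean, averaged over conformations and axes.
        var = sum(
            sum((conf[i][k] - mean[k]) ** 2 for k in range(3)) for conf in ensemble
        ) / (3 * n_conf)
        means.append(mean)
        variances.append(var)
    return means, variances


def flexibility_weighted_deviation(model, means, variances, floor=0.01):
    """Root mean of squared deviations, each divided by the residue's variance.

    The floor (an assumption of this sketch) avoids division by zero for
    residues that do not move at all in the ensemble.
    """
    total = 0.0
    for pos, mean, var in zip(model, means, variances):
        sq_dev = sum((pos[k] - mean[k]) ** 2 for k in range(3))
        total += sq_dev / max(var, floor)
    return math.sqrt(total / len(model))


# Toy ensemble: residue 0 is rigid, residue 1 swings along the y axis.
ensemble = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 2.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, -2.0, 0.0)],
]
means, variances = residue_statistics(ensemble)

# The same 1 A error placed on the rigid residue or on the flexible one.
error_on_rigid = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
error_on_flexible = [(0.0, 0.0, 0.0), (5.0, 1.0, 0.0)]
score_rigid = flexibility_weighted_deviation(error_on_rigid, means, variances)
score_flexible = flexibility_weighted_deviation(error_on_flexible, means, variances)
print(score_rigid, score_flexible)
```

    In this toy case the error on the flexible residue is penalized far less than the same error on the rigid residue, which is the behavior a plain RMSD after rigid superimposition cannot express.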

  14. Structure of parasite component communities of didelphid marsupials: insights from a comparative study.

    PubMed

    Jiménez, F Agustín; Catzeflis, François; Gardner, Scott L

    2011-10-01

    The parasite fauna of the gray four-eyed opossum, Philander opossum (Linnaeus, 1758), and the common opossum, Didelphis marsupialis Linnaeus, 1758, in Camp du Tigre, French Guiana, is characterized. Nine species from the gastrointestinal system were recovered from both species, which shared 80% of their parasites. The parasite fauna comprised several monoxenous species (63%) and was dominated by Aspidodera raillieti Travassos, 1914, which exhibited high levels of prevalence and abundance in both communities. Only 2 species (Moennigia sp. and Spirura guianensis) had been recorded in other species of mammals. Both species richness and taxonomic composition at the level of component communities from this locality were compared against 11 communities present in the Virginia (Didelphis virginiana), white-bellied (Didelphis albiventris), and common opossum from Argentina, Brazil, Mexico, and the United States. Neither host phylogeny nor taxonomy accounted for statistical differences in species richness. There was no statistical difference in species richness values among the 9 localities studied. Taxonomic similarity was analyzed by means of Jaccard's similarity index, including either all species or only common species (those occurring at a prevalence >10%). The results suggest that sympatric species of marsupials share more species of parasites than parasite communities occurring in conspecific marsupials from different localities. As a consequence, taxonomic composition of these parasite communities varied depending on the locality. Probably, marsupials of the monophyletic Didelphini offer the same compatibility toward their parasites, by presenting them with similar habitats. Subtle differences in lifestyles of the marsupials may determine the chance of encounter between the symbionts and prevent some parasites from completing their life cycles. Further and more rigorous tests are necessary to determine the roles of encounter and compatibility filters, as well as the role of
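
    The Jaccard similarity index used in this record is the number of species shared by two communities divided by the number of distinct species found in either. A minimal sketch follows; the species labels and prevalence values are invented for illustration and are not data from the study, and the helper names are hypothetical.

```python
def jaccard_index(community_a, community_b):
    """Jaccard similarity: shared species / distinct species in either community."""
    a, b = set(community_a), set(community_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty communities
    return len(a & b) / len(union)


def common_species(prevalence, threshold=0.10):
    """Keep species whose prevalence exceeds the threshold (>10% in the abstract)."""
    return {species for species, p in prevalence.items() if p > threshold}


# Hypothetical prevalence (fraction of hosts infected) in two host populations.
host_1 = {"sp_A": 0.90, "sp_B": 0.40, "sp_C": 0.05, "sp_D": 0.20}
host_2 = {"sp_A": 0.85, "sp_B": 0.30, "sp_D": 0.08, "sp_E": 0.15}

all_species = jaccard_index(host_1, host_2)
common_only = jaccard_index(common_species(host_1), common_species(host_2))
print(all_species, common_only)
```

    Restricting the comparison to common species can change the index in either direction, because rare species may be exactly the ones two communities share or the ones that set them apart.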

  15. Developing sustainable software solutions for bioinformatics by the “Butterfly” paradigm

    PubMed Central

    Ahmed, Zeeshan; Zeeshan, Saman; Dandekar, Thomas

    2014-01-01

    Software design and sustainable software engineering are essential for the long-term development of bioinformatics software. Typical challenges in an academic environment are short-term contracts, island solutions, pragmatic approaches and loose documentation. Upcoming new challenges are big data, complex data sets, software compatibility and rapid changes in data representation. Our approach to cope with these challenges consists of iterative intertwined cycles of development (“Butterfly” paradigm) for key steps in scientific software engineering. User feedback is valued, as is software planning in a sustainable and interoperable way. Tool usage should be easy and intuitive. A middleware supports a user-friendly Graphical User Interface (GUI) as well as independent database/tool development. We validated the approach on our own software development and compared the different design paradigms in various software solutions. PMID:25383181

  16. Integrating genomics, proteomics and bioinformatics in translational studies of molecular medicine.

    PubMed

    Ostrowski, Jerzy; Wyrwicz, Lucjan S

    2009-09-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, which are usually applied on the scale of single genes. Medicine in the postgenomic era will utilize thousands of disease-associated molecular markers provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical and bioinformatic analyses to model biological systems. Collecting, cataloging and comparing data from molecular studies, and the subsequent development of conclusions, creates the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm known as integrative genomics. PMID:19732006

  17. Microsatellites for next-generation ecologists: a post-sequencing bioinformatics pipeline.

    PubMed

    Fernandez-Silva, Iria; Whitney, Jonathan; Wainwright, Benjamin; Andrews, Kimberly R; Ylitalo-Ward, Heather; Bowen, Brian W; Toonen, Robert J; Goetze, Erica; Karl, Stephen A

    2013-01-01

    Microsatellites are the markers of choice for a variety of population genetic studies. The recent advent of next-generation pyrosequencing has drastically accelerated microsatellite locus discovery by providing a greater number of DNA sequencing reads at lower cost than other techniques. However, laboratory testing of PCR primers targeting potential microsatellite markers remains time-consuming and costly. Here we show how to reduce this workload by screening microsatellite loci via bioinformatic analyses prior to primer design. Our method emphasizes the importance of sequence quality, and we avoid loci associated with repetitive elements by screening with repetitive sequence databases available for a growing number of taxa. Testing with the Yellowstripe Goatfish Mulloidichthys flavolineatus and the marine planktonic copepod Pleuromamma xiphias, we show a higher success rate of primers selected by our pipeline in comparison to previous in silico microsatellite detection methodologies. Following the same pipeline, we discover and select microsatellite loci in nine additional species including fishes, sea stars, copepods and octopuses. PMID:23424642
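
    The in silico detection step that such a pipeline builds on can be sketched with a regular expression that finds perfect tandem repeats in a read. This is a generic illustration, not the authors' pipeline: the function name, the thresholds and the requirement of a minimum flank on each side (so that primers could be placed) are assumptions, and the quality and repetitive-element screening described in the abstract is not modeled.

```python
import re


def find_microsatellites(seq, min_repeats=5, min_flank=20):
    """Return (motif, repeat_count, start, end) for perfect 2-6 bp tandem repeats.

    Loci closer than min_flank bases to either end of the read are dropped,
    on the assumption that primers need flanking sequence on both sides.
    """
    # Lazy motif match so the shortest repeating unit is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a microsatellite motif
        start, end = match.span()
        if start < min_flank or len(seq) - end < min_flank:
            continue  # not enough flanking sequence for primer design
        hits.append((motif, (end - start) // len(motif), start, end))
    return hits


# Hypothetical read: 20 bp flank, (AC)8 repeat, 20 bp flank.
LEFT_FLANK = "GATTACAGGCTTAGCCTGAT"
RIGHT_FLANK = "GTTCAGGATCCTAGGCATTG"
read = LEFT_FLANK + "AC" * 8 + RIGHT_FLANK

hits = find_microsatellites(read)
print(hits)
```

    A read in which the repeat sits at the very start, such as `"AC" * 8 + RIGHT_FLANK`, yields no hit with these settings, because no primer could be placed upstream of the locus.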

  18. Microsatellites for Next-Generation Ecologists: A Post-Sequencing Bioinformatics Pipeline

    PubMed Central

    Fernandez-Silva, Iria; Whitney, Jonathan; Wainwright, Benjamin; Andrews, Kimberly R.; Ylitalo-Ward, Heather; Bowen, Brian W.; Toonen, Robert J.; Goetze, Erica; Karl, Stephen A.

    2013-01-01

    Microsatellites are the markers of choice for a variety of population genetic studies. The recent advent of next-generation pyrosequencing has drastically accelerated microsatellite locus discovery by providing a greater number of DNA sequencing reads at lower cost than other techniques. However, laboratory testing of PCR primers targeting potential microsatellite markers remains time-consuming and costly. Here we show how to reduce this workload by screening microsatellite loci via bioinformatic analyses prior to primer design. Our method emphasizes the importance of sequence quality, and we avoid loci associated with repetitive elements by screening with repetitive sequence databases available for a growing number of taxa. Testing with the Yellowstripe Goatfish Mulloidichthys flavolineatus and the marine planktonic copepod Pleuromamma xiphias, we show a higher success rate of primers selected by our pipeline in comparison to previous in silico microsatellite detection methodologies. Following the same pipeline, we discover and select microsatellite loci in nine additional species including fishes, sea stars, copepods and octopuses. PMID:23424642

  19. Graph kernels, hierarchical clustering, and network community structure: experiments and comparative analysis

    NASA Astrophysics Data System (ADS)

    Zhang, S.; Ning, X.-M.; Zhang, X.-S.

    2007-05-01

    There has been rapidly growing interest in properties of complex networks, such as the small-world property, power-law degree distribution, network transitivity, and community structure, which seem to be common to many real-world networks. In this study, we consider the community property, which is also found in many real networks. Based on the diffusion kernels of networks, a hierarchical clustering approach is proposed to uncover community structure at different levels in complex networks. We test the method on some networks with known community structures and find that it can detect significant community structure in these networks. Comparison with related methods shows the effectiveness of the method.
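
    The combination named in this record can be sketched in a few lines: compute a diffusion kernel K = exp(-beta L) from the graph Laplacian L = D - A, treat K as a node-to-node similarity, and merge clusters agglomeratively. The sketch below is a generic illustration under assumptions, not the authors' algorithm: average linkage, the value of beta, the truncated Taylor series for the matrix exponential and the six-node test graph are all choices made here.

```python
def diffusion_kernel(adj, beta=0.5, terms=40):
    """Approximate K = exp(-beta * L), L = D - A, by a truncated Taylor series.

    adj: symmetric 0/1 adjacency matrix (list of lists) with a zero diagonal.
    Suitable only for small graphs and moderate beta.
    """
    n = len(adj)
    # m = -beta * L, i.e. beta * (A - D)
    m = [
        [beta * (adj[i][j] - (sum(adj[i]) if i == j else 0.0)) for j in range(n)]
        for i in range(n)
    ]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms):
        # After this update, term equals m**k / k!
        term = [
            [sum(term[i][p] * m[p][j] for p in range(n)) / k for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel


def average_linkage(kernel, n_clusters):
    """Agglomerative clustering that merges the pair with highest mean similarity."""
    clusters = [[i] for i in range(len(kernel))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = sum(kernel[i][j] for i in clusters[a] for j in clusters[b])
                sim /= len(clusters[a]) * len(clusters[b])
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Test graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

kernel = diffusion_kernel(adj)
communities = average_linkage(kernel, n_clusters=2)
print(communities)
```

    On this graph the kernel similarity between nodes of the same triangle is higher than across the bridging edge, so stopping at two clusters recovers the two triangles. Stopping at other cluster counts gives coarser or finer levels of the same hierarchy.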

  20. Comparing composite materials with structural steels in the design of the optical support structure of very large telescopes

    NASA Astrophysics Data System (ADS)

    Cheng, Andrew Y.; Li, Robert K.

    1992-03-01

    The method of finite element analysis is used to study some candidate composite materials: carbon fiber reinforced epoxy and glass fiber reinforced epoxy. These composites may have real applications in the design of the optical support structures of very large telescopes, where stringent thermomechanical stability is needed. The lightweight property of these materials allows one to build very stiff members for the optical support to withstand the structural deflections due to wind, vibration, and gravity. We have run finite element models of these composites using ABAQUS on a VAX VMS computer. Simple beams with rectangular cross-sections were computed for the composites, with structural steel as a comparison. The static properties of these beams were studied.
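
    Why a lightweight material helps against gravity deflection can be seen from elementary beam theory, without a finite element model. For a uniform cantilever of rectangular cross-section sagging under its own weight, the tip deflection is w L^4 / (8 E I), which reduces to 1.5 rho g L^4 / (E h^2): it depends on the ratio of density to Young's modulus, not on the width. The sketch below assumes a cantilever boundary condition and uses rounded, assumed material values. None of these numbers come from the paper, and real composite laminates are anisotropic.

```python
G = 9.81  # gravitational acceleration, m/s^2


def self_weight_tip_deflection(length, width, height, youngs_modulus, density):
    """Tip deflection (m) of a uniform cantilever under its own weight.

    Euler-Bernoulli result for a uniformly distributed load: w L^4 / (8 E I).
    """
    area = width * height
    inertia = width * height ** 3 / 12.0  # second moment of area of a rectangle
    load_per_length = density * G * area  # N/m
    return load_per_length * length ** 4 / (8.0 * youngs_modulus * inertia)


# Assumed, rounded material values for illustration only (not from the paper).
MATERIALS = {
    "structural steel": {"E": 200e9, "rho": 7850.0},
    "carbon fiber/epoxy (assumed)": {"E": 100e9, "rho": 1600.0},
}

# Hypothetical beam: 2 m long, 50 mm wide, 100 mm deep.
deflections = {
    name: self_weight_tip_deflection(2.0, 0.05, 0.10, p["E"], p["rho"])
    for name, p in MATERIALS.items()
}
for name, d in deflections.items():
    print(f"{name}: tip deflection {d * 1e3:.3f} mm")
```

    With these assumed values the composite beam sags less than the steel beam even though its modulus is lower, because its density-to-modulus ratio is smaller.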