Science.gov

Sample records for comparative structural bioinformatics

  1. A Comparative Structural Bioinformatics Analysis of the Insulin Receptor Family Ectodomain Based on Phylogenetic Information

    PubMed Central

    Rentería, Miguel E.; Gandhi, Neha S.; Vinuesa, Pablo; Helmerhorst, Erik; Mancera, Ricardo L.

    2008-01-01

    The insulin receptor (IR), the insulin-like growth factor 1 receptor (IGF1R) and the insulin receptor-related receptor (IRR) are covalently-linked homodimers made up of several structural domains. The molecular mechanism of ligand binding to the ectodomain of these receptors and the resulting activation of their tyrosine kinase domain is still not well understood. We have carried out an amino acid residue conservation analysis in order to reconstruct the phylogeny of the IR Family. We have confirmed the location of ligand binding site 1 of the IGF1R and IR. Importantly, we have also predicted the likely location of the insulin binding site 2 on the surface of the fibronectin type III domains of the IR. An evolutionarily conserved surface on the second leucine-rich domain that may interact with the ligand could not be detected. We suggest a possible mechanical trigger of the activation of the IR that involves a slight ‘twist’ rotation of the last two fibronectin type III domains in order to face the likely location of insulin. Finally, a strong selective pressure was found amongst the IRR orthologous sequences, suggesting that this orphan receptor has a yet unknown physiological role which may be conserved from amphibians to mammals. PMID:18989367

  2. Teaching Structural Bioinformatics at the Undergraduate Level

    ERIC Educational Resources Information Center

    Centeno, Nuria B.; Villa-Freixa, Jordi; Oliva, Baldomero

    2003-01-01

    Understanding the basic principles of structural biology is becoming a major subject of study in most undergraduate level programs in biology. In the genomic and proteomic age, it is becoming indispensable for biology students to master concepts related to the sequence and structure of proteins in order to develop skills that may be useful in a…

  3. A multi-species comparative structural bioinformatics analysis of inherited mutations in α-D-Mannosidase reveals strong genotype-phenotype correlation

    PubMed Central

    2009-01-01

    Background Lysosomal α-mannosidase is an enzyme that acts to degrade N-linked oligosaccharides and hence plays an important role in mannose metabolism in humans and other mammalian species, especially livestock. Mutations in the gene (MAN2B1) encoding lysosomal α-D-mannosidase cause improper coding, resulting in dysfunctional or non-functional protein, causing the disease α-mannosidosis. Mapping disease mutations to the structure of the protein can help in understanding the functional consequences of these mutations and thus indirectly, the finer aspects of the pathology and clinical manifestations of the disease, including phenotypic severity as a function of the genotype. Results A comprehensive homology modeling study of all the wild-type and inherited mutations of lysosomal α-mannosidase in four different species, human, cow, cat and guinea pig, reveals a significant correlation between the severity of the genotype and the phenotype in α-mannosidosis. We used the X-ray crystallographic structure of bovine lysosomal α-mannosidase as template, containing only two disulphide bonds and some ligands, to build structural models of wild-type structures with four disulfide linkages and all bound ligands. These wild-type models were then used as templates for disease mutations. All the truncations and substitutions involving the residues in and around the active site and those that destabilize the fold led to severe genotypes resulting in lethal phenotypes, whereas the mutations lying away from the active site were milder in both their genotypic and phenotypic expression. Conclusion Based on the co-location of mutations from different organisms and their proximity to the enzyme active site, we have extrapolated observed mutations from one species to homologous positions in other organisms, as a predictive approach for detecting likely α-mannosidosis. Besides predicting new disease mutations, this approach also provides a way for detecting mutation hotspots in the

  4. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation performed on the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to the reference annotation.
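
    The abstract does not say which measure was used to compare predictions with the reference annotation. As a minimal sketch, assuming a nucleotide-level comparison of exon coordinates (a common choice, not necessarily the authors'), agreement could be computed as below. The function names and the toy coordinates are hypothetical.

```python
def covered_positions(intervals):
    """Return the set of positions covered by inclusive (start, end) intervals."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_agreement(predicted, reference):
    """Return (sensitivity, precision) of predicted exons against reference exons.

    sensitivity = shared positions / reference positions
    precision   = shared positions / predicted positions
    """
    pred = covered_positions(predicted)
    ref = covered_positions(reference)
    shared = len(pred & ref)
    sensitivity = shared / len(ref) if ref else 0.0
    precision = shared / len(pred) if pred else 0.0
    return sensitivity, precision


# Toy data, not taken from the cucumber data set.
reference_exons = [(100, 199), (300, 399)]
predicted_exons = [(100, 199), (320, 419)]

sens, prec = nucleotide_agreement(predicted_exons, reference_exons)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

    Running the same comparison once per gene finder would give one pair of values per program, which is one simple way to make discrepancies between tools visible.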

  5. Observation selection bias in contact prediction and its implications for structural bioinformatics

    PubMed Central

    Orlando, G.; Raimondi, D.; Vranken, W. F.

    2016-01-01

    Next Generation Sequencing is dramatically increasing the number of known protein sequences, with related experimentally determined protein structures lagging behind. Structural bioinformatics is attempting to close this gap by developing approaches that predict structure-level characteristics for uncharacterized protein sequences, with most of the developed methods relying heavily on evolutionary information collected from homologous sequences. Here we show that there is a substantial observational selection bias in this approach: the predictions are validated on proteins with known structures from the PDB, but exactly for those proteins significantly more homologs are available compared to less studied sequences randomly extracted from Uniprot. Structural bioinformatics methods that were developed this way are thus likely to have over-estimated performances; we demonstrate this for two contact prediction methods, where performances drop up to 60% when taking into account a more realistic amount of evolutionary information. We provide a bias-free dataset for the validation for contact prediction methods called NOUMENON. PMID:27857150

  6. FLAGdb(++): A Bioinformatic Environment to Study and Compare Plant Genomes.

    PubMed

    Tamby, Jean Philippe; Brunaud, Véronique

    2017-01-01

    Today, the growing knowledge and data accumulation on plant genomes do not solve in a simple way the task of gene function inference. Because data of different types are coming from various sources, we need to integrate and analyze them to help biologists in this task. We created FLAGdb(++) ( http://tools.ips2.u-psud.fr/FLAGdb ) to take up this challenge for a selection of plant genomes. In order to enrich gene function predictions, structural and functional annotations of the genomes are explored to generate meta-data and to compare them. Since data are numerous and complex, we focused on accessibility and visualization with an original and user-friendly interface. In this chapter we present the main tools of FLAGdb(++) and a use-case to explore a gene family: structural and functional properties of this family and the search for orthologous genes in the other plant genomes.

  7. Achievements and challenges in structural bioinformatics and computational biophysics

    PubMed Central

    Samish, Ilan; Bourne, Philip E.; Najmanovich, Rafael J.

    2015-01-01

    Motivation: The field of structural bioinformatics and computational biophysics has undergone a revolution in the last 10 years. These developments are captured annually through the 3DSIG meeting, upon which this article reflects. Results: An increase in the accessible data, computational resources and methodology has resulted in an increase in the size and resolution of studied systems and the complexity of the questions amenable to research. Concomitantly, the parameterization and efficiency of the methods have markedly improved along with their cross-validation with other computational and experimental results. Conclusion: The field exhibits an ever-increasing integration with biochemistry, biophysics and other disciplines. In this article, we discuss recent achievements along with current challenges within the field. Contact: Rafael.Najmanovich@USherbrooke.ca PMID:25488929

  8. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    ERIC Educational Resources Information Center

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  9. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    PubMed

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements of a variety of archaea and bacteria and indicative of the bacterial ecological adaptation, conferring acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through a bioinformatics approach. Among 107 Shigella strains, 434 CRISPR structure loci were identified, with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. Deletions of cas genes were discovered in 88 strains. One strain did not contain any cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget scores, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Type I spacers were those linked with one gene segment, Type II spacers those linked with two or more different gene segments, and Type III spacers those left undefined. This study examined the diversity of the CRISPR/cas system in Shigella strains and demonstrated the main features of CRISPR structure and spacer classification, providing critical information for elucidating the mechanisms of spacer formation and exploring the role the spacers play in the function of the CRISPR/cas system.
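
    The three-way spacer classification described in this abstract can be sketched as a simple rule on the number of distinct gene segments a spacer is linked with. This is only an illustration: how the study derived those links from sequence signature, BLAST and CRISPRTarget scores is not shown here, and the function name, spacer identifiers and gene segment names below are hypothetical.

```python
def classify_spacer(linked_gene_segments):
    """Classify a spacer by the number of distinct gene segments linked to it.

    Type I   : linked with exactly one gene segment
    Type II  : linked with two or more different gene segments
    Type III : undefined (no linked gene segment)
    """
    distinct = set(linked_gene_segments)
    if len(distinct) == 1:
        return "Type I"
    if len(distinct) >= 2:
        return "Type II"
    return "Type III"


# Toy input: spacer identifier -> gene segments it was linked with.
spacer_hits = {
    "spacer_001": ["segment_A"],
    "spacer_002": ["segment_A", "segment_B"],
    "spacer_003": [],
    "spacer_004": ["segment_C", "segment_C"],  # repeated hit, one distinct segment
}

for spacer_id, segments in spacer_hits.items():
    print(spacer_id, classify_spacer(segments))
```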

  10. Recent progress on structural bioinformatics research of cytochrome P450 and its impact on drug discovery.

    PubMed

    Zhang, Tao; Wei, Dongqing

    2015-01-01

    Cytochrome P450 is predominantly responsible for human drug metabolism, which is of critical importance for drug discovery and development. Structural bioinformatics focuses on analysis and prediction of the three-dimensional structure of biological macromolecules and elucidation of structure-function relationships as well as identification of important binding interactions. Structural bioinformatics has advanced rapidly over the last decade. With more information available for CYP structures, the methods of structural bioinformatics may be used in the CYP field. In this review, we present three previous studies on CYP using the methods of structural bioinformatics, including the investigation of reasons for the decrease in enzymatic activity of CYP1A2 caused by a peripheral mutation, the construction of a pharmacophore model specific to the active site of CYP1A2 and the prediction of the functional consequences of single residue mutations in CYP. By illustrating these studies we attempt to show the potential role of structural bioinformatics in CYP research and to promote a better understanding of the importance of structural bioinformatics in drug design.

  11. Computer programming and biomolecular structure studies: A step beyond internet bioinformatics.

    PubMed

    Likić, Vladimir A

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled Biomolecular Structure and Bioinformatics. Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics approach that relies on access to the Internet and biological databases. This was an ambitious approach considering that the students mostly had a biological background. There were also time constraints of eight lectures in total and two accompanying practical sessions. The main challenge was that students had to be introduced to computer programming from a beginner level and in a short time provided with enough knowledge to independently solve a simple bioinformatics problem. This was accomplished with a problem directly relevant to the rest of the subject, concerned with the structure-function relationships and experimental techniques for the determination of macromolecular structure.

  12. Structural and evolutionary bioinformatics of the SPOUT superfamily of methyltransferases

    PubMed Central

    Tkaczuk, Karolina L; Dunin-Horkawicz, Stanislaw; Purta, Elzbieta; Bujnicki, Janusz M

    2007-01-01

    Background SPOUT methyltransferases (MTases) are a large class of S-adenosyl-L-methionine-dependent enzymes that exhibit an unusual alpha/beta fold with a very deep topological knot. In 2001, when no crystal structures were available for any of these proteins, Anantharaman, Koonin, and Aravind identified homology between SpoU and TrmD MTases and defined the SPOUT superfamily. Since then, multiple crystal structures of knotted MTases have been solved and numerous new homologous sequences appeared in the databases. However, no comprehensive comparative analysis of these proteins has been carried out to classify them based on structural and evolutionary criteria and to guide functional predictions. Results We carried out extensive searches of databases of protein structures and sequences to collect all members of previously identified SPOUT MTases, and to identify previously unknown homologs. Based on sequence clustering, characterization of domain architecture, structure predictions and sequence/structure comparisons, we re-defined families within the SPOUT superfamily and predicted putative active sites and biochemical functions for the so far uncharacterized members. We have also delineated the common core of SPOUT MTases and inferred a multiple sequence alignment for the conserved knot region, from which we calculated the phylogenetic tree of the superfamily. We have also studied phylogenetic distribution of different families, and used this information to infer the evolutionary history of the SPOUT superfamily. Conclusion We present the first phylogenetic tree of the SPOUT superfamily since it was defined, together with a new scheme for its classification, and discussion about conservation of sequence and structure in different families, and their functional implications. We identified four protein families as new members of the SPOUT superfamily. Three of these families are functionally uncharacterized (COG1772, COG1901, and COG4080), and one (COG1756

  13. The role of structural bioinformatics resources in the era of integrative structural biology

    PubMed Central

    Gutmanas, Aleksandras; Oldfield, Thomas J.; Patwardhan, Ardan; Sen, Sanchayita; Velankar, Sameer; Kleywegt, Gerard J.

    2013-01-01

    The history and the current state of the PDB and EMDB archives are briefly described, as well as some of the challenges that they face. It seems natural that the role of structural biology archives will change from being a pure repository of historic data into becoming an indispensable resource for the wider biomedical community. As part of this transformation, it will be necessary to validate the biomacromolecular structure data and ensure the highest possible quality for the archive holdings, to combine structural data from different spatial scales into a unified resource and to integrate structural data with functional, genetic and taxonomic data as well as other information available in bioinformatics resources. Some recent developments and plans to address these challenges at PDBe are presented. PMID:23633580

  14. Bioinformatics and variability in drug response: a protein structural perspective

    PubMed Central

    Lahti, Jennifer L.; Tang, Grace W.; Capriotti, Emidio; Liu, Tianyun; Altman, Russ B.

    2012-01-01

    Marketed drugs frequently perform worse in clinical practice than in the clinical trials on which their approval is based. Many therapeutic compounds are ineffective for a large subpopulation of patients to whom they are prescribed; worse, a significant fraction of patients experience adverse effects more severe than anticipated. The unacceptable risk–benefit profile for many drugs mandates a paradigm shift towards personalized medicine. However, prior to adoption of patient-specific approaches, it is useful to understand the molecular details underlying variable drug response among diverse patient populations. Over the past decade, progress in structural genomics led to an explosion of available three-dimensional structures of drug target proteins while efforts in pharmacogenetics offered insights into polymorphisms correlated with differential therapeutic outcomes. Together these advances provide the opportunity to examine how altered protein structures arising from genetic differences affect protein–drug interactions and, ultimately, drug response. In this review, we first summarize structural characteristics of protein targets and common mechanisms of drug interactions. Next, we describe the impact of coding mutations on protein structures and drug response. Finally, we highlight tools for analysing protein structures and protein–drug interactions and discuss their application for understanding altered drug responses associated with protein structural variants. PMID:22552919

  15. Integrating structure, bioinformatics and enzymology to discover function : BioH, a new carboxylesterase from E. coli.

    SciTech Connect

    Sanishvili, R.; Yakunin, A. F.; Laskowski, R. A.; Skarina, T.; Evdokimova, E.; Doherty-Kirby, A.; Lajoie, G. A.; Thornton, J. M.; Arrowsmith, C. H.; Savchenko, A.; Joachimiak, A.; Edwards, A. M.; Univ. of Toronto; Clinical Genomics Centre European Bioinformatics Inst.; Univ. of Western Ontario

    2003-07-11

    Structural proteomics projects are generating three-dimensional structures of novel, uncharacterized proteins at an increasing rate. However, structure alone is often insufficient to deduce the specific biochemical function of a protein. Here we determined the function for a protein using a strategy that integrates structural and bioinformatics data with parallel experimental screening for enzymatic activity. BioH is involved in biotin biosynthesis in Escherichia coli and had no previously known biochemical function. The crystal structure of BioH was determined at 1.7 Å resolution. An automated procedure was used to compare the structure of BioH with structural templates from a variety of different enzyme active sites. This screen identified a catalytic triad (Ser82, His235, and Asp207) with a configuration similar to that of the catalytic triad of hydrolases. Analysis of BioH with a panel of hydrolase assays revealed a carboxylesterase activity with a preference for short acyl chain substrates. The combined use of structural bioinformatics with experimental screens for detecting enzyme activity could greatly enhance the rate at which function is determined from structure.

  16. E-MSD: the European Bioinformatics Institute Macromolecular Structure Database

    PubMed Central

    Boutselakis, H.; Dimitropoulos, D.; Fillon, J.; Golovin, A.; Henrick, K.; Hussain, A.; Ionides, J.; John, M.; Keller, P. A.; Krissinel, E.; McNeil, P.; Naim, A.; Newman, R.; Oldfield, T.; Pineda, J.; Rachedi, A.; Copeland, J.; Sitnov, A.; Sobhany, S.; Suarez-Uruena, A.; Swaminathan, J.; Tagari, M.; Tate, J.; Tromm, S.; Velankar, S.; Vranken, W.

    2003-01-01

    The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool. PMID:12520052

  17. [Research thoughts on structural components of Chinese medicine combined with bioinformatics].

    PubMed

    Wang, Cheng-cheng; Feng, Liang; Liu, Dan; Cui, Li; Tan, Xiao-bin; Jia, Xiao-bin

    2015-11-01

    Traditional Chinese medicine (TCM) is a complex system characterized by integrity and distinctive features. Structural component TCM is a well-organized whole of traditional Chinese medicine, reflecting the multi-component integration effect of TCM. It gives us a new view of the material basis of TCM. Currently, conventional research strategies are not enough to deal with the relationship between material basis and efficacy, or with multi-composition, multi-target and multi-section mechanisms. The post-genome era gave birth to bioinformatics, which involves systems biology, different levels of omics, and the corresponding mathematics and computer techniques. It is increasingly becoming a powerful tool for understanding complicated systems and the essential laws of life. The research ideas, methods and data mining technology of bioinformatics, combined with the theory of structural components of Chinese medicine, bring a new opportunity for developing structural components of Chinese medicine, systematically exploring the essence of TCM and promoting the modernization of TCM.

  18. Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery

    PubMed Central

    Blundell, Tom L; Sibanda, Bancinyane L; Montalvão, Rinaldo Wander; Brewerton, Suzanne; Chelliah, Vijayalakshmi; Worth, Catherine L; Harmer, Nicholas J; Davies, Owen; Burke, David

    2006-01-01

    Impressive progress in genome sequencing, protein expression and high-throughput crystallography and NMR has radically transformed the opportunities to use protein three-dimensional structures to accelerate drug discovery, but the quantity and complexity of the data have ensured a central place for informatics. Structural biology and bioinformatics have assisted in lead optimization and target identification where they have well established roles; they can now contribute to lead discovery, exploiting high-throughput methods of structure determination that provide powerful approaches to screening of fragment binding. PMID:16524830

  19. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    ERIC Educational Resources Information Center

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  20. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    PubMed Central

    2002-01-01

    Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit. PMID:12401134

  1. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    PubMed Central

    Ragothaman, Anjani; Feinstein, Wei; Jha, Shantenu; Kim, Joohyun

    2014-01-01

    While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the predicted structural information could uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as those of prokaryotes is large, typically many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and is particularly amenable to small genomes such as those of prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure. PMID:24995285

  2. An Introductory Bioinformatics Exercise to Reinforce Gene Structure and Expression and Analyze the Relationship between Gene and Protein Sequences

    ERIC Educational Resources Information Center

    Almeida, Craig A.; Tardiff, Daniel F.; De Luca, Jane P.

    2004-01-01

    We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces the understanding of the structure of a gene and the principles and events involved in its expression. In addition, the activity illustrates the severe effect mutations in a gene sequence can have on the protein product.…

  3. Comparative bioinformatics, temporal and spatial expression analyses of Ixodes scapularis organic anion transporting polypeptides

    PubMed Central

    Radulović, Željko; Porter, Lindsay M.; Kim, Tae K.; Mulenga, Albert

    2015-01-01

    Organic anion-transporting polypeptides (Oatps) are an integral part of the detoxification mechanism in vertebrates and invertebrates. These cell surface proteins are involved in mediating the sodium-independent uptake and/or distribution of a broad array of organic amphipathic compounds and xenobiotic drugs. This study describes bioinformatics and biological characterization of 9 Oatp sequences in the Ixodes scapularis genome. These sequences have been annotated on the basis of 12 transmembrane domains, consensus motif D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, and 11 conserved cysteine amino acid residues in the large extracellular loop 5 that characterize the Oatp superfamily. Ixodes scapularis Oatps may regulate non-redundant cross-tick species conserved functions in that they did not cluster as a monolithic group on the phylogeny tree and that they have orthologs in other ticks. Phylogeny clustering patterns also suggest that some tick Oatp sequences transport substrates that are similar to those of body louse, mosquito, eye worm, and filarial worm Oatps. Semi-quantitative RT-PCR analysis demonstrated that all 9 I. scapularis Oatp sequences were expressed during tick feeding. Ixodes scapularis Oatp genes potentially regulate functions during early and/or late-stage tick feeding as revealed by normalized mRNA profiles. Normalized transcript abundance indicates that I. scapularis Oatp genes are strongly expressed in unfed ticks during the first 24 h of feeding and/or at the end of the tick feeding process. Except for 2 I. scapularis Oatps, which were expressed in the salivary glands and ovaries, all other genes were expressed in all tested organs, suggesting the significance of I. scapularis Oatps in maintaining tick homeostasis. Different I. scapularis Oatp mRNA expression patterns were detected and discussed with reference to different physiological states of unfed and feeding ticks. PMID:24582512
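
    The consensus motif quoted in this abstract, D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, can be turned into a regular expression for scanning protein sequences. The sketch below assumes the usual reading of this notation (X is any residue, a parenthesized comma-separated group is a set of alternatives at one position); the helper name and the toy sequence are hypothetical and are not taken from the Ixodes scapularis data.

```python
import re


def motif_to_regex(motif):
    """Convert a dash-separated consensus motif into a compiled regular expression.

    "X" becomes "." (any residue), "(I,V)" becomes "[IV]",
    and any other token is matched literally.
    """
    parts = []
    for token in motif.split("-"):
        if token == "X":
            parts.append(".")
        elif token.startswith("(") and token.endswith(")"):
            parts.append("[" + token[1:-1].replace(",", "") + "]")
        else:
            parts.append(token)
    return re.compile("".join(parts))


OATP_MOTIF = "D-X-RW-(I,V)-GAWW-X-G-(F,L)-L"
pattern = motif_to_regex(OATP_MOTIF)

# Toy sequence with the motif embedded after three leading residues.
toy_sequence = "MKT" + "DSRWIGAWWLGFL" + "AAA"
match = pattern.search(toy_sequence)
if match:
    print(pattern.pattern, match.start(), match.group())
```

    A regular expression only reports exact matches to the consensus. Annotation of real sequences would normally also rely on the transmembrane domain and conserved cysteine criteria mentioned in the abstract.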

  4. Determination of Lipid-Protein Interactions in Lung Surfactants Using Computer Simulations and Structural Bioinformatics.

    NASA Astrophysics Data System (ADS)

    Kaznessis, Yiannis

    2001-06-01

    Proteins are the primary components of the networks that conduct the flows of mass, energy and information in living organisms. The discovery of the principles of protein structure and function allows the development of design rules for biological activities. The microscopic nature of the operating mechanisms of protein activity, and the vast complexity of the networks of interaction call for the employment of powerful computational methodologies that can decipher the physicochemical and evolutionary principles underlying protein structure and function. An example will be presented that reflects the strength of computational approaches. Atomistic molecular dynamics simulations and structural bioinformatics tools are employed to investigate the interactions between the first 25 N-terminal residues of surfactant protein B (SP-B 1-25) and the lipid components of the lung surfactant (LS). An understanding of the molecular level interactions between the LS components is essential for the establishment of design rules for the development of synthetic LS and the treatment of the neonatal respiratory distress syndrome, which results from deficiency or inactivation of LS.

  5. Predicting the impact of deleterious single point mutations in SMAD gene family using structural bioinformatics approach.

    PubMed

    George Priya Doss, C; Nagasundaram, N; Tanwar, Himani

    2012-06-01

    Functional alteration in SMAD proteins leads to dysregulation of their mechanism, which raises the possibility of high-risk diseases such as fibrosis, cancer and juvenile polyposis. Studying single nucleotide polymorphisms (SNPs) in SMAD genes helps to understand the malfunction of these proteins. In this study, we focused on identifying deleterious nsSNPs in SMAD genes at both the structural and functional levels using publicly available bioinformatics tools: SIFT, PolyPhen, SNPs&GO, I-Mutant 3.0, MUpro and PANTHER. Structure analysis was carried out for the major mutations that occurred in the native proteins coded by SMAD genes, at amino acid positions R358W, K306S, R310G, S433R and R361C. SRide was used to check the stability of the native and mutant modelled proteins. In addition, we used MAPPER to identify SNPs present in transcription factor binding sites. These findings demonstrate that in silico approaches can be used efficiently to identify potential candidate SNPs in large-scale analyses.

  6. Structural, Bioinformatic, and In Vivo Analyses of Two Treponema pallidum Lipoproteins Reveal a Unique TRAP Transporter

    SciTech Connect

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-05-25

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP), and tp0958 (the symporter), are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of 'tetratricopeptide repeat' (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPATs (TPR-protein-associated TRAP-Ts) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s).

  7. Structural, bioinformatic, and in vivo analyses of two Treponema pallidum lipoproteins reveal a unique TRAP transporter

    PubMed Central

    Deka, Ranjit K.; Brautigam, Chad A.; Goldberg, Martin; Schuck, Peter; Tomchick, Diana R.; Norgard, Michael V.

    2012-01-01

    Treponema pallidum, the bacterial agent of syphilis, is predicted to encode one tripartite ATP-independent periplasmic transporter (TRAP-T). TRAP-Ts typically employ a periplasmic substrate-binding protein (SBP) to deliver the cognate ligand to the transmembrane symporter. Herein, we demonstrate that the genes encoding the putative TRAP-T components from T. pallidum, tp0957 (the SBP) and tp0958 (the symporter) are in an operon with an uncharacterized third gene, tp0956. We determined the crystal structure of recombinant Tp0956; the protein is trimeric and perforated by a pore. Part of Tp0956 forms an assembly similar to those of “tetratricopeptide repeat” (TPR) motifs. The crystal structure of recombinant Tp0957 was also determined; like the SBPs of other TRAP-Ts, there are two lobes separated by a cleft. In these other SBPs, the cleft binds a negatively charged ligand. However, the cleft of Tp0957 has a strikingly hydrophobic chemical composition, indicating that its ligand may be substantially different and likely hydrophobic. Analytical ultracentrifugation of the recombinant versions of Tp0956 and Tp0957 established that these proteins associate avidly. This unprecedented interaction was confirmed for the native molecules using in vivo cross-linking experiments. Finally, bioinformatic analyses suggested that this transporter exemplifies a new subfamily of TPR-protein associated TRAP transporters (TPATs) that require the action of a TPR-containing accessory protein for the periplasmic transport of a potentially hydrophobic ligand(s). PMID:22306465

  8. Bioinformatics Study of Structural Patterns in Plant MicroRNA Precursors

    PubMed Central

    Tomczyk, K.; Mickiewicz, A.; Sarzynska, J.

    2017-01-01

    According to the RNA world theory, RNAs that stored genetic information and catalyzed chemical reactions contributed to the formation of current living organisms. In recent years, researchers have studied the diversity of this molecule, focusing among other things on small non-coding regulatory RNAs. Of particular interest among them is microRNA (miRNA), an evolutionarily ancient molecule of 19–24 nt. It has already been recognized as a regulator of gene expression in eukaryotes. In plants, miRNA plays a key role in the response to stress conditions and participates in growth and development. MicroRNAs originate from primary transcripts (pri-miRNA) encoded in the nuclear genome. They are processed from single-stranded stem-loop RNA precursors containing hairpin structures. While the mechanism of mature miRNA production in animals is better understood, its biogenesis in plants remains less clear. Herein, we present the results of a bioinformatics analysis aimed at discovering how plant microRNAs are recognized within their precursors (pre-miRNAs). The study focused on identifying sequence and structural motifs in the neighbourhood of the microRNA. PMID:28280737

  9. Edge Bioinformatics

    SciTech Connect

    Lo, Chien-Chi

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge Bioinformatics pipeline was designed around the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  10. Analysis of RNAseq datasets from a comparative infectious disease zebrafish model using GeneTiles bioinformatics.

    PubMed

    Veneman, Wouter J; de Sonneville, Jan; van der Kolk, Kees-Jan; Ordas, Anita; Al-Ars, Zaid; Meijer, Annemarie H; Spaink, Herman P

    2015-03-01

    We present an RNA deep sequencing (RNAseq) analysis comparing the transcriptome responses of zebrafish larvae to infection with Staphylococcus epidermidis and Mycobacterium marinum bacteria. We show how our GeneTiles software can improve RNAseq analysis approaches by more confidently identifying a large set of markers upon infection with these bacteria. For the analysis of RNAseq data, software programs such as Bowtie2 and Samtools are currently indispensable. However, these programs, which are designed for a Linux environment, require some dedicated programming skills and have no options for visualisation of the resulting mapped sequence reads. Especially with large data sets, this makes the analysis time-consuming and difficult for non-expert users. We have applied the GeneTiles software to the analysis of previously published and newly obtained RNAseq datasets of our zebrafish infection model, and we have shown that this approach is also applicable to published RNAseq datasets of other organisms by comparing our data with a published mammalian infection study. In addition, we have implemented the DEXSeq module in the GeneTiles software to identify genes, such as glucagon A, that are differentially spliced under infection conditions. In the analysis of our RNAseq data, this has made it possible to increase the size of the data sets that can be efficiently compared without using problem-dedicated programs, leading to a quick identification of marker sets. Therefore, this approach will also be highly useful for transcriptome analyses of other organisms for which well-characterised genomes are available.

  11. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis

    PubMed Central

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N.

    2016-01-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  12. Bioinformatics investigation of therapeutic mechanisms of Xuesaitong capsule treating ischemic cerebrovascular rat model with comparative transcriptome analysis

    PubMed Central

    Liao, Jiangquan; Wei, Benjun; Chen, Hengwen; Liu, Yongmei; Wang, Jie

    2016-01-01

    Background: Xuesaitong soft capsule (XST), which consists of panax notoginseng saponin (PNS), has been used to treat ischemic cerebrovascular diseases in China. The therapeutic mechanism of XST has not yet been elucidated from the perspective of genomics and bioinformatics. Methods: A transcriptome analysis was performed to review series concerning the middle cerebral artery occlusion (MCAO) rat model and XST intervention after MCAO from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were compared between the blank group and the model group, and between the model group and the XST group. Functional enrichment and pathway analysis were performed. A protein-protein interaction (PPI) network was constructed. The overlapping genes from the two DEG sets were screened out and analyzed in depth. Results: Two series including 22 samples were obtained. 870 DEGs were identified between the blank group and the model group, and 1189 DEGs were identified between the model group and the XST group. GO terms and KEGG pathways of MCAO and XST intervention were significantly enriched. PPI networks were constructed to demonstrate the gene-gene interactions. The overlapping genes from the two DEG sets were highlighted. ANTXR2, FHL3, PRCP, TYROBP, TAF9B, FGFR2, BCL11B, RB1CC1 and MBNL2 were the pivotal genes and possible action sites of XST therapeutic mechanisms. Conclusion: MCAO is a pathological process with multiple. PMID:27347353
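
    The step of screening overlapping genes from two DEG sets amounts to a set intersection. The sketch below shows only that operation. The function name and the placeholder gene identifiers are hypothetical, and nothing here reproduces the study's gene lists or thresholds.

```python
def overlapping_genes(degs_a, degs_b):
    """Return the genes present in both DEG lists, sorted for stable output."""
    return sorted(set(degs_a) & set(degs_b))


# Placeholder identifiers only; these are not genes from the study.
blank_vs_model = ["GENE_A", "GENE_B", "GENE_C"]
model_vs_xst = ["GENE_B", "GENE_C", "GENE_D"]

print(overlapping_genes(blank_vs_model, model_vs_xst))
```

    In practice each list would come from a differential expression tool, and one would usually also check whether the direction of change is reversed between the two comparisons, which this sketch leaves out.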

  13. Identification and Comparative Analysis of H2O2-Scavenging Enzymes (Ascorbate Peroxidase and Glutathione Peroxidase) in Selected Plants Employing Bioinformatics Approaches

    PubMed Central

    Ozyigit, Ibrahim I.; Filiz, Ertugrul; Vatansever, Recep; Kurtoglu, Kuaybe Y.; Koc, Ibrahim; Öztürk, Münir X.; Anjum, Naser A.

    2016-01-01

    Among major reactive oxygen species (ROS), hydrogen peroxide (H2O2) exhibits dual roles in plant metabolism. Low levels of H2O2 modulate many biological/physiological processes in plants, whereas high levels can damage cell structures, with severe consequences. Thus, the steady-state level of cellular H2O2 must be tightly regulated. Glutathione peroxidase (GPX) and ascorbate peroxidase (APX) are two major ROS-scavenging enzymes which catalyze the reduction of H2O2 in order to prevent potential H2O2-derived cellular damage. Employing bioinformatics approaches, this study presents a comparative evaluation of both GPX and APX in 18 different plant species, and provides valuable insights into the nature and complex regulation of these enzymes. Herein, (a) potential GPX and APX genes/proteins from 18 different plant species were identified, (b) their exon/intron organization was analyzed, (c) detailed information about their physicochemical properties was provided, (d) conserved motif signatures of GPX and APX were identified, (e) their phylogenetic trees and 3D models were constructed, (f) protein-protein interaction networks were generated, and finally (g) GPX and APX gene expression profiles were analyzed. The study outcomes shed light on GPX and APX as major H2O2-scavenging enzymes at the structural and functional levels, and could be used in future studies in this direction. PMID:27047498

  14. Bioinformatics based structural characterization of glucose dehydrogenase (gdh) gene and growth promoting activity of Leclercia sp. QAU-66.

    PubMed

    Naveed, Muhammad; Ahmed, Iftikhar; Khalid, Nauman; Mumtaz, Abdul Samad

    2014-01-01

    Glucose dehydrogenase (GDH; EC 1.1.5.2) is a member of the quinoprotein group that uses the redox cofactor pyrroloquinoline quinone, calcium ions and glucose as substrate for its activity. In the present study, Leclercia sp. QAU-66, isolated from the rhizosphere of Vigna mungo, was characterized for phosphate solubilization and the role of GDH in plant growth promotion of Phaseolus vulgaris. The strain QAU-66 was able to solubilize phosphorus and significantly (p ≤ 0.05) promoted the shoot and root lengths of Phaseolus vulgaris. The structural determination of the GDH protein was carried out using bioinformatics tools such as Pfam, InterProScan, I-TASSER and COFACTOR. These tools predicted the structure-based functional homology of pyrroloquinoline quinone domains in GDH. GDH of Leclercia sp. QAU-66 is one of the main factors involved in plant growth promotion and provides a solid background for further research on plant growth promoting activities.

  15. Elongation Factor-Tu (EF-Tu) proteins structural stability and bioinformatics in ancestral gene reconstruction

    NASA Astrophysics Data System (ADS)

    Dehipawala, Sunil; Nguyen, A.; Tremberger, G.; Cheung, E.; Schneider, P.; Lieberman, D.; Holden, T.; Cheung, T.

    2013-09-01

    A paleo-experimental evolution report on elongation factor EF-Tu structural stability results has provided an opportunity to rewind the tape of life using the ancestral protein sequence reconstruction modeling approach; consistent with the book of life dogma in current biology and being an important component in the astrobiology community. Fractal dimension via the Higuchi fractal method and Shannon entropy of the DNA sequence classification could be used in a diagram that serves as a simple summary. Results from biomedical gene research provide examples on the diagram methodology. Comparisons between biomedical genes such as EEF2 (elongation factor 2 human, mouse, etc), WDR85 in epigenetics, HAR1 in human specificity, DLG1 in cognitive skill, and HLA-C in mosquito bite immunology with EF Tu DNA sequences have accounted for the reported circular dichroism thermo-stability data systematically; the results also infer a relatively less volatility geologic time period from 2 to 3 Gyr from adaptation viewpoint. Comparison to Thermotoga maritima MSB8 and Psychrobacter shows that Thermus thermophilus HB8 EF-Tu calibration sequence could be an outlier, consistent with free energy calculation by NUPACK. Diagram methodology allows computer simulation studies and HAR1 shows about 0.5% probability from chimp to human in terms of diagram location, and SNP simulation results such as amoebic meningoencephalitis NAF1 suggest correlation. Extensions to the studies of the translation and transcription elongation factor sequences in Megavirus Chiliensis, Megavirus Lba and Pandoravirus show that the studied Pandoravirus sequence could be an outlier with the highest fractal dimension and lowest entropy, as compared to chicken as a deviant in the DNMT3A DNA methylation gene sequences from zebrafish to human and to the less than one percent probability in computer simulation using the HAR1 0.5% probability as reference. The diagram methodology would be useful in ancestral gene
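
    The abstract pairs a Higuchi fractal dimension with the Shannon entropy of a DNA sequence but does not say how either was computed. As a rough sketch of the entropy half only, the function below assumes the entropy is taken over single-nucleotide frequencies. That choice, the function name and the toy inputs are assumptions, and the Higuchi calculation is not shown.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy in bits per symbol from single-symbol frequencies.

    Assumption: frequencies are counted per nucleotide; the abstract does
    not state which alphabet or window the authors used.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy inputs: a uniform 4-letter composition gives the 2-bit maximum.
print(shannon_entropy("ACGT"))
print(shannon_entropy("AACC"))
```

    With a four-letter alphabet the value ranges from 0 bits (one nucleotide only) to 2 bits (all four equally frequent), so a low-entropy outlier in this framing is a compositionally biased sequence.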

  16. Minimal Functional Sites in Metalloproteins and Their Usage in Structural Bioinformatics

    PubMed Central

    Rosato, Antonio; Valasatava, Yana; Andreini, Claudia

    2016-01-01

    Metal ions play a functional role in numerous biochemical processes and cellular pathways. Indeed, about 40% of all enzymes of known 3D structure require a metal ion to be able to perform catalysis. The interactions of the metals with the macromolecular framework determine their chemical properties and reactivity. The relevant interactions involve both the coordination sphere of the metal ion and the more distant interactions of the so-called second sphere, i.e., the non-bonded interactions between the macromolecule and the residues coordinating the metal (metal ligands). The metal ligands and the residues in their close spatial proximity define what we call a minimal functional site (MFS). MFSs can be automatically extracted from the 3D structures of metal-binding biological macromolecules deposited in the Protein Data Bank (PDB). They are 3D templates that describe the local environment around a metal ion or metal cofactor and do not depend on the overall macromolecular structure. MFSs provide a different view on metal-binding proteins and nucleic acids, completely focused on the metal. Here we present different protocols and tools based upon the concept of MFS to obtain deeper insight into the structural and functional properties of metal-binding macromolecules. We also show that structure conservation of MFSs in metalloproteins relates to local sequence similarity more strongly than to overall protein similarity. PMID:27153067

  17. Crowdsourcing for bioinformatics

    PubMed Central

    Good, Benjamin M.; Su, Andrew I.

    2013-01-01

    Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614

  18. Rapid development of entity-based data models for bioinformatics with persistence object-oriented design and structured interfaces.

    PubMed

    Ezra Tsur, Elishai

    2017-01-01

    Databases are imperative for research in bioinformatics and computational biology. Current challenges in database design include data heterogeneity and context-dependent interconnections between data entities. These challenges drove the development of unified data interfaces and specialized databases. The curation of specialized databases is an ever-growing challenge due to the introduction of new data sources and the emergence of new relational connections between established datasets. Here, an open-source framework for the curation of specialized databases is proposed. The framework supports user-designed models of data encapsulation, object persistence and structured interfaces to local and external data sources such as MalaCards, Biomodels and the National Center for Biotechnology Information (NCBI) databases. The proposed framework was implemented using Java as the development environment, EclipseLink as the data persistence agent and Apache Derby as the database manager. Syntactic analysis was based on the J3D, jsoup, Apache Commons and w3c.dom open libraries. Finally, the construction of a specialized database for aneurysm-associated vascular diseases is demonstrated. This database contains 3-dimensional geometries of aneurysms, patients' clinical information, articles, biological models, related diseases and our recently published model of aneurysms' risk of rupture. The framework is available at: http://nbel-lab.com.

  19. DOE EPSCoR Initiative in Structural and computational Biology/Bioinformatics

    SciTech Connect

    Wallace, Susan S.

    2008-02-21

    The overall goal of the DOE EPSCoR Initiative in Structural and Computational Biology was to enhance the competitiveness of Vermont research in these scientific areas. To develop self-sustaining infrastructure, we increased the critical mass of faculty, developed shared resources that made junior researchers more competitive for federal research grants, implemented programs to train graduate and undergraduate students who participated in these research areas and provided seed money for research projects. During the time period funded by this DOE initiative: (1) four new faculty were recruited to the University of Vermont using DOE resources, three in Computational Biology and one in Structural Biology; (2) technical support was provided for the Computational and Structural Biology facilities; (3) twenty-two graduate students were directly funded by fellowships; (4) fifteen undergraduate students were supported during the summer; and (5) twenty-eight pilot projects were supported. Taken together, these dollars resulted in a plethora of published papers, many in high-profile journals in the fields, and directly impacted competitive extramural funding based on structural or computational biology, resulting in 49 million dollars awarded in grants (Appendix I), a 600% return on investment by DOE, the State and University.

  20. Structural and bioinformatic characterization of an Acinetobacter baumannii type II carrier protein

    SciTech Connect

    Allen, C. Leigh; Gulick, Andrew M.

    2014-06-01

    The high-resolution crystal structure of a free-standing carrier protein from Acinetobacter baumannii that belongs to a larger NRPS-containing operon, encoded by the ABBFA-003406–ABBFA-003399 genes of A. baumannii strain AB307-0294, that has been implicated in A. baumannii motility, quorum sensing and biofilm formation, is presented. Microorganisms produce a variety of natural products via secondary metabolic biosynthetic pathways. Two of these types of synthetic systems, the nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), use large modular enzymes containing multiple catalytic domains in a single protein. These multidomain enzymes use an integrated carrier protein domain to transport the growing, covalently bound natural product to the neighboring catalytic domains for each step in the synthesis. Interestingly, some PKS and NRPS clusters contain free-standing domains that interact intermolecularly with other proteins. Being expressed outside the architecture of a multi-domain protein, these so-called type II proteins present challenges to understand the precise role they play. Additional structures of individual and multi-domain components of the NRPS enzymes will therefore provide a better understanding of the features that govern the domain interactions in these interesting enzyme systems. The high-resolution crystal structure of a free-standing carrier protein from Acinetobacter baumannii that belongs to a larger NRPS-containing operon, encoded by the ABBFA-003406–ABBFA-003399 genes of A. baumannii strain AB307-0294, that has been implicated in A. baumannii motility, quorum sensing and biofilm formation, is presented here. Comparison with the closest structural homologs of other carrier proteins identifies the requirements for a conserved glycine residue and additional important sequence and structural requirements within the regions that interact with partner proteins.

  1. Bioinformatics analysis of the structural and evolutionary characteristics for toll-like receptor 15

    PubMed Central

    Wang, Jinlan; Chang, Fen

    2016-01-01

    Toll-like receptors (TLRs) play an important role in the innate immune system. TLR15 is reported to have a unique role in defense against pathogens, but its structural and evolutionary characteristics are still poorly understood. In this study, we identified 57 complete TLR15 genes from avian and reptilian genomes. TLR15 clustered into an individual clade and was closely related to family 1 on the phylogenetic tree. Unlike the TLRs in family 1, which have broken asparagine ladders in the middle, the TLR15 ectodomain had an intact asparagine ladder that is critical to maintaining the overall shape of the ectodomain. The conservation analysis found that the TLR15 ectodomain had a highly evolutionarily conserved region on the convex surface of the LRR11 module, which is probably involved in the TLR15 activation process. Furthermore, the protein–protein docking analysis indicated that TLR15 TIR domains have the potential to form homodimers; the predicted interaction interface of the TIR dimer was formed mainly by residues from the BB-loops and αC-helices. Although TLR15 mainly underwent purifying selection, we detected 27 sites under positive selection for TLR15, 24 of which are located on its ectodomain. Our observations suggest structural features of TLR15 that may be relevant to its function but require further experimental validation. PMID:27257554

  2. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, are heavily dependent on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art in RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  3. Bioinformatics of prokaryotic RNAs.

    PubMed

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, are heavily dependent on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art in RNA bioinformatics, focusing on applications to prokaryotes.

  4. [Bioinformatics: a key role in oncology].

    PubMed

    Olivier, Timothée; Chappuis, Pierre; Tsantoulis, Petros

    2016-05-18

    Bioinformatics is essential in clinical oncology and research. Combining biology, computer science and mathematics, bioinformatics aims to derive useful information from clinical and biological data, often poorly structured, at a large scale. Bioinformatics approaches have reclassified certain cancers based on their molecular and biological presentation, improving treatment selection. Many molecular signatures have been developed and, after validation, some are now usable in clinical practice. Other applications could facilitate daily practice, reduce the risk of error and increase the precision of medical decision-making. Bioinformatics must evolve in accordance with ethical considerations and requires multidisciplinary collaboration. Its application depends on a sound technical foundation that meets strict quality requirements.

  5. Bioinformatics-Aided Venomics

    PubMed Central

    Kaas, Quentin; Craik, David J.

    2015-01-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future. PMID:26110505

  6. Comparative transcriptional pathway bioinformatic analysis of dietary restriction, Sir2, p53 and resveratrol life span extension in Drosophila.

    PubMed

    Antosh, Michael; Whitaker, Rachel; Kroll, Adam; Hosier, Suzanne; Chang, Chengyi; Bauer, Johannes; Cooper, Leon; Neretti, Nicola; Helfand, Stephen L

    2011-03-15

    A multiple comparison approach using whole genome transcriptional arrays was used to identify genes and pathways involved in calorie restriction/dietary restriction (DR) life span extension in Drosophila. A gene-centric analysis comparing the changes in common between DR and two DR-related molecular genetic life span extending manipulations, Sir2 and p53, led to a molecular confirmation of Sir2 and p53's similarity with DR and the identification of a small set of commonly regulated genes. One of the identified upregulated genes, takeout, known to be involved in feeding and starvation behavior, and to have sequence homology with Juvenile Hormone (JH) binding protein, was shown to directly extend life span when specifically overexpressed. Here we show that a pathway-centric approach can be used to identify shared physiological pathways between DR and Sir2, p53 and resveratrol life span extending interventions. The set of physiological pathways in common among these life span extending interventions provides an initial step toward defining molecular genetic and physiological changes important in life span extension. The large overlap in shared pathways between DR, Sir2, p53 and resveratrol provides strong molecular evidence supporting the genetic studies linking these specific life span extending interventions.

  7. A bioinformatics approach for integrated transcriptomic and proteomic comparative analyses of model and non-sequenced anopheline vectors of human malaria parasites.

    PubMed

    Ubaida Mohien, Ceereena; Colquhoun, David R; Mathias, Derrick K; Gibbons, John G; Armistead, Jennifer S; Rodriguez, Maria C; Rodriguez, Mario Henry; Edwards, Nathan J; Hartler, Jürgen; Thallinger, Gerhard G; Graham, David R; Martinez-Barnetche, Jesus; Rokas, Antonis; Dinglasan, Rhoel R

    2013-01-01

    Malaria morbidity and mortality caused by both Plasmodium falciparum and Plasmodium vivax extend well beyond the African continent, and although P. vivax causes between 80 and 300 million severe cases each year, vivax transmission remains poorly understood. Plasmodium parasites are transmitted by Anopheles mosquitoes, and the critical site of interaction between parasite and host is at the mosquito's luminal midgut brush border. Although the genome of the "model" African P. falciparum vector, Anopheles gambiae, has been sequenced, evolutionary divergence limits its utility as a reference across anophelines, especially non-sequenced P. vivax vectors such as Anopheles albimanus. Clearly, technologies and platforms that bridge this substantial scientific gap are required in order to provide public health scientists with key transcriptomic and proteomic information that could spur the development of novel interventions to combat this disease. To our knowledge, no approaches have been published that address this issue. To bolster our understanding of P. vivax-An. albimanus midgut interactions, we developed an integrated bioinformatic-hybrid RNA-Seq-LC-MS/MS approach involving An. albimanus transcriptome (15,764 contigs) and luminal midgut subproteome (9,445 proteins) assembly, which, when used with our custom Diptera protein database (685,078 sequences), facilitated a comparative proteomic analysis of the midgut brush borders of two important malaria vectors, An. gambiae and An. albimanus.

  8. Bioinformatics and Moonlighting Proteins

    PubMed Central

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein–protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (e.g., with algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations – it requires the existence of multialigned family protein sequences – but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  9. Probing Medin Monomer Structure and its Amyloid Nucleation Using 13C-Direct Detection NMR in Combination with Structural Bioinformatics

    PubMed Central

    Davies, Hannah A.; Rigden, Daniel J.; Phelan, Marie M.; Madine, Jillian

    2017-01-01

    Aortic medial amyloid is the most prevalent amyloid found to date, but remarkably little is known about it. It is characterised by aberrant deposition of a 5.4 kDa protein called medin within the medial layer of large arteries. Here we employ a combined approach of ab initio protein modelling and 13C-direct detection NMR to generate a model for soluble monomeric medin comprising a stable core of three β-strands and shorter more labile strands at the termini. Molecular dynamics simulations suggested that detachment of the short, C-terminal β-strand from the soluble fold exposes key amyloidogenic regions as a potential site of nucleation enabling dimerisation and subsequent fibril formation. This mechanism resembles models proposed for several other amyloidogenic proteins suggesting that despite variations in sequence and protomer structure these proteins may share a common pathway for amyloid nucleation and subsequent protofibril and fibril formation. PMID:28327552

  10. Computational intelligence techniques in bioinformatics.

    PubMed

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state-of-the-art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machines, deep belief networks, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function and structure prediction. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included.

  11. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    PubMed

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-05

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.
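
    The idea of generating a theoretical structure library by repeatedly applying enzyme rules up to a size limit can be illustrated with a toy enumeration. This is only a sketch under strong assumptions: the residue names "A", "B", "C", the rule format (acceptor, donor) and the function names are hypothetical, linkage positions and branching limits are ignored, and nothing here reflects how UniCorn itself was built.

```python
# Toy rule-based enumeration of tree-shaped structures.
# A glycan is (residue, children), where children is a sorted tuple of glycans.
# A rule (acceptor, donor) means: a donor residue may be attached to any
# residue of type acceptor. All names and rules here are hypothetical.

def size(glycan):
    """Number of residues in the tree."""
    _, children = glycan
    return 1 + sum(size(child) for child in children)

def extensions(glycan, rules):
    """Yield every structure reachable by applying one rule once, anywhere."""
    residue, children = glycan
    # Attach a new leaf to the current residue.
    for acceptor, donor in rules:
        if acceptor == residue:
            yield (residue, tuple(sorted(children + ((donor, ()),))))
    # Or extend one of the existing branches.
    for i, child in enumerate(children):
        for new_child in extensions(child, rules):
            rest = children[:i] + (new_child,) + children[i + 1:]
            yield (residue, tuple(sorted(rest)))

def enumerate_glycans(seed, rules, max_residues):
    """Breadth-first enumeration of distinct structures up to max_residues."""
    seen = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for glycan in frontier:
            if size(glycan) >= max_residues:
                continue  # size limit reached, do not extend further
            for candidate in extensions(glycan, rules):
                if candidate not in seen:
                    seen.add(candidate)
                    next_frontier.append(candidate)
        frontier = next_frontier
    return seen

library = enumerate_glycans(("A", ()), [("A", "B"), ("B", "C")], max_residues=3)
```

    Sorting the children gives each tree a canonical form, so the same structure reached through different reaction orders is stored only once.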

  12. High-resolution modeling of antibody structures by a combination of bioinformatics, expert knowledge, and molecular simulations.

    PubMed

    Shirai, Hiroki; Ikeda, Kazuyoshi; Yamashita, Kazuo; Tsuchiya, Yuko; Sarmiento, Jamica; Liang, Shide; Morokata, Tatsuaki; Mizuguchi, Kenji; Higo, Junichi; Standley, Daron M; Nakamura, Haruki

    2014-08-01

    In the second antibody modeling assessment, we used a semiautomated template-based structure modeling approach for 11 blinded antibody variable region (Fv) targets. The structural modeling method involved several steps, including template selection for framework and canonical structures of complementary determining regions (CDRs), homology modeling, energy minimization, and expert inspection. The submitted models for Fv modeling in Stage 1 had the lowest average backbone root mean square deviation (RMSD) (1.06 Å). Comparison to crystal structures showed the most accurate Fv models were generated for 4 out of 11 targets. We found that the successful modeling in Stage 1 mainly was due to expert-guided template selection for CDRs, especially for CDR-H3, based on our previously proposed empirical method (H3-rules) and the use of position specific scoring matrix-based scoring. Loop refinement using fragment assembly and multicanonical molecular dynamics (McMD) was applied to CDR-H3 loop modeling in Stage 2. Fragment assembly and McMD produced putative structural ensembles with low free energy values that were scored based on the OSCAR all-atom force field and conformation density in principal component analysis space, respectively, as well as the degree of consensus between the two sampling methods. The quality of 8 out of 10 targets improved as compared with Stage 1. For 4 out of 10 Stage-2 targets, our method generated top-scoring models with RMSD values of less than 1 Å. In this article, we discuss the strengths and weaknesses of our approach as well as possible directions for improvement to generate better predictions in the future.
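
    The backbone RMSD values quoted above measure the average distance between matched atoms of a model and the crystal structure. A minimal sketch of that calculation follows, assuming the two structures are already superposed and their atoms are paired by index; the function name and the toy coordinates are illustrative and are not taken from the assessment.

```python
import math

def backbone_rmsd(model, reference):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superposed and atoms are paired by
    index. The result is in the same unit as the coordinates (e.g. angstroms).
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (a - b) ** 2
        for atom_m, atom_r in zip(model, reference)
        for a, b in zip(atom_m, atom_r)
    )
    return math.sqrt(squared / len(model))

# Toy coordinates: every atom is displaced by 1 unit along z.
toy_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = backbone_rmsd(toy_model, toy_reference)
```

    In practice an optimal superposition step would normally precede this calculation; it is omitted here to keep the sketch short.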

  13. Deep learning in bioinformatics.

    PubMed

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-07-29

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies.

  14. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string- or pattern mining specific to biological strings. For a long time, however, this point of view has not been explicitly embraced in either the data mining or the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
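
    The two tasks named above, finding repeated segments within one sequence and common segments between two sequences, can be sketched with a plain k-mer index. This is an illustrative assumption rather than a method from the chapter: it only finds exact matches of a fixed length k, the function names are made up, and it is not one of the efficient index structures used in practice.

```python
from collections import defaultdict

def kmer_positions(seq, k):
    """Map every length-k substring of seq to the list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions

def repeated_segments(seq, k):
    """Intra-molecular similarity: k-mers that occur more than once in seq."""
    return {kmer: pos for kmer, pos in kmer_positions(seq, k).items() if len(pos) > 1}

def common_segments(seq_a, seq_b, k):
    """Inter-molecular similarity: k-mers that occur in both sequences."""
    return set(kmer_positions(seq_a, k)) & set(kmer_positions(seq_b, k))

repeats = repeated_segments("ACGTACGA", 3)   # "ACG" starts at 0 and at 4
shared = common_segments("ACGTT", "TTACG", 3)
```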

  16. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    PubMed Central

    Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  17. GALT protein database, a bioinformatics resource for the management and analysis of structural features of a galactosemia-related protein and its mutants.

    PubMed

    d'Acierno, Antonio; Facchiano, Angelo; Marabotti, Anna

    2009-06-01

    We describe the GALT-Prot database and its related web-based application that have been developed to collect information about the structural and functional effects of mutations on the human enzyme galactose-1-phosphate uridyltransferase (GALT) involved in the genetic disease named galactosemia type I. Besides a list of missense mutations at gene and protein sequence levels, GALT-Prot reports the analysis results of mutant GALT structures. In addition to the structural information about the wild-type enzyme, the database also includes structures of over 100 single point mutants simulated by means of a computational procedure, and each mutant was analyzed with several bioinformatics programs in order to investigate the effect of the mutations. The web-based interface allows querying of the database, and several links are also provided in order to guarantee a high integration with other resources already present on the web. Moreover, the architecture of the database and the web application is flexible and can be easily adapted to store data related to other proteins with point mutations. GALT-Prot is freely available at http://bioinformatica.isa.cnr.it/GALT/.

  18. GlycoMinestruct: a new bioinformatics tool for highly accurate mapping of the human N-linked and O-linked glycoproteomes by incorporating structural features

    PubMed Central

    Li, Fuyi; Li, Chen; Revote, Jerico; Zhang, Yang; Webb, Geoffrey I.; Li, Jian; Song, Jiangning; Lithgow, Trevor

    2016-01-01

    Glycosylation plays an important role in cell-cell adhesion, ligand-binding and subcellular recognition. Current approaches for predicting protein glycosylation are primarily based on sequence-derived features, while little work has been done to systematically assess the importance of structural features to glycosylation prediction. Here, we propose a novel bioinformatics method called GlycoMinestruct (http://glycomine.erc.monash.edu/Lab/GlycoMine_Struct/) for improved prediction of human N- and O-linked glycosylation sites by combining sequence and structural features in an integrated computational framework with a two-step feature-selection strategy. Experiments indicated that GlycoMinestruct outperformed NGlycPred, the only predictor that incorporated both sequence and structure features, achieving AUC values of 0.941 and 0.922 for N- and O-linked glycosylation, respectively, on an independent test dataset. We applied GlycoMinestruct to screen the human structural proteome and obtained high-confidence predictions for N- and O-linked glycosylation sites. GlycoMinestruct can be used as a powerful tool to expedite the discovery of glycosylation events and substrates to facilitate hypothesis-driven experimental studies. PMID:27708373
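
    The AUC values reported above summarize how well a predictor ranks true sites above non-sites. A generic sketch of the pairwise definition of AUC is given below; the function name, labels and scores are made up for illustration, and the tool's own evaluation code is not part of this record.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via its pairwise (Mann-Whitney) definition.

    labels: 1 for a true site, 0 for a non-site. scores: predictor outputs.
    Returns the fraction of (positive, negative) pairs in which the positive
    scores higher; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

    The double loop is quadratic in the number of examples; a rank-based formulation would be preferable for large test sets.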

  19. An Online Bioinformatics Curriculum

    PubMed Central

    Searls, David B.

    2012-01-01

    Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field. PMID:23028269

  20. BioWarehouse: a bioinformatics database warehouse toolkit

    PubMed Central

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

    2006-01-01

    Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the database integration problem for
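
    The kind of multi-database query described in this record (what fraction of enzyme activities have no sequence) can be sketched as a single SQL statement over two tables loaded into one database. Everything below is an assumption for illustration: the table and column names are hypothetical and do not come from the BioWarehouse schema, SQLite stands in for MySQL or Oracle, and the rows are placeholders rather than real EC numbers.

```python
import sqlite3

def build_toy_warehouse():
    """Create an in-memory database with two hypothetical tables.

    enzyme_activity holds one row per activity (as loaded from an enzyme
    database); protein_sequence holds sequences annotated with an activity
    (as loaded from a sequence database). Names and rows are placeholders.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (accession TEXT PRIMARY KEY, ec_number TEXT);
        INSERT INTO enzyme_activity VALUES ('EC-A'), ('EC-B'), ('EC-C'), ('EC-D');
        INSERT INTO protein_sequence VALUES
            ('SEQ1', 'EC-A'), ('SEQ2', 'EC-A'), ('SEQ3', 'EC-B'), ('SEQ4', 'EC-C');
        """
    )
    return conn

def fraction_without_sequence(conn):
    """Fraction of enzyme activities with no sequence in protein_sequence."""
    row = conn.execute(
        """
        SELECT AVG(CASE WHEN s.ec_number IS NULL THEN 1.0 ELSE 0.0 END)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    return row[0]

toy_fraction = fraction_without_sequence(build_toy_warehouse())
```

    The LEFT JOIN keeps every activity and leaves the sequence side NULL where nothing matches, which is what makes the missing entries countable in one pass.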

  1. Systematic, map-scale, comparative structural geology

    SciTech Connect

    Groshong, R.H. Jr.

    1985-01-01

    Interpretation by analogy is the basis of comparative structural geology. A systematic approach to analog selection aids in efficiency and in understanding. The basic interpretive unit for analog selection is the structural family: a map-scale assemblage of genetically related structural forms produced by deformation with approximately constant boundary conditions. A family is specified by the dominant component of its displacement field and by structural levels involved. The differential vertical displacement category includes intrusive and impact structures. The three important basement types are isotropic crystalline, quasisedimentary and metamorphosing. A family is either thin skinned or involves cover plus one of the three basement types. These parameters are arranged into a matrix to produce 20 pigeon holes. Some structures do not fall exactly into one pigeon hole. Other structures link two families; for example, gravity glide links thin-skinned extension and contraction. This system is analogous to end-member rock classifications. Not every example is an end member, but the concept of end members greatly speeds up comparative analysis and clarifies the choice of analogies. Future research will lead to better definition of the key characteristics of certain families, the relationships between families, and the possible existence of additional families.

  2. Teaching bioinformatics to engineers.

    PubMed

    Mihalas, George I; Tudor, Anca; Paralescu, Sorin; Andor, Minodora; Stoicu-Tivadar, Lacramioara

    2014-01-01

    The paper refers to our methodology and experience in establishing the content of the course in bioinformatics introduced to the school of "Information Systems in Healthcare" (SIIS), master level. The syllabi of both lectures and laboratory works are presented and discussed.

  3. Towards a career in bioinformatics

    PubMed Central

    2009-01-01

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, dating from 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the area of life sciences, systems biology and clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in undergraduate bioinformatics curriculum provides information on how to impart targeted education, to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508

  4. An Inquiry into Protein Structure and Genetic Disease: Introducing Undergraduates to Bioinformatics in a Large Introductory Course

    ERIC Educational Resources Information Center

    Bednarski, April E.; Elgin, Sarah C. R.; Pakrasi, Himadri B.

    2005-01-01

    This inquiry-based lab is designed around genetic diseases with a focus on protein structure and function. To allow students to work on their own investigatory projects, 10 projects on 10 different proteins were developed. Students are grouped in sections of 20 and work in pairs on each of the projects. To begin their investigation, students are…

  5. Highlighting computations in bioscience and bioinformatics: review of the Symposium of Computations in Bioinformatics and Bioscience (SCBB07).

    PubMed

    Lu, Guoqing; Ni, Jun

    2008-05-28

    The Second Symposium on Computations in Bioinformatics and Bioscience (SCBB07) was held in Iowa City, Iowa, USA, on August 13-15, 2007. This annual event attracted dozens of bioinformatics professionals and students, who are interested in solving emerging computational problems in bioscience, from China, Japan, Taiwan and the United States. The Scientific Committee of the symposium selected 18 peer-reviewed papers for publication in this supplemental issue of BMC Bioinformatics. These papers cover a broad spectrum of topics in computational biology and bioinformatics, including DNA, protein and genome sequence analysis, gene expression and microarray analysis, computational proteomics and protein structure classification, systems biology and machine learning.

  6. Bioinformatics pipeline for functional identification and characterization of proteins

    NASA Astrophysics Data System (ADS)

    Skarzyńska, Agnieszka; Pawełkowicz, Magdalena; Krzywkowski, Tomasz; Świerkula, Katarzyna; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    New sequencing methods, collectively called Next Generation Sequencing, make it possible to obtain vast amounts of data in a short time. These data require structural and functional annotation. Functional identification and characterization of predicted proteins can be done by in silico approaches, thanks to the numerous computational tools available today. However, the results of protein function prediction need to be confirmed, either by running different programs and comparing their results or by experiment. Here we present a bioinformatics pipeline for structural and functional annotation of proteins.

  7. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants.

    PubMed

    Huang, Ming-Der; Huang, Anthony H C

    2015-09-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins.

  8. Bioinformatics Reveal Five Lineages of Oleosins and the Mechanism of Lineage Evolution Related to Structure/Function from Green Algae to Seed Plants

    PubMed Central

    Huang, Ming-Der; Huang, Anthony H.C.

    2015-01-01

    Plant cells contain subcellular lipid droplets with a triacylglycerol matrix enclosed by a layer of phospholipids and the small structural protein oleosin. Oleosins possess a conserved central hydrophobic hairpin of approximately 72 residues penetrating into the lipid droplet matrix and amphipathic amino- and carboxyl (C)-terminal peptides lying on the phospholipid surface. Bioinformatics of 1,000 oleosins of green algae and all plants emphasizing biological implications reveal five oleosin lineages: primitive (in green algae, mosses, and ferns), universal (U; all land plants), and three in specific organs or phylogenetic groups, termed seed low-molecular-weight (SL; seed plants), seed high-molecular-weight (SH; angiosperms), and tapetum (T; Brassicaceae) oleosins. Transition from one lineage to the next is depicted from lineage intermediates at junctions of phylogeny and organ distributions. Within a species, each lineage, except the T oleosin lineage, has one to four genes per haploid genome, only approximately two of which are active. Primitive oleosins already possess all the general characteristics of oleosins. U oleosins have C-terminal sequences as highly conserved as the hairpin sequences; thus, U oleosins including their C-terminal peptide exert indispensable, unknown functions. SL and SH oleosin transcripts in seeds are in an approximately 1:1 ratio, which suggests the occurrence of SL-SH oleosin dimers/multimers. T oleosins in Brassicaceae are encoded by rapidly evolved multitandem genes for alkane storage and transfer. Overall, oleosins have evolved to retain conserved hairpin structures but diversified for unique structures and functions in specific cells and plant families. Also, our studies reveal oleosin in avocado (Persea americana) mesocarp and no acyltransferase/lipase motifs in most oleosins. PMID:26232488

  9. Bioinformatics for Exploration

    NASA Technical Reports Server (NTRS)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  10. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers.

    PubMed

    Schneider, Maria V; Walter, Peter; Blatter, Marie-Claude; Watson, James; Brazas, Michelle D; Rother, Kristian; Budd, Aidan; Via, Allegra; van Gelder, Celia W G; Jacob, Joachim; Fernandes, Pedro; Nyrönen, Tommi H; De Las Rivas, Javier; Blicher, Thomas; Jimenez, Rafael C; Loveland, Jane; McDowall, Jennifer; Jones, Phil; Vaughan, Brendan W; Lopez, Rodrigo; Attwood, Teresa K; Brooksbank, Catherine

    2012-05-01

    Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of 'high-throughput biology', the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials. Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs, and review it in relation to similar initiatives and collections.

  11. Distributed computing in bioinformatics.

    PubMed

    Jain, Eric

    2002-01-01

    This paper provides an overview of methods and current applications of distributed computing in bioinformatics. Distributed computing is a strategy of dividing a large workload among multiple computers to reduce processing time, or to make use of resources such as programs and databases that are not available on all computers. Participating computers may be connected either through a local high-speed network or through the Internet.
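
    The divide-and-combine strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: worker threads on one machine stand in for separate computers, and the task (counting G and C bases) and the names count_gc, split_into_chunks and distributed_gc_count are hypothetical, not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(sequences):
    """Per-worker task: count G and C bases in a list of DNA strings."""
    return sum(s.upper().count("G") + s.upper().count("C") for s in sequences)


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, nearly equal parts."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few items
            chunks.append(items[start:end])
        start = end
    return chunks


def distributed_gc_count(sequences, n_workers=4):
    """Divide the workload, process the chunks concurrently, combine results."""
    chunks = split_into_chunks(sequences, n_workers)
    # Assumption: threads stand in for the networked computers in the text.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_gc, chunks))
    return sum(partial_counts)


sequences = ["ATGC", "GGCC", "ATAT", "CGCG", "TTTT", "GATTACA"]
total = distributed_gc_count(sequences, n_workers=3)
print(total)
```

    In a genuinely distributed setting the chunks would be sent over a network and the workers would return their partial results to a coordinator; the split, map and combine structure stays the same.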

  12. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
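
    The "huge number of possible trees" can be made concrete with a standard counting result that the abstract itself does not state: for n OTUs there are (2n-5)!! distinct unrooted, strictly bifurcating, labelled tree topologies (n >= 3) and (2n-3)!! rooted ones (n >= 2). The sketch below simply evaluates those formulas; the function names are illustrative.

```python
def double_factorial(k):
    """Return k * (k - 2) * (k - 4) * ... down to 1 or 2; 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_trees(n_otus):
    """Number of unrooted bifurcating topologies on n labelled OTUs: (2n-5)!!."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs for an unrooted tree")
    return double_factorial(2 * n_otus - 5)


def num_rooted_trees(n_otus):
    """Number of rooted bifurcating topologies on n labelled OTUs: (2n-3)!!."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs for a rooted tree")
    return double_factorial(2 * n_otus - 3)


for n in (4, 5, 10):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
# 4 3 15
# 5 15 105
# 10 2027025 34459425
```

    Even ten OTUs already admit more than two million unrooted topologies, which is why exhaustive search is replaced by heuristics for all but the smallest data sets.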

  13. A Bioinformatics Reference Model: Towards a Framework for Developing and Organising Bioinformatic Resources

    NASA Astrophysics Data System (ADS)

    Hiew, Hong Liang; Bellgard, Matthew

    2007-11-01

    Life Science research faces the constant challenge of how to effectively handle an ever-growing body of bioinformatics software and online resources. The users and developers of bioinformatics resources have a diverse set of competing demands on how these resources need to be developed and organised. Unfortunately, there does not exist an adequate community-wide framework to integrate such competing demands. The problems that arise from this include unstructured standards development, the emergence of tools that do not meet specific needs of researchers, and often times a communications gap between those who use the tools and those who supply them. This paper presents an overview of the different functions and needs of bioinformatics stakeholders to determine what may be required in a community-wide framework. A Bioinformatics Reference Model is proposed as a basis for such a framework. The reference model outlines the functional relationship between research usage and technical aspects of bioinformatics resources. It separates important functions into multiple structured layers, clarifies how they relate to each other, and highlights the gaps that need to be addressed for progress towards a diverse, manageable, and sustainable body of resources. The relevance of this reference model to the bioscience research community, and its implications in progress for organising our bioinformatics resources, are discussed.

  14. Improvement of Student Understanding of How Kinetic Data Facilitates the Determination of Amino Acid Catalytic Function through an Alkaline Phosphatase Structure/Mechanism Bioinformatics Exercise

    ERIC Educational Resources Information Center

    Grunwald, Sandra K.; Krueger, Katherine J.

    2008-01-01

    Laboratory exercises, which utilize alkaline phosphatase as a model enzyme, have been developed and used extensively in undergraduate biochemistry courses to illustrate enzyme steady-state kinetics. A bioinformatics laboratory exercise for the biochemistry laboratory, which complements the traditional alkaline phosphatase kinetics exercise, was…

  15. Fold assessment for comparative protein structure modeling

    PubMed Central

    Melo, Francisco; Sali, Andrej

    2007-01-01

    Accurate and automated assessment of both geometrical errors and incompleteness of comparative protein structure models is necessary for an adequate use of the models. Here, we describe a composite score for discriminating between models with the correct and incorrect fold. To find an accurate composite score, we designed and applied a genetic algorithm method that searched for a most informative subset of 21 input model features as well as their optimized nonlinear transformation into the composite score. The 21 input features included various statistical potential scores, stereochemistry quality descriptors, sequence alignment scores, geometrical descriptors, and measures of protein packing. The optimized composite score was found to depend on (1) a statistical potential z-score for residue accessibilities and distances, (2) model compactness, and (3) percentage sequence identity of the alignment used to build the model. The accuracy of the composite score was compared with the accuracy of assessment by single and combined features as well as by other commonly used assessment methods. The testing set was representative of models produced by automated comparative modeling on a genomic scale. The composite score performed better than any other tested score in terms of the maximum correct classification rate (i.e., 3.3% false positives and 2.5% false negatives) as well as the sensitivity and specificity across the whole range of thresholds. The composite score was implemented in our program MODELLER-8 and was used to assess models in the MODBASE database that contains comparative models for domains in approximately 1.3 million protein sequences. PMID:17905832
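
    The evaluation idea in this abstract, choosing a score cutoff that maximizes the correct classification rate and then reporting false positives and false negatives, can be sketched as follows. This does not reproduce the authors' composite score or the MODELLER implementation; the scores, labels and function names are made up for illustration, and the rates here are computed per class, which may differ from the convention used in the paper.

```python
def error_rates(scores, is_correct_fold, threshold):
    """Return (false_positive_rate, false_negative_rate) for a score cutoff.

    A model is predicted to have the correct fold when score >= threshold.
    """
    pairs = list(zip(scores, is_correct_fold))
    fp = sum(1 for s, ok in pairs if s >= threshold and not ok)
    fn = sum(1 for s, ok in pairs if s < threshold and ok)
    n_incorrect = sum(1 for _, ok in pairs if not ok)
    n_correct = sum(1 for _, ok in pairs if ok)
    fpr = fp / n_incorrect if n_incorrect else 0.0
    fnr = fn / n_correct if n_correct else 0.0
    return fpr, fnr


def best_threshold(scores, is_correct_fold):
    """Return (threshold, rate) maximizing the fraction classified correctly."""
    best = None
    for t in sorted(set(scores)):
        n_right = sum(1 for s, ok in zip(scores, is_correct_fold) if (s >= t) == ok)
        rate = n_right / len(scores)
        if best is None or rate > best[1]:
            best = (t, rate)
    return best


# Hypothetical scores for eight models and whether each has the correct fold.
scores = [0.9, 0.8, 0.75, 0.6, 0.4, 0.35, 0.2, 0.1]
labels = [True, True, False, True, True, False, False, False]

threshold, rate = best_threshold(scores, labels)
fpr, fnr = error_rates(scores, labels, threshold)
print(threshold, rate, fpr, fnr)
```

    Sweeping the threshold over its whole range, rather than reporting only the best cutoff, is what yields the sensitivity and specificity comparison mentioned in the abstract.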

  16. Fold assessment for comparative protein structure modeling.

    PubMed

    Melo, Francisco; Sali, Andrej

    2007-11-01

    Accurate and automated assessment of both geometrical errors and incompleteness of comparative protein structure models is necessary for an adequate use of the models. Here, we describe a composite score for discriminating between models with the correct and incorrect fold. To find an accurate composite score, we designed and applied a genetic algorithm method that searched for a most informative subset of 21 input model features as well as their optimized nonlinear transformation into the composite score. The 21 input features included various statistical potential scores, stereochemistry quality descriptors, sequence alignment scores, geometrical descriptors, and measures of protein packing. The optimized composite score was found to depend on (1) a statistical potential z-score for residue accessibilities and distances, (2) model compactness, and (3) percentage sequence identity of the alignment used to build the model. The accuracy of the composite score was compared with the accuracy of assessment by single and combined features as well as by other commonly used assessment methods. The testing set was representative of models produced by automated comparative modeling on a genomic scale. The composite score performed better than any other tested score in terms of the maximum correct classification rate (i.e., 3.3% false positives and 2.5% false negatives) as well as the sensitivity and specificity across the whole range of thresholds. The composite score was implemented in our program MODELLER-8 and was used to assess models in the MODBASE database that contains comparative models for domains in approximately 1.3 million protein sequences.

  17. [Comparative hierarchic structure of the genetic language].

    PubMed

    Ratner, V A

    1993-05-01

    Genetic texts and the genetic language are built on a hierarchical principle and contain no fewer than 6 levels of coding sequences, separated by marks of punctuation, separation and indication: codons, cistrons, scriptons, replicons, linkage groups, genomes. Each level has all the attributes of a language. This hierarchical system expresses some general properties and regularities. Given the rules of the genetic language, the variability of genetic texts is generated by block-modular combinatorics at each level. Between levels there are intermediate sublevels and module types capable of being combined. The genetic language is compared with two other independent linguistic systems: human natural languages and artificial programming languages. The genetic language is natural in its origin, but in its purpose it is a typical technical language of a functioning genetic regulatory system. All three linguistic systems under comparison show evident similarity in their organizational principles and hierarchical structures. This argues for similarity in the principles of their emergence and evolution.

  18. Virtual Bioinformatics Distance Learning Suite

    ERIC Educational Resources Information Center

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  19. Bioinformatics meets parasitology.

    PubMed

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention.

  20. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.
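
    A minimal sketch of the rule-based approach, assuming a single hand-written rule: a regular expression that picks out amino acid substitution mentions of the form letter, position, letter. The rule, the example sentence and the name find_substitutions are illustrative and do not come from the chapter.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Rule: wild-type residue, sequence position, mutant residue (e.g. "V600E").
SUBSTITUTION_RULE = re.compile(rf"\b([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])\b")


def find_substitutions(text):
    """Return (wild_type, position, mutant) tuples for every rule match."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in SUBSTITUTION_RULE.finditer(text)
    ]


sentence = (
    "Patients carrying V600E responded, whereas the T790M variant "
    "conferred resistance."
)
print(find_substitutions(sentence))
# [('V', 600, 'E'), ('T', 790, 'M')]

# Ambiguity: a figure label has the same surface form as a substitution.
print(find_substitutions("see Figure S1A"))
# [('S', 1, 'A')]
```

    The second call shows the ambiguity the abstract refers to: the same string pattern can denote a mutation or something unrelated, which is one reason rule-based systems are often combined with statistical components.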

  1. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    ERIC Educational Resources Information Center

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  2. Comparing Factor Structures of Adolescent Psychopathology

    ERIC Educational Resources Information Center

    Verona, Edelyn; Javdani, Shabnam; Sprague, Jenessa

    2011-01-01

    Research on the structure of adolescent psychopathology can provide information on broad factors that underlie different forms of maladjustment in youths. Multiple studies from the literature on adult populations suggest that 2 factors, Internalizing and Externalizing, meaningfully comprise the factor structure of adult psychopathology (e.g.,…

  3. The Bioinformatics Analysis of Comparative Genomics of Mycobacterium tuberculosis Complex (MTBC) Provides Insight into Dissimilarities between Intraspecific Groups Differing in Host Association, Virulence, and Epitope Diversity

    PubMed Central

    Jia, Xinmiao; Yang, Li; Dong, Mengxing; Chen, Suting; Lv, Lingna; Cao, Dandan; Fu, Jing; Yang, Tingting; Zhang, Ju; Zhang, Xiangli; Shang, Yuanyuan; Wang, Guirong; Sheng, Yongjie; Huang, Hairong; Chen, Fei

    2017-01-01

    Tuberculosis now exceeds HIV as the top infectious disease cause of mortality, and is caused by the Mycobacterium tuberculosis complex (MTBC). MTBC strains have highly conserved genome sequences (similarity >99%) but dramatically different phenotypes. To analyze the relationship between genotype and phenotype, we conducted the comparative genomic analysis on 12 MTBC strains representing different lineages (i.e., Mycobacterium bovis; M. bovis BCG; M. microti; M. africanum; M. tuberculosis H37Rv; M. tuberculosis H37Ra, and six M. tuberculosis clinical isolates). The analysis focused on the three aspects of pathogenicity: host association, virulence, and epitope variations. Host association analysis indicated that eight mce3 genes, two enoyl-CoA hydratases, and five PE/PPE family genes were present only in human isolates; these may have roles in host-pathogen interactions. There were 15 SNPs found on virulence factors (including five SNPs in three ESX secretion proteins) only in the Beijing strains, which might be related to their more virulent phenotype. A comparison between the virulent H37Rv and non-virulent H37Ra strains revealed three SNPs that were likely associated with the virulence attenuation of H37Ra: S219L (PhoP), A219E (MazG) and a newly identified I228M (EspK). Additionally, a comparison of animal-associated MTBC strains showed that the deletion of the first four genes (i.e., pe35, ppe68, esxB, esxA), rather than all eight genes of RD1, might play a central role in the virulence attenuation of animal isolates. Finally, by comparing epitopes among MTBC strains, we found that four epitopes were lost only in the Beijing strains; this may render them better capable of evading the human immune system, leading to enhanced virulence. Overall, our comparative genomic analysis of MTBC strains reveals the relationship between the highly conserved genotypes and the diverse phenotypes of MTBC, provides insight into pathogenic mechanisms, and facilitates the

  4. An intelligent system for comparing protein structures

    SciTech Connect

    Benatan, E.

    1994-12-31

    An approach to protein structure comparison is presented which uses techniques of artificial intelligence (AI) to generate a mapping between two protein structures. The approach proceeds by first identifying the seed of a possible mapping, and then searching for ways to extend the seed by incorporating corresponding elements from the two proteins. Correspondence is judged using heuristic functions which assess the similarity of the structural environments of the elements. The search can be guided by separately encoded knowledge. A prototype has been implemented which is able to rapidly create mappings with a high degree of accuracy in test cases.

  5. Bioinformatic pipelines in Python with Leaf

    PubMed Central

    2013-01-01

    Background: An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results: Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions: Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315
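
    The general idea of declaring pipeline steps together with their dependencies can be sketched in plain Python. This is not Leaf's syntax or API, which the abstract does not show; it is a generic illustration that uses the standard-library graphlib module (Python 3.9 or later), with hypothetical step names and a hypothetical run_pipeline helper.

```python
from graphlib import TopologicalSorter


def load(results):
    """Produce the raw input (hard-coded here for illustration)."""
    return ["ATGC", "GGCC", "AT"]


def filter_short(results):
    """Keep sequences of length >= 4 from the 'load' step."""
    return [s for s in results["load"] if len(s) >= 4]


def gc_content(results):
    """GC fraction of every sequence that passed the filter."""
    return [(s.count("G") + s.count("C")) / len(s) for s in results["filter"]]


def report(results):
    """Summarize the outputs of the 'filter' and 'gc_content' steps."""
    mean_gc = sum(results["gc_content"]) / len(results["gc_content"])
    return f"{len(results['filter'])} sequences, mean GC {mean_gc:.2f}"


STEPS = {"load": load, "filter": filter_short, "gc_content": gc_content, "report": report}

# Each step maps to the set of steps whose results it needs.
DEPENDENCIES = {
    "load": set(),
    "filter": {"load"},
    "gc_content": {"filter"},
    "report": {"filter", "gc_content"},
}


def run_pipeline(steps, dependencies):
    """Run every step after its prerequisites; return (order, results)."""
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        results[name] = steps[name](results)
    return order, results


order, results = run_pipeline(STEPS, DEPENDENCIES)
print(order)
print(results["report"])
```

    Keeping the dependency table separate from the step functions is one simple way to make the pipeline structure visible at a glance; features such as persistence and consistency checks, which the abstract attributes to Leaf, are outside the scope of this sketch.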

  6. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    PubMed Central

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  7. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    PubMed

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  8. [Application of bioinformatics in researches of industrial biocatalysis].

    PubMed

    Yu, Hui-Min; Luo, Hui; Shi, Yue; Sun, Xu-Dong; Shen, Zhong-Yao

    2004-05-01

    Industrial biocatalysis is currently attracting much attention as a way to rebuild or replace traditional production processes for chemicals and drugs. A key focus in industrial biocatalysis is the biocatalyst, which is usually a microbial enzyme. Recently, new bioinformatics technologies have played, and will continue to play, increasingly significant roles in industrial biocatalysis research in response to the waves of the genomic revolution. One key application of bioinformatics in biocatalysis is the discovery and identification of new biocatalysts through advanced DNA and protein sequence search, comparison and analysis in Internet databases using different algorithms and software. Unknown genes of microbial enzymes can also be readily obtained by primer design based on bioinformatics analyses. The other key application of bioinformatics in biocatalysis is the modification and improvement of existing industrial biocatalysts. Here, bioinformatics is of great importance in both the rational design and the directed evolution of microbial enzymes. When based on successful bioinformatic prediction of the tertiary structures of enzymes, subsequent experiments such as site-directed mutagenesis, fusion protein construction, DNA family shuffling and saturation mutagenesis are usually highly efficient. In all, bioinformatics will be an essential tool for both biologists and biological engineers in future research on industrial biocatalysis, because of its significant role in guiding and accelerating the discovery and/or improvement of novel biocatalysts.

  9. Genome Exploitation and Bioinformatics Tools

    NASA Astrophysics Data System (ADS)

    de Jong, Anne; van Heel, Auke J.; Kuipers, Oscar P.

    Bioinformatic tools can greatly improve the efficiency of bacteriocin screening efforts by limiting the number of strains. Different classes of bacteriocins can be detected in genomes by looking at different features. Finding small bacteriocins can be especially challenging due to low homology and because small open reading frames (ORFs) are often omitted from annotations. In this chapter, several bioinformatic tools/strategies to identify bacteriocins in genomes are discussed.

  10. Reactance, Restoration, and Cognitive Structure: Comparative Statics

    ERIC Educational Resources Information Center

    Bessarabova, Elena; Fink, Edward L.; Turner, Monique

    2013-01-01

    This study (N = 143) examined the effects of freedom threat on cognitive structures, using recycling as its topic. The results of a 2(Freedom Threat: low vs. high) x 2(Postscript: restoration vs. filler) plus 1(Control) experiment indicated that, relative to the control condition, high freedom threat created a boomerang effect for the targeted…

  11. Uncertainty of Comparative Judgments and Multidimensional Structure

    ERIC Educational Resources Information Center

    Sjoberg, Lennart

    1975-01-01

    An analysis of preferences with respect to silhouette drawings of nude females is presented. Systematic intransitivities were discovered. The dispersions of differences (comparatal dispersions) were shown to reflect the multidimensional structure of the stimuli, a finding expected on the basis of prior work. (Author)

  12. Bioinformatics of cardiovascular miRNA biology.

    PubMed

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'.

  13. Compare, Contrast, Comprehend: Using Compare-Contrast Text Structures with ELLs in K-3 Classrooms

    ERIC Educational Resources Information Center

    Dreher, Mariam Jean; Gray, Jennifer Letcher

    2009-01-01

    In this article, we describe how to help primary-grade English language learners use compare-contrast text structures. Specifically, we explain (a) how to teach students to identify the compare-contrast text structure, and to use this structure to support their comprehension, (b) how to use compare-contrast texts to activate and extend students'…

  14. Bioconductor: open software development for computational biology and bioinformatics

    PubMed Central

    Gentleman, Robert C; Carey, Vincent J; Bates, Douglas M; Bolstad, Ben; Dettling, Marcel; Dudoit, Sandrine; Ellis, Byron; Gautier, Laurent; Ge, Yongchao; Gentry, Jeff; Hornik, Kurt; Hothorn, Torsten; Huber, Wolfgang; Iacus, Stefano; Irizarry, Rafael; Leisch, Friedrich; Li, Cheng; Maechler, Martin; Rossini, Anthony J; Sawitzki, Gunther; Smith, Colin; Smyth, Gordon; Tierney, Luke; Yang, Jean YH; Zhang, Jianhua

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples. PMID:15461798

  15. Comparative BioInformatics and Computational Toxicology

    EPA Science Inventory

    Reflecting the numerous changes in the field since the publication of the previous edition, this third edition of Developmental Toxicology focuses on the mechanisms of developmental toxicity and incorporates current technologies for testing in the risk assessment process.

  16. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  17. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  18. Adapting bioinformatics curricula for big data

    PubMed Central

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  19. Crustacean neuropeptides: structures, functions and comparative aspects.

    PubMed

    Keller, R

    1992-05-15

    In this article, an attempt is made to review the presently known, completely identified crustacean neuropeptides with regard to structure, function and distribution. Probably the most important progress has been made in the elucidation of a novel family of large peptides from the X-organ-sinus gland system which includes crustacean hyperglycemic hormone (CHH), putative molt-inhibiting hormone (MIH) and vitellogenesis (= gonad)-inhibiting hormone (VIH). These peptides have so far only been found in crustaceans. Renewed interest in the neurohemal pericardial organs has led to the identification of a number of cardioactive/myotropic neuropeptides, some of them unique to crustaceans. Important contributions have been made by immunocytochemical mapping of peptidergic neurons in the nervous system, which has provided evidence for a multiple role of several neuropeptides as neurohormones on the one hand and as local transmitters or modulators on the other. This has been corroborated by physiological studies. The long-known chromatophore-regulating hormones, red pigment concentrating hormone (RPCH) and pigment-dispersing hormone (PDH), have been placed in a broader perspective by the demonstration of an additional role as local neuromodulators. The scope of crustacean neuropeptide research has thus been broadened considerably during the last years.

  20. Taking Bioinformatics to Systems Medicine.

    PubMed

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  1. Bioinformatics Approach in Plant Genomic Research

    PubMed Central

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-01-01

    Advances in genomics technology have dramatically changed plant biology research. Plant biologists now have easy access to enormous amounts of genomic data for studying high-density genetic variation in plants at the molecular level. A thorough understanding and skilled use of bioinformatics tools to manage and analyze these data are therefore essential in current plant genome research. Many plant genome databases have been established and continue to expand. Meanwhile, bioinformatics-based analytical methods are well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant need to upgrade computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the critical future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  2. The growing need for microservices in bioinformatics

    PubMed Central

    Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on a nimble IT framework that can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Conclusions: Use of the microservices framework is an effective

  3. Generations of interdisciplinarity in bioinformatics.

    PubMed

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L

    2016-04-02

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life science, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this "borderland." As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature.

  4. Generations of interdisciplinarity in bioinformatics

    PubMed Central

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life science, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  5. Bioinformatics in microbial biotechnology--a mini review.

    PubMed

    Bansal, Arvind K

    2005-06-28

    The revolutionary growth in the computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes including a cleaner draft of human genome have been sequenced raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easy to administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i) genomics--sequencing and comparative study of genomes to identify gene and genome functionality, (ii) proteomics--identification and characterization of protein related properties and reconstruction of metabolic and regulatory pathways, (iii) cell visualization and simulation to study and model cell behavior, and (iv) application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1) analysis based upon the available experimental wet-lab data, (2) the use of mathematical modeling to derive new information, and (3) an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene expression analysis to derive

  6. Promoting synergistic research and education in genomics and bioinformatics

    PubMed Central

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demand for developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine throughout today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, United States of America, on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in genomics, biomedicine and bioinformatics. Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  7. Biopipe: a flexible framework for protocol-based bioinformatics analysis.

    PubMed

    Hoon, Shawn; Ratnapu, Kiran Kumar; Chia, Jer-Ming; Kumarasamy, Balamurugan; Juguang, Xiao; Clamp, Michele; Stabenau, Arne; Potter, Simon; Clarke, Laura; Stupka, Elia

    2003-08-01

    We identify several challenges facing bioinformatics analysis today. Firstly, to fulfill the promise of comparative studies, bioinformatics analysis will need to accommodate different sources of data residing in a federation of databases that, in turn, come in different formats and modes of accessibility. Secondly, the tsunami of data to be handled will require robust systems that enable bioinformatics analysis to be carried out in a parallel fashion. Thirdly, the ever-evolving state of bioinformatics presents new algorithms and paradigms in conducting analysis. This means that any bioinformatics framework must be flexible and generic enough to accommodate such changes. In addition, we identify the need for introducing an explicit protocol-based approach to bioinformatics analysis that will lend rigorousness to the analysis. This makes it easier for experimentation and replication of results by external parties. Biopipe is designed in an effort to meet these goals. It aims to allow researchers to focus on protocol design. At the same time, it is designed to work over a compute farm and thus provides high-throughput performance. A common exchange format that encapsulates the entire protocol in terms of the analysis modules, parameters, and data versions has been developed to provide a powerful way in which to distribute and reproduce results. This will enable researchers to discuss and interpret the data better as the once implicit assumptions are now explicitly defined within the Biopipe framework.
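
    The idea of an explicit protocol that records analysis modules, parameters, and data versions can be sketched as follows. This is only an illustration of the concept: Biopipe's own exchange format is not reproduced here, and the JSON encoding, the class names `Step` and `Protocol`, and every field name are assumptions made for the example.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    """One analysis module and the parameters it is run with (hypothetical)."""
    module: str
    parameters: dict = field(default_factory=dict)


@dataclass
class Protocol:
    """A whole analysis: named steps plus the versions of the input data."""
    name: str
    data_versions: dict
    steps: list = field(default_factory=list)

    def to_json(self):
        # asdict() recurses into the list of Step dataclasses.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        raw = json.loads(text)
        return cls(
            name=raw["name"],
            data_versions=raw["data_versions"],
            steps=[Step(**step) for step in raw["steps"]],
        )


# Illustrative values only; none of these names come from the abstract.
protocol = Protocol(
    name="example-similarity-search",
    data_versions={"protein_db": "release-1"},
    steps=[Step("blast", {"evalue": 1e-5}), Step("filter", {"min_identity": 30})],
)
restored = Protocol.from_json(protocol.to_json())
```

    Because every assumption lives in the serialised record, a second group can reload the same protocol and rerun it without relying on unstated defaults.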

  8. The use of antioptimization to compare alternative structural models

    NASA Technical Reports Server (NTRS)

    Gangadharan, S. N.; Nikolaidis, E.; Lee, K.; Haftka, R. T.

    1993-01-01

    Structural models are usually tested by comparing their response with that of a reference structure (an actual structure or a more refined model) to a limited number of arbitrary loads. This test is not always reliable because the loads are arbitrary. An antioptimization-based method is proposed to test structural models. This method compares a structural model with a reference model or an actual structure under the worst loading case that maximizes the error in the model. Specifically, the method identifies the loading case that maximizes the difference between the responses of two models of the same structure using optimization. This method can be used to design experiments in order to validate a structural model. It can also be applied to identify damage in a structure by determining the load that maximizes the difference in the behavior of the damaged and the intact structure. The proposed method is illustrated by applying it to a plate and an automotive structure.
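
    A minimal sketch of the worst-case-load idea, under assumptions that are not stated in the abstract: both models are linear and given as compliance matrices (displacement = matrix times load), and admissible loads are bounded to unit Euclidean norm. Under those assumptions the load that maximizes the difference in response is the dominant right singular vector of the difference matrix, found here by power iteration. The function name `worst_case_load` is hypothetical, and the report itself uses general optimization rather than this shortcut.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def worst_case_load(c_model, c_ref, iterations=200):
    """Return (load, error) for the unit-norm load maximizing the response gap.

    c_model and c_ref are compliance matrices of equal shape (assumed linear
    models). Power iteration on D^T D, with D = c_model - c_ref, converges to
    the dominant right singular vector unless the starting vector happens to
    be orthogonal to it.
    """
    d = [[m - r for m, r in zip(row_m, row_r)]
         for row_m, row_r in zip(c_model, c_ref)]
    d_t = [list(col) for col in zip(*d)]
    n = len(d[0])
    load = [1.0 / math.sqrt(n)] * n  # normalized starting guess
    for _ in range(iterations):
        g = matvec(d_t, matvec(d, load))
        norm = math.sqrt(sum(v * v for v in g))
        if norm == 0.0:  # the two models agree for this load direction
            break
        load = [v / norm for v in g]
    gap = matvec(d, load)
    error = math.sqrt(sum(v * v for v in gap))
    return load, error
```

    For example, if the two models differ only in the compliance of the second degree of freedom, the returned load acts entirely on that degree of freedom, which is the experiment most likely to expose the modeling error.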

  9. Reproducible Bioinformatics Research for Biologists

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  10. Bioinformatics and the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  11. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  12. Clinical Bioinformatics: challenges and opportunities

    PubMed Central

    2012-01-01

    Background: Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and on their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011, was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods: In this editorial, the viewpoints and opinions of three world CBI leaders, who were invited to participate in a panel discussion of the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools to implement efficient search computing solutions. Results: Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, integrated search and retrieval of -omics information. Conclusions: Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput "-omics" technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research. PMID:23095472

  13. A comprehensive comparison of comparative RNA structure prediction approaches

    PubMed Central

    Gardner, Paul P; Giegerich, Robert

    2004-01-01

    Background: An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Yet, independent benchmarking of these algorithms is rarely performed as is now common practice for protein-folding, gene-finding and multiple-sequence-alignment algorithms. Results: Here we evaluate a number of RNA folding algorithms using reliable RNA data-sets and compare their relative performance. Conclusions: We conclude that comparative data can enhance structure prediction but structure-prediction-algorithms vary widely in terms of both sensitivity and selectivity across different lengths and homologies. Furthermore, we outline some directions for future research. PMID:15458580
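
    The sensitivity and selectivity mentioned above can be illustrated with a small sketch. The abstract does not give formulas, so the definitions used here are assumed to be the plain ones: sensitivity is the fraction of reference base pairs that are predicted, and selectivity is the fraction of predicted base pairs that are in the reference. The benchmark's own scoring may treat some pairs differently. Structures are assumed to be pseudoknot-free dot-bracket strings, and the function names are hypothetical.

```python
def dotbracket_to_pairs(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def sensitivity_selectivity(reference, predicted):
    """Compare a predicted structure with a reference, pair by pair."""
    ref_pairs = dotbracket_to_pairs(reference)
    pred_pairs = dotbracket_to_pairs(predicted)
    true_positives = len(ref_pairs & pred_pairs)
    sensitivity = true_positives / len(ref_pairs) if ref_pairs else 0.0
    selectivity = true_positives / len(pred_pairs) if pred_pairs else 0.0
    return sensitivity, selectivity
```

    A prediction that recovers one of two reference pairs and predicts nothing else scores 0.5 for sensitivity and 1.0 for selectivity, which shows why the two numbers have to be reported together.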

  14. Tools for comparative protein structure modeling and analysis.

    PubMed

    Eswar, Narayanan; John, Bino; Mirkovic, Nebojsa; Fiser, Andras; Ilyin, Valentin A; Pieper, Ursula; Stuart, Ashley C; Marti-Renom, Marc A; Madhusudhan, M S; Yerkovich, Bozidar; Sali, Andrej

    2003-07-01

    The following resources for comparative protein structure modeling and analysis are described (http://salilab.org): MODELLER, a program for comparative modeling by satisfaction of spatial restraints; MODWEB, a web server for automated comparative modeling that relies on PSI-BLAST, IMPALA and MODELLER; MODLOOP, a web server for automated loop modeling that relies on MODELLER; MOULDER, a CPU intensive protocol of MODWEB for building comparative models based on distant known structures; MODBASE, a comprehensive database of annotated comparative models for all sequences detectably related to a known structure; MODVIEW, a Netscape plugin for Linux that integrates viewing of multiple sequences and structures; and SNPWEB, a web server for structure-based prediction of the functional impact of a single amino acid substitution.

  15. Teaching bioinformatics in concert.

    PubMed

    Goodman, Anya L; Dekhtyar, Alex

    2014-11-01

    Can biology students without programming skills solve problems that require computational solutions? They can if they learn to cooperate effectively with computer science students. The goal of the in-concert teaching approach is to introduce biology students to computational thinking by engaging them in collaborative projects structured around the software development process. Our approach emphasizes development of interdisciplinary communication and collaboration skills for both life science and computer science students.

  16. Bioinformatics in the information age

    SciTech Connect

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  17. Ontologies for Bioinformatics

    PubMed Central

    Schuurman, Nadine; Leszczynski, Agnieszka

    2008-01-01

    The past twenty years have witnessed an explosion of biological data in diverse database formats governed by heterogeneous infrastructures. Not only are semantics (attribute terms) different in meaning across databases, but their organization varies widely. Ontologies are a concept imported from computing science to describe different conceptual frameworks that guide the collection, organization and publication of biological data. An ontology is similar to a paradigm but has very strict implications for formatting and meaning in a computational context. The use of ontologies is a means of communicating and resolving semantic and organizational differences between biological databases in order to enhance their integration. The purpose of interoperability (or sharing between divergent storage and semantic protocols) is to allow scientists from around the world to share and communicate with each other. This paper describes the rapid accumulation of biological data, its various organizational structures, and the role that ontologies play in interoperability. PMID:19812775

  18. Omics technologies, data and bioinformatics principles.

    PubMed

    Schneider, Maria V; Orchard, Sandra

    2011-01-01

    We provide an overview on the state of the art for the Omics technologies, the types of omics data and the bioinformatics resources relevant and related to Omics. We also illustrate the bioinformatics challenges of dealing with high-throughput data. This overview touches several fundamental aspects of Omics and bioinformatics: data standardisation, data sharing, storing Omics data appropriately and exploring Omics data in bioinformatics. Though the principles and concepts presented are true for the various different technological fields, we concentrate in three main Omics fields namely: genomics, transcriptomics and proteomics. Finally we address the integration of Omics data, and provide several useful links for bioinformatics and Omics.

  19. A Bioinformatics Facility for NASA

    NASA Technical Reports Server (NTRS)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  20. Protein bioinformatics applied to virology.

    PubMed

    Mohabatkar, Hassan; Keyhanfar, Mehrnaz; Behbahani, Mandana

    2012-09-01

    Scientists have united in a common search to sequence, store and analyze genes and proteins. In this regard, rapidly evolving bioinformatics methods are providing valuable information on these newly-discovered molecules. Understanding what has been done and what we can do in silico is essential in designing new experiments. The imbalance between sequence-known proteins and attribute-known proteins has called for the development of computational methods or high-throughput automated tools for fast and reliable prediction or identification of various characteristics of uncharacterized proteins. Taking into consideration the role of viruses in causing diseases and their use in biotechnology, the present review describes the application of protein bioinformatics in virology. Accordingly, a number of important aspects of viral proteins, such as epitope prediction, protein docking, subcellular localization, viral protease cleavage sites and computer-based comparison of their properties, are discussed. This paper also describes several tools principally developed for viral bioinformatics. Predicting viral protein features and keeping up with advances in this field can aid basic understanding of the relationship between a virus and its host.

  1. Bioinformatic identification of plant peptides.

    PubMed

    Lease, Kevin A; Walker, John C

    2010-01-01

    Plant peptides play a number of important roles in defence, development and many other aspects of plant physiology. Identifying additional peptide sequences provides the starting point to investigate their function using molecular, genetic or biochemical techniques. Due to their small size, identifying peptide sequences may not succeed using the default bioinformatic approaches that work well for average-sized proteins. There are two general scenarios related to bioinformatic identification of peptides to be discussed in this paper. In the first scenario, one already has the sequence of a plant peptide and is trying to find more plant peptides with some sequence similarity to the starting peptide. To do this, the Basic Local Alignment Search Tool (BLAST) is employed, with the parameters adjusted to be more favourable for identifying potential peptide matches. A second scenario involves trying to identify plant peptides without using sequence similarity searches to known plant peptides. In this approach, features such as protein size and the presence of a cleavable amino-terminal signal peptide are used to screen annotated proteins. A variation of this method can be used to screen for unannotated peptides from genomic sequences. Bioinformatic resources related to Arabidopsis thaliana will be used to illustrate these approaches.

  2. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  3. Robust enzyme design: bioinformatic tools for improved protein stability.

    PubMed

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.

  4. A Statistical Test for Comparing Nonnested Covariance Structure Models.

    ERIC Educational Resources Information Center

    Levy, Roy; Hancock, Gregory R.

    While statistical procedures are well known for comparing hierarchically related (nested) covariance structure models, statistical tests for comparing nonhierarchically related (nonnested) models have proven more elusive. While isolated attempts have been made, none exists within the commonly used maximum likelihood estimation framework, thereby…

  5. Comparative modeling of InP solar cell structures

    NASA Technical Reports Server (NTRS)

    Jain, R. K.; Weinberg, I.; Flood, D. J.

    1991-01-01

    The comparative modeling of p(+)n and n(+)p indium phosphide solar cell structures is studied using a numerical program PC-1D. The optimal design study has predicted that the p(+)n structure offers improved cell efficiencies as compared to n(+)p structure, due to higher open-circuit voltage. The various cell material and process parameters to achieve the maximum cell efficiencies are reported. The effect of some of the cell parameters on InP cell I-V characteristics was studied. The available radiation resistance data on n(+)p and p(+)p InP solar cells are also critically discussed.

  6. Bioinformatic Analysis of Gene Expression for Melanoma Treatment

    PubMed Central

    Kawakami, Akinori; Fisher, David E.

    2016-01-01

    Bioinformatic analysis of genome-wide gene expression allows us to characterize cells, including melanomas. Gene expression profiles have been generated in various stages of melanomas and analyzed by researchers in unique ways. Lauss et al. compared their melanoma subtypes with those of The Cancer Genome Atlas Network and found consistency between the two studies. PMID:27884291

  7. Bioinformatics in Africa: The Rise of Ghana?

    PubMed Central

    Karikari, Thomas K.

    2015-01-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  8. DSSTOX STRUCTURE-SEARCHABLE PUBLIC TOXICITY DATABASE NETWORK: CURRENT PROGRESS AND NEW INITIATIVES TO IMPROVE CHEMO-BIOINFORMATICS CAPABILITIES

    EPA Science Inventory

    The EPA DSSTox website (http://www.epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...

  9. Comparative testing of nondestructive examination techniques for concrete structures

    NASA Astrophysics Data System (ADS)

    Clayton, Dwight A.; Smith, Cyrus M.

    2014-03-01

    A multitude of concrete-based structures are typically part of a light water reactor (LWR) plant to provide foundation, support, shielding, and containment functions. Concrete has been used in the construction of nuclear power plants (NPPs) because of three primary properties: its inexpensiveness, its structural strength, and its ability to shield radiation. Examples of concrete structures important to the safety of LWR plants include the containment building, spent fuel pool, and cooling towers. Comparative testing of the various NDE concrete measurement techniques requires concrete samples with known material properties, voids, internal microstructure flaws, and reinforcement locations. These samples can be artificially created under laboratory conditions where the various properties can be controlled. Other than NPPs, there are not many applications where critical concrete structures are as thick and reinforced. Therefore, there are not many industries other than the nuclear power plant or power plant industry that are interested in performing NDE on thick and reinforced concrete structures. This leads to a lack of readily available samples of thick and heavily reinforced concrete for performing NDE evaluations, research, and training. The industry that typically performs the most NDE on concrete structures is the bridge and roadway industry. While bridge and roadway structures are thinner and less reinforced, they have a good base of NDE research to support their field NDE programs to detect, identify, and repair concrete failures. This paper will summarize the initial comparative testing of two concrete samples with an emphasis on how these techniques could perform on NPP concrete structures.

  10. Development of computations in bioscience and bioinformatics and its application: review of the Symposium of Computations in Bioinformatics and Bioscience (SCBB06).

    PubMed

    Deng, Youping; Ni, Jun; Zhang, Chaoyang

    2006-12-12

    The first symposium of computations in bioinformatics and bioscience (SCBB06) was held in Hangzhou, China on June 21-22, 2006. Twenty-six peer-reviewed papers were selected for publication in this special issue of BMC Bioinformatics. These papers cover a broad range of topics including bioinformatics theories, algorithms, applications and tool development. The main technical topics include gene expression analysis, sequence analysis, genome analysis, phylogenetic analysis, gene function prediction, molecular interaction and systems biology, genetics and population study, immune strategy, protein structure prediction and proteomics.

  11. Emerging strengths in Asia Pacific bioinformatics

    PubMed Central

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-01-01

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20–23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts. PMID:19091008

  12. Bioinformatics by Example: From Sequence to Target

    NASA Astrophysics Data System (ADS)

    Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

    2002-12-01

    With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience is poised to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example the reader will experience this fascinating new discipline.

  13. Rapid Bioinformatic Identification of Thermostabilizing Mutations

    PubMed Central

    Sauer, David B.; Karpowich, Nathan K.; Song, Jin Mei; Wang, Da-Neng

    2015-01-01

    Ex vivo stability is a valuable protein characteristic but is laborious to improve experimentally. In addition to biopharmaceutical and industrial applications, stable protein is important for biochemical and structural studies. Taking advantage of the large number of available genomic sequences and growth temperature data, we present two bioinformatic methods to identify a limited set of amino acids or positions that likely underlie thermostability. Because these methods allow thousands of homologs to be examined in silico, they have the advantage of providing both speed and statistical power. Using these methods, we introduced, via mutation, amino acids from thermoadapted homologs into an exemplar mesophilic membrane protein, and demonstrated significantly increased thermostability while preserving protein activity. PMID:26445442

  14. Bioinformatics for personal genome interpretation.

    PubMed

    Capriotti, Emidio; Nehrt, Nathan L; Kann, Maricel G; Bromberg, Yana

    2012-07-01

    An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of this data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field--the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.

  15. MODBASE, a database of annotated comparative protein structure models.

    PubMed

    Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C; Ilyin, Valentin A; Sali, Andrej

    2002-01-01

    MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10^-4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server.

  16. Bioinformatics study of the mangrove actin genes

    NASA Astrophysics Data System (ADS)

    Basyuni, M.; Wasilah, M.; Sumardi

    2017-01-01

    This study describes bioinformatics methods used to analyze eight actin genes from mangrove plants deposited in DDBJ/EMBL/GenBank and to predict their structure, composition, subcellular localization, similarity, and phylogeny. The physical and chemical properties of the eight mangrove actins varied among the genes. The proportions of secondary structure elements followed the order α helix > random coil > extended chain structure for BgActl, KcActl, RsActl, and A. corniculatum Act. In contrast, the remaining actin genes followed the order random coil > extended chain structure > α helix. Secondary structure prediction was thus performed to obtain the necessary structural information. The predicted values for chloroplast or mitochondrial targeting and for a signal peptide were too small, indicating that the mangrove actin genes carry no chloroplast or mitochondrial transit peptide and no signal peptide for the secretory pathway. These results suggest the importance of understanding the diversity and functional properties of the different amino acids in mangrove actin genes. To clarify the relationships among the mangrove actin genes, a phylogenetic tree was constructed. Three groups of mangrove actin genes were formed: the first group contains B. gymnorrhiza BgAct and R. stylosa RsActl; the second cluster, the largest group, consists of five actin genes; and the last branch consists of one gene, B. sexangula Act. The present study therefore supports previous results showing that plant actin genes form distinct clusters in the tree.

  17. Evolving Strategies for the Incorporation of Bioinformatics Within the Undergraduate Cell Biology Curriculum

    PubMed Central

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489

  18. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  19. Structures of School Systems Worldwide: A Comparative Study

    ERIC Educational Resources Information Center

    Popov, Nikolay

    2012-01-01

    In the past 20 years I have been examining the structures of school systems worldwide. This ongoing research has been enriched by the findings obtained from the lecture course on Comparative Education I have been delivering to students in the Bachelor and Master's Education Programs at Sofia University, Bulgaria. This paper presents some results…

  20. Comparing High-latitude Ionospheric and Thermospheric Lagrangian Coherent Structures

    NASA Astrophysics Data System (ADS)

    Wang, N.; Ramirez, U.; Flores, F.; Okic, D.; Datta-Barua, S.

    2015-12-01

    Lagrangian Coherent Structures (LCSs) are invisible boundaries in time-varying flow fields that may be subject to mixing and turbulence. The LCS is defined by the local maxima of the finite time Lyapunov exponent (FTLE), a scalar field quantifying the degree of stretching of fluid elements over the flow domain. Although the thermosphere is dominated by neutral wind processes and the ionosphere is governed by plasma electrodynamics, we can compare the LCS in the two modeled flow fields to yield insight into transport and interaction processes in the high-latitude IT system. For obtaining thermospheric LCS, we use the Horizontal Wind Model 2014 (HWM14) [1] at a single altitude to generate the two-dimensional velocity field. The FTLE computation is applied to study the flow field of the neutral wind, and to visualize the forward-time Lagrangian Coherent Structures in the flow domain. The time-varying structures indicate a possible thermospheric LCS ridge in the auroral oval area. The results of a two-day run during a geomagnetically quiet period show that the structures are diurnally quasi-periodic, suggesting that solar radiation influences the neutral wind flow field. To find the LCS in the high-latitude ionospheric drifts, the Weimer 2001 [2] polar electric potential model and the International Geomagnetic Reference Field 11 [3] are used to compute the E×B drift flow field in the ionosphere. As with the neutral winds, the Lagrangian Coherent Structures are obtained by applying the FTLE computation. The relationship between the thermospheric and ionospheric LCS is analyzed by comparing overlapping FTLE maps. Both a publicly available FTLE solver [4] and a custom-built FTLE computation are used and compared for validation [5]. Comparing the modeled IT LCSs on a quiet day with the modeled IT LCSs on a storm day indicates important factors affecting the structure and time evolution of the LCS.
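
    The FTLE definition in the abstract above can be illustrated with a minimal sketch. It is not the solver used in the study: the velocity field here is a hypothetical analytic saddle flow (not HWM14 or the Weimer model), and the function names `advect`, `ftle` and `saddle` are assumptions made for illustration. Tracers are advected with a fourth-order Runge-Kutta scheme, the flow-map gradient is estimated by central differences, and the FTLE is taken from the largest eigenvalue of the Cauchy-Green tensor.

```python
import math


def advect(velocity, x, y, t0, T, steps=200):
    """Integrate a tracer from (x, y) over [t0, t0 + T] with classical RK4."""
    h = T / steps
    t = t0
    for _ in range(steps):
        k1x, k1y = velocity(x, y, t)
        k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y, t + 0.5 * h)
        k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y, t + 0.5 * h)
        k4x, k4y = velocity(x + h * k3x, y + h * k3y, t + h)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        t += h
    return x, y


def ftle(velocity, x, y, t0, T, d=1e-4):
    """Finite-time Lyapunov exponent at (x, y) for integration time T."""
    # Advect four neighbours to estimate the flow-map gradient F.
    xr, yr = advect(velocity, x + d, y, t0, T)
    xl, yl = advect(velocity, x - d, y, t0, T)
    xu, yu = advect(velocity, x, y + d, t0, T)
    xb, yb = advect(velocity, x, y - d, t0, T)
    f11, f12 = (xr - xl) / (2 * d), (xu - xb) / (2 * d)
    f21, f22 = (yr - yl) / (2 * d), (yu - yb) / (2 * d)
    # Cauchy-Green tensor C = F^T F (symmetric 2x2).
    c11 = f11 * f11 + f21 * f21
    c12 = f11 * f12 + f21 * f22
    c22 = f12 * f12 + f22 * f22
    # Largest eigenvalue of C in closed form.
    half_trace = 0.5 * (c11 + c22)
    det = c11 * c22 - c12 * c12
    lam_max = half_trace + math.sqrt(max(half_trace ** 2 - det, 0.0))
    return math.log(math.sqrt(lam_max)) / abs(T)


def saddle(x, y, t):
    """Hypothetical test flow: stretching along x, compression along y."""
    return x, -y


# For this linear saddle the stretching rate is 1 everywhere.
value = ftle(saddle, 0.3, -0.2, t0=0.0, T=2.0)
print(round(value, 6))
```

    Evaluating `ftle` on a grid of starting points and locating the ridges of the resulting field would give the LCS candidates; for a modeled wind or drift field, `velocity` would be replaced by an interpolator over the model output.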

  1. Bioinformatic Identification of Rare Codon Clusters (RCCs) in HBV Genome and Evaluation of RCCs in Proteins Structure of Hepatitis B Virus

    PubMed Central

    Mortazavi, Mojtaba; Zarenezhad, Mohammad; Gholamzadeh, Saeid; Alavian, Seyed Moayed; Ghorbani, Mohammad; Dehghani, Reza; Malekpour, Abdorrasoul; Meshkibaf, Mohammadhasan; Fakhrzad, Ali

    2016-01-01

    Background: Hepatitis B virus (HBV) is an infectious agent that has nine genotypes (A - I) and a ‘putative’ genotype J. Objectives: The aim of this study was to identify the rare codon clusters (RCC) in the HBV genome and to evaluate these RCCs in the HBV proteins structure. Methods: To detect protein family (Pfam) accession numbers for HBV proteins, the UniProt database and the Pfam search tool were used. Pfam is a comprehensive and accurate collection of protein domains and families. It contains annotation of each family in the form of textual descriptions, links to other resources and literature references. Genome projects have used Pfam extensively for large-scale functional annotation of genomic data; the Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). The Pfam search tools identify the Pfam families of proteins. These Pfam IDs were analyzed in the Sherlocc program, and the locations of RCCs in the HBV genome and proteins were detected and reported as translated EMBL nucleotide sequence data library (TrEMBL) entries. TrEMBL is a computer-annotated supplement of SWISS-PROT that contains all the translations of European molecular biology laboratory (EMBL) nucleotide sequence entries not yet integrated in SWISS-PROT. Furthermore, the structures of the proteins in the TrEMBL entries were studied in the PDB database, and the 3D structures of the HBV proteins and the locations of RCCs were visualized and studied using Swiss PDB Viewer software®. Results: The Pfam search tool found nine protein families in three frames. Analysis of these Pfams in the Sherlocc program showed that the program did not identify RCCs in the external core antigen (PF08290) and truncated HBeAg gene (PF08290) of HBV. By contrast, RCCs were identified in the gene of hepatitis core antigen (PF00906 and the residues 224 - 234 and 251 - 255), large envelope protein S (PF00695 and the residues

  2. Structural and bioinformatic analysis of the kiwifruit allergen Act d 11, a member of the family of ripening-related proteins.

    PubMed

    Chruszcz, Maksymilian; Ciardiello, Maria Antonietta; Osinski, Tomasz; Majorek, Karolina A; Giangrieco, Ivana; Font, Jose; Breiteneder, Heimo; Thalassinos, Konstantinos; Minor, Wladek

    2013-12-01

    The allergen Act d 11, also known as kirola, is a 17 kDa protein expressed in large amounts in ripe green and yellow-fleshed kiwifruit. Ten percent of all kiwifruit-allergic individuals produce IgE specific for the protein. Using X-ray crystallography, we determined the first three-dimensional structures of Act d 11, produced from both recombinant expression in Escherichia coli and from the natural source (kiwifruit). While Act d 11 is immunologically correlated with the birch pollen allergen Bet v 1 and other members of the pathogenesis-related protein family 10 (PR-10), it has low sequence similarity to PR-10 proteins. By sequence Act d 11 appears instead to belong to the major latex/ripening-related (MLP/RRP) family, but analysis of the crystal structures shows that Act d 11 has a fold very similar to that of Bet v 1 and other PR-10 related allergens regardless of the low sequence identity. The structures of both the natural and recombinant protein include an unidentified ligand, which is relatively small (about 250 Da by mass spectrometry experiments) and most likely contains an aromatic ring. The ligand-binding cavity in Act d 11 is also significantly smaller than those in PR-10 proteins. The binding of the ligand, which we were not able to unambiguously identify, results in conformational changes in the protein that may have physiological and immunological implications. Interestingly, the residue corresponding to Glu45 in Bet v 1 (Glu46), which is important for IgE binding to the birch pollen allergen, is conserved in Act d 11, even though it is not in other allergens with significantly higher sequence identity to Bet v 1. We suggest that the so-called Gly-rich loop (or P-loop), which is conserved in all PR-10 allergens, may be responsible for IgE cross-reactivity between Bet v 1 and Act d 11.

  3. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  4. A Mathematical Optimization Problem in Bioinformatics

    ERIC Educational Resources Information Center

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
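
    As an illustration of the dynamic-programming formulation mentioned above, the following is a minimal sketch of global alignment in the style of Needleman-Wunsch. It is not taken from the article: the scoring values (match +1, mismatch -1, gap -2) and the function name `global_alignment` are assumptions chosen for the example.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment.

    The scoring scheme is an illustrative assumption, not a recommendation.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = global_alignment("ACGT", "AGT")
print(best)    # 1
print(top)     # ACGT
print(bottom)  # A-GT
```

    The table has (n + 1) × (m + 1) cells and each is filled in constant time, which is what makes the optimization tractable compared with enumerating all alignments.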

  5. Fuzzy Logic in Medicine and Bioinformatics

    PubMed Central

    Torres, Angela; Nieto, Juan J.

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes). PMID:16883057
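
    The geometrical picture mentioned above can be sketched in a few lines: a fuzzy set over n elements is a point in the unit hypercube [0, 1]^n, and crisp sets are its vertices. The sketch below assumes one standard construction (Kosko's fuzzy entropy, the ratio of the Hamming distances to the nearest and farthest vertices); whether the paper uses exactly this measure is an assumption, and the function names are hypothetical.

```python
def hamming(a, b):
    """Hamming (l1) distance between two points of the unit hypercube."""
    return sum(abs(p - q) for p, q in zip(a, b))


def nearest_vertex(a):
    """Crisp set closest to the fuzzy set a (ties at 0.5 go to 1)."""
    return tuple(1.0 if m >= 0.5 else 0.0 for m in a)


def farthest_vertex(a):
    """Crisp set farthest from a: the complement of the nearest vertex."""
    return tuple(1.0 - v for v in nearest_vertex(a))


def fuzzy_entropy(a):
    """Kosko-style entropy: 0 at a vertex, 1 at the hypercube midpoint."""
    return hamming(a, nearest_vertex(a)) / hamming(a, farthest_vertex(a))


# Membership degrees are illustrative values, not data from the paper.
patient = (0.25, 0.75, 0.5)
print(fuzzy_entropy(patient))  # 0.5
```

    Under this view, comparing two fuzzy descriptions (for example two patients or two encoded sequences) reduces to measuring a distance between two points of the same hypercube with `hamming`.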

  6. Rapid Development of Bioinformatics Education in China

    ERIC Educational Resources Information Center

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related…

  7. Biology in 'silico': The Bioinformatics Revolution.

    ERIC Educational Resources Information Center

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and characterizes it as the Swiss Army knife of genetics, with many different uses in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  8. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  9. Technical phosphoproteomic and bioinformatic tools useful in cancer research

    PubMed Central

    2011-01-01

    Reversible protein phosphorylation is one of the most important forms of cellular regulation. Thus, phosphoproteomic analysis of protein phosphorylation in cells is a powerful tool to evaluate cell functional status. The importance of protein kinase-regulated signal transduction pathways in human cancer has led to the development of drugs that inhibit protein kinases at the apex or intermediary levels of these pathways. Phosphoproteomic analysis of these signalling pathways will provide important insights into their operation and connectivity, facilitating identification of the best targets for cancer therapies. Enrichment of phosphorylated proteins or peptides from tissue or bodily fluid samples is required. The application of technologies such as phosphoenrichment and mass spectrometry (MS), coupled to bioinformatics tools, is crucial for the identification and quantification of protein phosphorylation sites and for advancing this clinically relevant research. A combination of different phosphopeptide enrichments, quantitative techniques and bioinformatic tools is necessary to achieve good phospho-regulation data and good structural analysis in protein studies. The current and most useful proteomics and bioinformatics techniques will be explained with research examples. Our aim in this article is to aid cancer research by detailing proteomics and bioinformatic tools. PMID:21967744

  10. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    PubMed

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes, the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software, the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.

  11. Accuracy of functional surfaces on comparatively modeled protein structures

    PubMed Central

    Zhao, Jieling; Dundas, Joe; Kachalo, Sema; Ouyang, Zheng; Liang, Jie

    2012-01-01

    Identification and characterization of protein functional surfaces are important for predicting protein function, understanding enzyme mechanism, and docking small compounds to proteins. As the rapid speed of accumulation of protein sequence information far exceeds that of structures, constructing accurate models of protein functional surfaces and identifying their key elements becomes increasingly important. A promising approach is to build comparative models from sequences using known structural templates such as those obtained from structural genome projects. Here we assess how well this approach works in modeling binding surfaces. By systematically building three-dimensional comparative models of proteins using Modeller, we determine how well functional surfaces can be accurately reproduced. We use an alpha shape based pocket algorithm to compute all pockets on the modeled structures, and conduct a large-scale computation of similarity measurements (pocket RMSD and fraction of functional atoms captured) for 26,590 modeled enzyme protein structures. Overall, we find that when the sequence fragment of the binding surfaces has more than 45% identity to that of the template protein, the modeled surfaces have on average an RMSD of 0.5 Å, and contain 48% or more of the binding surface atoms, with nearly all of the important atoms in the signatures of binding pockets captured. PMID:21541664
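
    The two similarity measurements named in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: atoms are already paired one-to-one and the two structures are already superimposed (no optimal fitting is performed), and the function names and the identifier-based notion of a "captured" atom are hypothetical, not the authors' pipeline.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates.

    Assumes the atoms are listed in matching order and the structures are
    already superimposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def fraction_captured(reference_atoms, modeled_atoms):
    """Fraction of reference surface atoms (by identifier) present in the model."""
    reference = set(reference_atoms)
    if not reference:
        raise ValueError("reference surface has no atoms")
    return len(reference & set(modeled_atoms)) / len(reference)
```

    For example, a modeled pocket that recovers two of four reference atoms gives a captured fraction of 0.5, independent of how well the recovered atoms are positioned, which is why the two measures are reported together.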

  12. The 2016 Bioinformatics Open Source Conference (BOSC)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science. PMID:27781083

  13. Bioinformatics clouds for big data manipulation

    PubMed Central

    2012-01-01

    Abstract: As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475

  14. [Comparative analysis of spatial organization of myoglobins. II. Secondary structure].

    PubMed

    Korobov, V N; Nazarenko, V I; Radomskiĭ, N F; Starodub, N F

    1992-01-01

    An analysis of the probability distribution curves of alpha-helical sites and bends of the polypeptide chains of myoglobins of semiaquatic mammals (beaver, nutria, muskrat, otter), carried out in comparison with those of the myoglobins of the horse and sperm whale (whose tertiary structure has been revealed by X-ray diffraction analysis), has revealed a coincidence of the secondary structure sites and bends of the chain in the studied respiratory hemoproteins of muscles. Despite a considerable number of amino acid substitutions, the profiles of alpha-helicity and β-bends of the compared proteins are practically identical. This indicates the "resistance" of the probability curves to amino acid substitutions and the retention of the tertiary structure of myoglobins in evolutionarily remote animal species.

  15. Computational biology and bioinformatics in Nigeria.

    PubMed

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  16. When cloud computing meets bioinformatics: a review.

    PubMed

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
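
    As a rough illustration of the MapReduce model this review introduces, the sketch below counts k-mers in sequencing reads using explicit map, shuffle and reduce steps. It runs in a single process and only mimics the programming model; the function names and the k-mer counting task are assumptions chosen for the example, not taken from the paper.

```python
from collections import defaultdict

def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every window of length k."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]

def shuffle(pairs):
    """Shuffle step: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)

def count_kmers(reads, k):
    """Run map, shuffle and reduce over all reads and return k-mer counts."""
    pairs = [pair for read in reads for pair in map_kmers(read, k)]
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())
```

    In a real cluster the map calls would run in parallel across reads and the reduce calls across keys; the logic of each step stays the same.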

  17. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  18. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  19. A comparative structural study of wet and dried ettringite

    SciTech Connect

    Renaudin, G.; Filinchuk, Y.; Neubauer, J.; Goetz-Neunhoeffer, F.

    2010-03-15

    Two different techniques were used to compare structural characteristics of 'wet' ettringite (stored in the synthesis mother liquid) and 'dried' ettringite (dried to 35% relative humidity over saturated CaCl2 solution). Lattice parameters and the water content in the channel region of the structure (site occupancy factor of the water molecule not bonded to cations) as well as microstructure parameters (size and strain) were determined from a Rietveld refinement on synchrotron powder diffraction data. Local environment of sulphate anions and of the hydrogen bonding network was characterized by Raman spectroscopy. Both techniques led to the same conclusion: the 'wet' ettringite sample immersed in the mother solution from the synthesis presents similar structural features as ettringite dried to 35% relative humidity. An increase of the a lattice parameter combined with a decrease of the c lattice parameter occurs on drying. The amount of structural water, the point symmetry of sulphate and the hydrogen bond network are unchanged when passing from the wet to the dried ettringite powder. Ettringite does not form a high-hydrate polymorph in equilibrium with alkaline solution, in contrast to the AFm phases that lose water molecules on drying. According to these results we conclude that ettringite precipitated in aqueous solution at the early hydration stages is of the same chemical composition as ettringite present in the hardening concrete.

  20. Comparative population structure of cavity-nesting sea ducks

    USGS Publications Warehouse

    Pearce, John M.; Eadie, John M.; Savard, Jean-Pierre L.; Christensen, Thomas K.; Berdeen, James; Taylor, Eric J.; Boyd, Sean; Einarsson, Árni

    2014-01-01

    A growing collection of mtDNA genetic information from waterfowl species across North America suggests that larger-bodied cavity-nesting species exhibit greater levels of population differentiation than smaller-bodied congeners. Although little is known about nest-cavity availability for these species, one hypothesis to explain differences in population structure is reduced dispersal tendency of larger-bodied cavity-nesting species due to limited abundance of large cavities. To investigate this hypothesis, we examined population structure of three cavity-nesting waterfowl species distributed across much of North America: Barrow's Goldeneye (Bucephala islandica), Common Goldeneye (B. clangula), and Bufflehead (B. albeola). We compared patterns of population structure using both variation in mtDNA control-region sequences and band-recovery data for the same species and geographic regions. Results were highly congruent between data types, showing structured population patterns for Barrow's and Common Goldeneye but not for Bufflehead. Consistent with our prediction, the smallest cavity-nesting species, the Bufflehead, exhibited the lowest level of population differentiation due to increased dispersal and gene flow. Results provide evidence for discrete Old and New World populations of Common Goldeneye and for differentiation of regional groups of both goldeneye species in Alaska, the Pacific Northwest, and the eastern coast of North America. Results presented here will aid management objectives that require an understanding of population delineation and migratory connectivity between breeding and wintering areas. Comparative studies such as this one highlight factors that may drive patterns of genetic diversity and population trends.

  1. Combined Biology and Bioinformatics Approaches to Breast Cancer

    DTIC Science & Technology

    2007-10-01

    embryogenesis or at birth, precluding their use for studying the role of LMO4 in postnatal mammary gland development[9]. So, we employed Whey Acidic...PCR showed that LMO4 knockout mice have low mRNA expression level of milk protein (whey and casein) compared to their control (data not shown...taking advantage of my bioinformatics training, I propose to develop novel algorithms capable of processing information from mammary gland

  2. PATRIC, the bacterial bioinformatics database and analysis resource

    PubMed Central

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  3. Bioinformatics and the Undergraduate Curriculum Essay

    PubMed Central

    Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of bioinformatics as a new discipline has challenged many colleges and universities to keep current with their curricula, often in the face of static or dwindling resources. On the plus side, many bioinformatics modules and related databases and software programs are free and accessible online, and interdisciplinary partnerships between existing faculty members and their support staff have proved advantageous in such efforts. We present examples of strategies and methods that have been successfully used to incorporate bioinformatics content into undergraduate curricula. PMID:20810947

  4. Bioinformatics in Italy: BITS2011, the Eighth Annual Meeting of the Italian Society of Bioinformatics

    PubMed Central

    2012-01-01

    The BITS2011 meeting, held in Pisa on June 20-22, 2011, brought together more than 120 Italian researchers working in the field of Bioinformatics, as well as students in Bioinformatics, Computational Biology, Biology, Computer Sciences, and Engineering, representing a landscape of Italian bioinformatics research. This preface provides a brief overview of the meeting and introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement. PMID:22536954

  5. No-boundary thinking in bioinformatics research

    PubMed Central

    2013-01-01

    Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT). PMID:24192339

  6. Use or abuse of bioinformatic tools: a response to Samach.

    PubMed

    Muñoz-Fambuena, Natalia; Mesejo, Carlos; González-Mas, María C; Primo-Millo, Eduardo; Agustí, Manuel; Iglesias, Domingo J

    2013-03-01

    In a recent paper, we described for the first time the effects of fruit on the expression of putative homologues of genes involved in flowering pathways. It was our aim to provide insight into the molecular mechanisms underlying alternate bearing in citrus. However, a bioinformatics-based critique of our and other related papers has been given by Samach in the preceding Viewpoint article in this issue of Annals of Botany. The use of certain bioinformatic tools in a context of structural rather than functional genomics can cast doubts about the veracity of a large amount of data published in recent years. In this response, the contentions raised by Samach are analysed, and rebuttals of his criticisms are presented.

  7. Comparative Structure of Saturn's Rings from Cassini Radio Occultation Observations

    NASA Astrophysics Data System (ADS)

    Marouf, Essam A.; French, R. G.; Rappaport, N. J.; McGhee, C. A.; Wong, K.; Thomson, F. S.; Anabtawi, A.

    2007-10-01

    Radio occultations of Saturn's rings during the Cassini prime mission fall into three main groups, depending on the rings opening angle B. The first is a set of eight diametric occultations completed early in the mission (March-September/2005) when |B| was relatively large (19.5 to 23.5°). They permitted multiple-longitude profiling of relatively optically thick ring features, revealing detailed structure of enigmatic Ring B. The second is to be completed late in the mission when the rings are relatively closed (|B| < 10°). They will provide enhanced sensitivity to tenuous ring material, hence complementary information about small optical depth structure. Bridging the two groups is a third composed of two specially designed occultations recently completed (May-June/2007). They capture the intermediate range |B| ~ 15°. Because the rings were still reasonably open, much of the structure was profiled. The different occultation geometry from the diametric group provided enhanced sensitivity to bending waves and other inclined features. We comparatively consider variability (or lack of) of observed ring structure with B and longitude. The variability when present can be true (dynamically forced features) or apparent (azimuthal asymmetry due to preferentially aligned gravitational wakes). The multiple-longitude coverage provides rich characterization of the true variability, including remarkable variations in the morphology of gap-embedded ringlets in Ring C, clear variations in the width of gaps in the Cassini Division, wavelike features in Ring C (the "Rosen Waves"), classical satellite wake profiles due to Pan, in addition to many density and few bending waves. For the apparent asymmetry, observed optical depth variations with B, viewing geometry, and wavelength constrain physical properties of the rings microstructure (particle sizes, particle-cluster sizes and orientation, spatial cluster density, vertical ring profile and physical thickness, ...). Complementary

  8. Comparing and distinguishing the structure of biological branching.

    PubMed

    Lamberton, Timothy O; Lefevre, James; Short, Kieran M; Smyth, Ian M; Hamilton, Nicholas A

    2015-01-21

    Bifurcating developmental branching morphogenesis gives rise to complex organs such as the lung and the ureteric tree of the kidney. However, few quantitative methods or tools exist to compare and distinguish, at a structural level, the critical features of these important biological systems. Here we develop novel graph alignment techniques to quantify the structural differences of rooted bifurcating trees and demonstrate their application in the analysis of developing kidneys from normal and mutant mice. We have developed two graph-based metrics: graph discordance, which measures how well the graphs representing the branching structures of distinct tree graphs can be aligned or overlaid; and graph inclusion, which measures the degree of containment of a tree graph within another. To demonstrate the application of these approaches we first benchmark the discordance metric on a data set of 32 normal and 28 Tgfβ(+/-) mutant mouse ureteric trees. We find that the discordance metric better distinguishes control and mutant mouse kidneys than alternative metrics based on graph size and fingerprints (the distribution of tip depths). Using this metric we then show that the structure of the mutant trees follows the same pattern as the normal kidneys, but undergoes a major delay in elaboration at later stages. Analysis of both controls and mutants using the inclusion metric gives strong support to the hypothesis that ureteric tree growth is stereotypic. Additionally, we present a new generalised multi-tree alignment algorithm that minimises the sum of pairwise graph discordance and which can be used to generate maximum consensus trees that represent the archetype for fixed developmental stages. These tools represent an advance in the analysis and quantification of branching patterns and will be invaluable in gaining a deeper understanding of the mechanisms that drive development. All code is being made available with documentation and example data with this publication.

  9. Bioinformatic Approaches to Metabolic Pathways Analysis

    PubMed Central

    Maudsley, Stuart; Chadwick, Wayne; Wang, Liyun; Zhou, Yu; Martin, Bronwen; Park, Sung-Soo

    2015-01-01

    The growth and development in the last decade of accurate and reliable mass data collection techniques has greatly enhanced our comprehension of cell signaling networks and pathways. At the same time however, these technological advances have also increased the difficulty of satisfactorily analyzing and interpreting these ever-expanding datasets. At the present time, multiple diverse scientific communities including molecular biological, genetic, proteomic, bioinformatic, and cell biological, are converging upon a common endpoint, that is, the measurement, interpretation, and potential prediction of signal transduction cascade activity from mass datasets. Our ever increasing appreciation of the complexity of cellular or receptor signaling output and the structural coordination of intracellular signaling cascades has to some extent necessitated the generation of a new branch of informatics that more closely associates functional signaling effects to biological actions and even whole-animal phenotypes. The ability to untangle and hopefully generate theoretical models of signal transduction information flow from transmembrane receptor systems to physiological and pharmacological actions may be one of the greatest advances in cell signaling science. In this overview, we shall attempt to assist the navigation into this new field of cell signaling and highlight several methodologies and technologies to appreciate this exciting new age of signal transduction. PMID:21870222

  10. Bioinformatic approaches to metabolic pathways analysis.

    PubMed

    Maudsley, Stuart; Chadwick, Wayne; Wang, Liyun; Zhou, Yu; Martin, Bronwen; Park, Sung-Soo

    2011-01-01

    The growth and development in the last decade of accurate and reliable mass data collection techniques has greatly enhanced our comprehension of cell signaling networks and pathways. At the same time however, these technological advances have also increased the difficulty of satisfactorily analyzing and interpreting these ever-expanding datasets. At the present time, multiple diverse scientific communities including molecular biological, genetic, proteomic, bioinformatic, and cell biological, are converging upon a common endpoint, that is, the measurement, interpretation, and potential prediction of signal transduction cascade activity from mass datasets. Our ever increasing appreciation of the complexity of cellular or receptor signaling output and the structural coordination of intracellular signaling cascades has to some extent necessitated the generation of a new branch of informatics that more closely associates functional signaling effects to biological actions and even whole-animal phenotypes. The ability to untangle and hopefully generate theoretical models of signal transduction information flow from transmembrane receptor systems to physiological and pharmacological actions may be one of the greatest advances in cell signaling science. In this overview, we shall attempt to assist the navigation into this new field of cell signaling and highlight several methodologies and technologies to appreciate this exciting new age of signal transduction.

  11. Legal issues for chem-bioinformatics models.

    PubMed

    Duardo-Sanchez, Aliuska; Gonzalez-Diaz, Humberto

    2013-01-01

    Chem-Bioinformatic models connect the chemical structure of drugs and/or targets (protein, gene, RNA, microorganism, tissue, disease...) with drug biological activity over this target. On the other hand, a systematic judicial framework is needed to provide appropriate and relevant guidance for addressing various computing techniques as applied to scientific research in biosciences frontiers. This article reviews both the use of the predictions made with models for regulatory purposes and how to protect (in legal terms) the models of molecular systems per se, and the software used to seek them. First we review: i) models as a tool for regulatory purposes, ii) Organizations Involved with Validation of models, iii) Regulatory Guidelines and Documents for models, iv) Models for Human Health and Environmental Endpoint, and v) Difficulties in the Validation of models, and other issues. Next, we focus on the legal protection of models and software, including a short summary of topics and methods for legal protection of computer software. We close the review with a section that treats the taxes in software use.

  12. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    NASA Astrophysics Data System (ADS)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. The potential of these resources should be explored to discover new drugs for human welfare. However, bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a bioinformatics-based methodology for screening the potential of natural resources. The method is based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with a well-known taxon containing the active compound, using published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell were examined to determine the mechanism of action of the active compounds at the cellular level, as well as to predict side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anti-cancer agent from a marine invertebrate by employing the method. The anti-cancer agent was explored based on the isolated compounds of the marine invertebrate from published articles and databases; we then identified the protein target, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate, and then examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity as a source of drugs and another

  13. Data structure of search & compare (S&C) reservation protocol

    NASA Astrophysics Data System (ADS)

    Markovič, Miroslav; Dubovan, Jozef; Dado, Milan; Benedikovič, Daniel; Litvík, Ján.

    2012-01-01

    At present, the most widely used technology in core networks is wavelength-division multiplexing (WDM), which saves a lot of optical fiber bandwidth. However, in each node all optical signals must be converted into the electrical domain, processed and converted back into the optical domain. As a result, the data spend a lot of time in the node, which decreases the total available bandwidth of the optical network. One consequence is that WDM nodes are built as hybrid systems of switching and control. With out-of-band signaling it is simpler to separate the control header from the data. For effective control and transmission of data over optical networks, reservation protocols are needed in WDM/OBS [4,5]. Many protocols exist in today's networks, each with its own advantages and disadvantages. For our investigation we chose the reservation protocol called Search & Compare (S&C) [1], because it uses parallel-segment based and parallel link reservation. The data structure will be designed with respect to the wavelength of the transmission channels, the length of the optical burst, the source and group addresses in the segment, the number of nodes and the total time needed for switching. The protocol structure will contain all of the control messages necessary for reserving a path along all segments. The design of the protocol follows the ITU-T recommendations [2,3].
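    The fields listed in the abstract above could be grouped into a single record, as in the minimal sketch below. This is not the authors' message format: the class name, field names, units and the hold-time formula are all hypothetical and serve only to show how the listed quantities might sit together in one control message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReservationRequest:
    """Hypothetical control message; every field name and unit is an assumption."""
    wavelength_nm: float      # wavelength of the requested transmission channel
    burst_length_us: float    # length (duration) of the optical burst
    source_address: int       # address of the node requesting the reservation
    group_address: int        # address of the segment (group of nodes)
    node_count: int           # number of nodes in the segment
    switching_time_us: float  # total time needed for switching

    def hold_time_us(self) -> float:
        # Illustrative assumption: the channel is held for the switching
        # time plus the duration of the burst itself.
        return self.switching_time_us + self.burst_length_us


if __name__ == "__main__":
    request = ReservationRequest(
        wavelength_nm=1550.12,
        burst_length_us=100.0,
        source_address=1,
        group_address=7,
        node_count=4,
        switching_time_us=10.0,
    )
    print(request.hold_time_us())
```

    Freezing the dataclass keeps a request immutable once issued, which is one plausible choice when the same message is forwarded along several segments.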

  14. Protein structure prediction provides comparable performance to crystallographic structures in docking-based virtual screening.

    PubMed

    Du, Hongying; Brender, Jeffrey R; Zhang, Jian; Zhang, Yang

    2015-01-01

    Structure based virtual screening has largely been limited to protein targets for which either an experimental structure is available or a strongly homologous template exists so that a high-resolution model can be constructed. The performance of state of the art protein structure predictions in virtual screening in systems where only weakly homologous templates are available is largely untested. Using the challenging DUD database of structural decoys, we show here that even using templates with only weak sequence homology (<30% sequence identity) structural models can be constructed by I-TASSER which achieve comparable enrichment rates to using the experimental bound crystal structure in the majority of the cases studied. For 65% of the targets, the I-TASSER models, which are constructed essentially in the apo conformations, reached 70% of the virtual screening performance of using the holo-crystal structures. A correlation was observed between the success of I-TASSER in modeling the global fold and local structures in the binding pockets of the proteins versus the relative success in virtual screening. The virtual screening performance can be further improved by the recognition of chemical features of the ligand compounds. These results suggest that the combination of structure-based docking and advanced protein structure modeling methods should be a valuable approach to the large-scale drug screening and discovery studies, especially for the proteins lacking crystallographic structures.
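    The abstract compares enrichment obtained with predicted models and with crystal structures but does not say how enrichment was computed. A common choice is the enrichment factor in the top-ranked fraction of a screened library; the sketch below assumes that metric, assumes that lower docking scores are better, and uses a hypothetical function name.

```python
from typing import Sequence


def enrichment_factor(scores: Sequence[float],
                      is_active: Sequence[bool],
                      top_fraction: float = 0.01) -> float:
    """Enrichment factor: active rate in the top fraction / active rate overall.

    Assumption: lower scores rank better (typical for docking energies).
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    total_actives = sum(1 for flag in is_active if flag)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")

    # Rank compounds from best (lowest) to worst (highest) score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_top = max(1, int(round(len(ranked) * top_fraction)))
    actives_in_top = sum(1 for _, flag in ranked[:n_top] if flag)

    top_rate = actives_in_top / n_top
    overall_rate = total_actives / len(ranked)
    return top_rate / overall_rate
```

    Under these assumptions, the relative performance of a predicted model could be expressed as the ratio of its enrichment factor to that obtained with the crystal structure for the same target and library.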

  15. Non-structural carbohydrates in woody plants compared among laboratories.

    PubMed

    Quentin, Audrey G; Pinkard, Elizabeth A; Ryan, Michael G; Tissue, David T; Baggett, L Scott; Adams, Henry D; Maillard, Pascale; Marchand, Jacqueline; Landhäusser, Simon M; Lacointe, André; Gibon, Yves; Anderegg, William R L; Asao, Shinichi; Atkin, Owen K; Bonhomme, Marc; Claye, Caroline; Chow, Pak S; Clément-Vidal, Anne; Davies, Noel W; Dickman, L Turin; Dumbur, Rita; Ellsworth, David S; Falk, Kristen; Galiano, Lucía; Grünzweig, José M; Hartmann, Henrik; Hoch, Günter; Hood, Sharon; Jones, Joanna E; Koike, Takayoshi; Kuhlmann, Iris; Lloret, Francisco; Maestro, Melchor; Mansfield, Shawn D; Martínez-Vilalta, Jordi; Maucourt, Mickael; McDowell, Nathan G; Moing, Annick; Muller, Bertrand; Nebauer, Sergio G; Niinemets, Ülo; Palacio, Sara; Piper, Frida; Raveh, Eran; Richter, Andreas; Rolland, Gaëlle; Rosas, Teresa; Saint Joanis, Brigitte; Sala, Anna; Smith, Renee A; Sterck, Frank; Stinziano, Joseph R; Tobias, Mari; Unda, Faride; Watanabe, Makoto; Way, Danielle A; Weerasinghe, Lasantha K; Wild, Birgit; Wiley, Erin; Woodruff, David R

    2015-11-01

    Non-structural carbohydrates (NSC) in plant tissue are frequently quantified to make inferences about plant responses to environmental conditions. Laboratories publishing estimates of NSC of woody plants use many different methods to evaluate NSC. We asked whether NSC estimates in the recent literature could be quantitatively compared among studies. We also asked whether any differences among laboratories were related to the extraction and quantification methods used to determine starch and sugar concentrations. These questions were addressed by sending sub-samples collected from five woody plant tissues, which varied in NSC content and chemical composition, to 29 laboratories. Each laboratory analyzed the samples with their laboratory-specific protocols, based on recent publications, to determine concentrations of soluble sugars, starch and their sum, total NSC. Laboratory estimates differed substantially for all samples. For example, estimates for Eucalyptus globulus leaves (EGL) varied from 23 to 116 (mean = 56) mg g(-1) for soluble sugars, 6-533 (mean = 94) mg g(-1) for starch and 53-649 (mean = 153) mg g(-1) for total NSC. Mixed model analysis of variance showed that much of the variability among laboratories was unrelated to the categories we used for extraction and quantification methods (method category R(2) = 0.05-0.12 for soluble sugars, 0.10-0.33 for starch and 0.01-0.09 for total NSC). For EGL, the difference between the highest and lowest least squares means for categories in the mixed model analysis was 33 mg g(-1) for total NSC, compared with the range of laboratory estimates of 596 mg g(-1). Laboratories were reasonably consistent in their ranks of estimates among tissues for starch (r = 0.41-0.91), but less so for total NSC (r = 0.45-0.84) and soluble sugars (r = 0.11-0.83). Our results show that NSC estimates for woody plant tissues cannot be compared among laboratories. 
The relative changes in NSC between treatments measured within a laboratory

  16. Regulatory bioinformatics for food and drug safety.

    PubMed

    Healy, Marion J; Tong, Weida; Ostroff, Stephen; Eichler, Hans-Georg; Patak, Alex; Neuspiel, Margaret; Deluyker, Hubert; Slikker, William

    2016-10-01

    "Regulatory Bioinformatics" strives to develop and implement a standardized and transparent bioinformatic framework to support the implementation of existing and emerging technologies in regulatory decision-making. It has great potential to improve public health through the development and use of clinically important medical products and tools to manage the safety of the food supply. However, the application of regulatory bioinformatics also poses new challenges and requires new knowledge and skill sets. In the latest Global Coalition on Regulatory Science Research (GCRSR) governed conference, Global Summit on Regulatory Science (GSRS2015), regulatory bioinformatics principles were presented with respect to global trends, initiatives and case studies. The discussion revealed that datasets, analytical tools, skills and expertise are rapidly developing, in many cases via large international collaborative consortia. It also revealed that significant research is still required to realize the potential applications of regulatory bioinformatics. While there is significant excitement in the possibilities offered by precision medicine to enhance treatments of serious and/or complex diseases, there is a clear need for further development of mechanisms to securely store, curate and share data, integrate databases, and standardized quality control and data analysis procedures. A greater understanding of the biological significance of the data is also required to fully exploit vast datasets that are becoming available. The application of bioinformatics in the microbiological risk analysis paradigm is delivering clear benefits both for the investigation of food borne pathogens and for decision making on clinically important treatments. It is recognized that regulatory bioinformatics will have many beneficial applications by ensuring high quality data, validated tools and standardized processes, which will help inform the regulatory science community of the requirements

  17. [Post-translational modification (PTM) bioinformatics in China: progresses and perspectives].

    PubMed

    Zexian, Liu; Yudong, Cai; Xuejiang, Guo; Ao, Li; Tingting, Li; Jianding, Qiu; Jian, Ren; Shaoping, Shi; Jiangning, Song; Minghui, Wang; Lu, Xie; Yu, Xue; Ziding, Zhang; Xingming, Zhao

    2015-07-01

    Post-translational modifications (PTMs) are essential for regulating conformational changes, activities and functions of proteins, and are involved in almost all cellular pathways and processes. Identification of protein PTMs is the basis for understanding cellular and molecular mechanisms. In contrast with labor-intensive and time-consuming experiments, PTM prediction using various bioinformatics approaches can provide accurate, convenient, and efficient strategies and generate valuable information for further experimental consideration. In this review, we summarize the current progress made by Chinese bioinformaticians in the field of PTM bioinformatics, including the design and improvement of computational algorithms for predicting PTM substrates and sites, design and maintenance of online and offline tools, establishment of PTM-related databases and resources, and bioinformatics analysis of PTM proteomics data. By comparing similar studies in China and other countries, we demonstrate both the advantages and limitations of current PTM bioinformatics as well as perspectives for future studies in China.

  18. Bioinformatics process management: information flow via a computational journal

    PubMed Central

    Feagan, Lance; Rohrer, Justin; Garrett, Alexander; Amthauer, Heather; Komp, Ed; Johnson, David; Hock, Adam; Clark, Terry; Lushington, Gerald; Minden, Gary; Frost, Victor

    2007-01-01

    This paper presents the Bioinformatics Computational Journal (BCJ), a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread–ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples. PMID:18053179

  19. Structure, function and evolution of the gas exchangers: comparative perspectives

    PubMed Central

    Maina, JN

    2002-01-01

    Over the evolutionary continuum, animals have faced similar fundamental challenges of acquiring molecular oxygen for aerobic metabolism. Under limitations and constraints imposed by factors such as phylogeny, behaviour, body size and environment, they have responded differently in founding optimal respiratory structures. A quintessence of the aphorism that ‘necessity is the mother of invention’, gas exchangers have been inaugurated through stiff cost–benefit analyses that have evoked transaction of trade-offs and compromises. Cogent structural–functional correlations occur in constructions of gas exchangers: within and between taxa, morphological complexity and respiratory efficiency increase with metabolic capacities and oxygen needs. Highly active, small endotherms have relatively better-refined gas exchangers compared with large, inactive ectotherms. Respiratory structures have developed from the plain cell membrane of the primeval prokaryotic unicells to complex multifunctional ones of the modern Metazoa. Regarding the respiratory medium used to extract oxygen from, animal life has had only two choices – water or air – within the biological range of temperature and pressure the only naturally occurring respirable fluids. In rarer cases, certain animals have adapted to using both media. Gills (evaginated gas exchangers) are the primordial respiratory organs: they are the archetypal water breathing organs. Lungs (invaginated gas exchangers) are the model air breathing organs. Bimodal (transitional) breathers occupy the water–air interface. Presentation and exposure of external (water/air) and internal (haemolymph/blood) respiratory media, features determined by geometric arrangement of the conduits, are important features for gas exchange efficiency: counter-current, cross-current, uniform pool and infinite pool designs have variably developed. PMID:12430953

  20. Study on the Response Coefficient of Setback Structures Compared to Regular Moment Frame Structures

    SciTech Connect

    Mirghaderi, S. Rasoul; Khafaf, Bardia; Epackachi, Siamak

    2008-07-08

    In the design practice of many countries, seismic analysis and proportioning of structures are usually based on linear elastic analysis, using seismic forces reduced by the response coefficient, R. Setback structures are among the most popular shapes of constructed buildings. In setback structures, the shape and proportions of the building have a major effect on the distribution of earthquake forces as they work their way through the building. On the other hand, geometric configuration has a profound effect on the structural-dynamic response of a building. Therefore, when a building has irregular features, such as asymmetry in height or vertical discontinuity, the traditional assumptions used in the development of seismic criteria for regular buildings may not be applicable. The inelastic seismic behavior of these types of structures seems to be quite different from that of regular steel moment-resisting structures, in which the overall ductility is localized at beam ends. In order to investigate the seismic behavior and estimate the Response Coefficient of those structures, nonlinear static (pushover) analyses are used for three categories of setback structures, namely low-rise, medium-rise and high-rise buildings with different setbacks along their height. The Response Coefficients are calculated and compared with those of regular moment frame structures.

  1. BioZone Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    SciTech Connect

    Liu, L; Buttler, D; Paques, H; Pu, C; Critchlow

    2002-01-28

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.

  2. BioZoom: Exploiting Source-Capability Information for Integrated Access to Multiple Bioinformatics Data Sources

    SciTech Connect

    Liu, L; Buttler, D; Critchlow, T J; Han, W; Paques, H; Pu, C; Rocco, D

    2003-01-09

    Modern Bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly and yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, protein structures, etc.) have led to a growing problem that conventional data management systems do not have, namely finding which information sources out of many candidate choices are the most relevant and most accessible to answer a given user query. We refer to this problem as the query routing problem. In this paper we introduce the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source-capability profiles independently, and to provide algorithms that can dynamically discover relevant information sources for a given query through the smart use of source profiles. Compared to the keyword-based indexing techniques adopted in most of the search engines and software, our approach offers fine-granularity of interest matching, thus it is more powerful and effective for handling queries with complex conditions.
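    The query routing idea described above, pruning candidate sources level by level against independently maintained source-capability profiles, can be sketched as follows. The abstract does not give the profile schema or the pruning levels, so the class names, fields and the three levels shown here are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical source-capability profile, maintained independently of queries."""
    name: str
    data_types: FrozenSet[str]   # kinds of data the source holds
    operations: FrozenSet[str]   # query operations the source supports
    available: bool = True       # whether the source is currently reachable


@dataclass(frozen=True)
class Query:
    data_type: str
    operations: FrozenSet[str]   # operations the query needs


def route(query: Query, profiles: List[SourceProfile]) -> List[str]:
    """Return names of sources that survive every pruning level."""
    # Level 1: drop sources that do not hold the requested kind of data.
    candidates = [p for p in profiles if query.data_type in p.data_types]
    # Level 2: drop sources that lack any operation the query needs.
    candidates = [p for p in candidates if query.operations <= p.operations]
    # Level 3: drop sources that are not currently accessible.
    candidates = [p for p in candidates if p.available]
    return [p.name for p in candidates]


if __name__ == "__main__":
    profiles = [
        SourceProfile("source_a", frozenset({"protein_sequence"}),
                      frozenset({"homology_search", "keyword_search"})),
        SourceProfile("source_b", frozenset({"protein_structure"}),
                      frozenset({"keyword_search"})),
        SourceProfile("source_c", frozenset({"protein_sequence"}),
                      frozenset({"homology_search"}), available=False),
    ]
    query = Query("protein_sequence", frozenset({"homology_search"}))
    print(route(query, profiles))
```

    Ordering the levels from cheapest to most expensive check is one reasonable design choice, since each level shrinks the candidate list that later levels must examine.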

  3. Penalized feature selection and classification in bioinformatics

    PubMed Central

    Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifiers and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classification techniques—which belong to the family of embedded feature selection methods—for bioinformatics studies with high-dimensional input. Classification objective functions, penalty functions and computational algorithms are discussed. Our goal is to make interested researchers aware of these feature selection and classification methods that are applicable to high-dimensional bioinformatics data. PMID:18562478
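    As one illustration of an embedded, penalized method, the sketch below fits a logistic classifier with an L1 (lasso) penalty by proximal gradient descent, so that feature selection happens during fitting. The abstract does not name the specific penalties or algorithms reviewed; the L1 penalty, the solver, the absence of an intercept, and all function and parameter names are assumptions made for this example.

```python
import math
from typing import List


def soft_threshold(value: float, threshold: float) -> float:
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(features: List[List[float]],
                    labels: List[int],
                    penalty: float,
                    step: float = 0.5,
                    iterations: int = 500) -> List[float]:
    """L1-penalized logistic regression (no intercept) via proximal gradient."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    for _ in range(iterations):
        # Gradient of the average logistic loss at the current weights.
        gradient = [0.0] * n_features
        for row, label in zip(features, labels):
            score = sum(w * x for w, x in zip(weights, row))
            probability = 1.0 / (1.0 + math.exp(-score))
            for j in range(n_features):
                gradient[j] += (probability - label) * row[j] / n_samples
        # Gradient step, then soft-thresholding, which can set weights to zero.
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, gradient)]
    return weights


if __name__ == "__main__":
    # Toy data: feature 0 tracks the label exactly, feature 1 only weakly.
    features = [[1, 1], [1, 1], [1, 1], [1, -1],
                [-1, -1], [-1, -1], [-1, 1], [-1, 1]]
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    print(fit_l1_logistic(features, labels, penalty=0.2))
```

    On this toy data the weight of the weakly informative feature stays at exactly zero while the informative feature receives a positive weight, which is the selection behaviour that makes such penalties attractive for high-dimensional input.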

  4. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    PubMed

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  5. BioShaDock: a community driven bioinformatics shared Docker-based tools registry

    PubMed Central

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community. PMID:26913191

  6. The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics

    PubMed Central

    Greenhill, Simon J.; Blust, Robert; Gray, Russell D.

    2008-01-01

    Phylogenetic methods have revolutionised evolutionary biology and have recently been applied to studies of linguistic and cultural evolution. However, the basic comparative data on the languages of the world required for these analyses is often widely dispersed in hard to obtain sources. Here we outline how our Austronesian Basic Vocabulary Database (ABVD) helps remedy this situation by collating wordlists from over 500 languages into one web-accessible database. We describe the technology underlying the ABVD and discuss the benefits that an evolutionary bioinformatic approach can provide. These include facilitating computational comparative linguistic research, answering questions about human prehistory, enabling syntheses with genetic data, and safe-guarding fragile linguistic information. PMID:19204825

  7. Bioinformatic Analysis of GJB2 Gene Missense Mutations.

    PubMed

    Yilmaz, Akin

    2015-04-01

    The gap junction beta 2 (GJB2) gene is the most commonly mutated connexin gene in patients with autosomal recessive and dominant hearing loss. According to the Ensembl (release 74) database, 1347 sequence variations are reported in the GJB2 gene and about 13.5% of them are categorized as missense SNPs or nonsynonymous variants. Because of the high incidence of GJB2 mutations in hearing loss patients, revealing the molecular effect of GJB2 mutations on protein structure may also provide a clear point of view regarding the molecular etiology of deafness. Hence, the aim of this study is to analyze the structural and functional consequences of all known GJB2 missense variations for the Cx26 protein by applying multiple bioinformatics methods. Two hundred and eleven nonsynonymous variants were collected from Ensembl release 74, the Leiden Open Variation Database (LOVD) and The Human Gene Mutation Database (HGMD). A number of bioinformatic tools were utilized for predicting the effect of GJB2 missense mutations at the sequence, structural, and functional levels. Some of the mutations were found to be located in highly conserved regions and to have structural and functional properties. Moreover, GJB2 mutations were also found to affect the Cx26 protein at the molecular level via loss or gain of disorder, catalytic sites, and post-translational modifications, including methylation, glycosylation, and ubiquitination. The findings presented here demonstrate the application of bioinformatic algorithms to predict the effects of mutations causing hearing impairment. I expect this type of analysis will serve as a starting point for future experimental evaluation of GJB2 gene mutations, and it will also be helpful in evaluating other deafness-related gene mutations.

  8. Rabifier2: an improved bioinformatic classifier of Rab GTPases.

    PubMed

    Surkont, Jaroslaw; Diekmann, Yoan; Pereira-Leal, José B

    2016-10-22

    The Rab family of small GTPases regulates and provides specificity to the endomembrane trafficking system; each Rab subfamily is associated with specific pathways. Thus, characterization of Rab repertoires provides functional information about organisms and evolution of the eukaryotic cell. Yet, the complex structure of the Rab family limits the application of existing methods for protein classification. Here, we present a major redesign of the Rabifier, a bioinformatic pipeline for detection and classification of Rab GTPases. It is more accurate, significantly faster than the original version and is now open source, both the code and the data, allowing for community participation.

  9. Bioinformatics: A History of Evolution "In Silico"

    ERIC Educational Resources Information Center

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  10. Medical informatics and bioinformatics: a bibliometric study

    PubMed Central

    Bansard, Jean-Yves; Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Beltrame, Francesco; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Tollis, Ioannis; Van der Lei, Johan; Coatrieux, Jean-Louis

    2007-01-01

    This paper reports on an analysis of the bioinformatics and medical informatics literature with the objective to identify upcoming trends that are shared among both research fields to derive benefits from potential collaborative initiatives for their future. Our results present the main characteristics of the two fields and show that these domains are still relatively separated. PMID:17521073

  11. "Extreme Programming" in a Bioinformatics Class

    ERIC Educational Resources Information Center

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  12. Implementing bioinformatic workflows within the bioextract server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  13. Bioinformatics in Undergraduate Education: Practical Examples

    ERIC Educational Resources Information Center

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  14. Bioboxes: standardised containers for interchangeable bioinformatics software.

    PubMed

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.

  15. KDE Bioscience: platform for bioinformatics analysis workflows.

    PubMed

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards, combined with unfriendly interfaces, makes it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a Java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or third-party viewers are built in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode, where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism, other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research.

  16. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  17. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    EPA Science Inventory

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  18. Structural realism versus deployment realism: A comparative evaluation.

    PubMed

    Lyons, Timothy D

    2016-10-01

    In this paper I challenge and adjudicate between the two positions that have come to prominence in the scientific realism debate: deployment realism and structural realism. I discuss a set of cases from the history of celestial mechanics, including some of the most important successes in the history of science. To the surprise of the deployment realist, these are novel predictive successes toward which theoretical constituents that are now seen to be patently false were genuinely deployed. Exploring the implications for structural realism, I show that the need to accommodate these cases forces our notion of "structure" toward a dramatic depletion of logical content, threatening to render it explanatorily vacuous: the better structuralism fares against these historical examples, in terms of retention, the worse it fares in content and explanatory strength. I conclude by considering recent restrictions that serve to make "structure" more specific. I show however that these refinements will not suffice: the better structuralism fares in specificity and explanatory strength, the worse it fares against history. In light of these case studies, both deployment realism and structural realism are significantly threatened by the very historical challenge they were introduced to answer.

  19. Navigating the changing learning landscape: perspective from bioinformatics.ca

    PubMed Central

    Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  20. Agile parallel bioinformatics workflow management using Pwrake

    PubMed Central

    2011-01-01

    Background: In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings: We implemented the Pwrake workflows to process next generation sequencing data using the Genome Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions: Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows.
Furthermore, readability
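
    The separation of a workflow definition phase from a parameter adjustment phase, described in the abstract above, can be illustrated with a minimal sketch. This is not Pwrake or Rake code (those are Ruby tools); it is a Python analogy built on the standard library only. The task names, the `min_quality` parameter, and the `run_task` stand-in are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Workflow definition: each task maps to the set of tasks it depends on.
# Task names are hypothetical placeholders, not real GATK or Dindel steps.
DEPENDENCIES = {
    "align_sample1": set(),
    "align_sample2": set(),
    "call_variants": {"align_sample1", "align_sample2"},
}

# Parameter adjustment: values tuned between runs without editing the definition.
PARAMS = {"min_quality": 20}


def run_task(name, params):
    """Stand-in for launching an external tool; returns a description of the call."""
    return f"{name}(min_quality={params['min_quality']})"


def run_workflow(dependencies, params, workers=2):
    """Run tasks in dependency order; tasks that are ready together run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    log = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())  # all tasks whose inputs are done
            log.extend(pool.map(lambda task: run_task(task, params), ready))
            sorter.done(*ready)
    return log


if __name__ == "__main__":
    for line in run_workflow(DEPENDENCIES, PARAMS):
        print(line)
```

    Re-running with a different parameter dictionary, for example `run_workflow(DEPENDENCIES, {"min_quality": 30})`, changes the tool settings while leaving the dependency graph untouched, which is the kind of iteration the two-phase split is meant to make cheap.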

  1. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Xiang, Fang; Ningqiu, Li; Xiaozhe, Fu; Kaibin, Li; Qiang, Lin; Lihui, Liu; Cunbin, Shi; Shuqin, Wu

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules, including genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.

  2. Comparing two tetraalkylammonium ionic liquids. I. Liquid phase structure

    NASA Astrophysics Data System (ADS)

    Lima, Thamires A.; Paschoal, Vitor H.; Faria, Luiz F. O.; Ribeiro, Mauro C. C.; Giles, Carlos

    2016-06-01

    X-ray scattering experiments at room temperature were performed for the ionic liquids n-butyl-trimethylammonium bis(trifluoromethanesulfonyl)imide, [N1114][NTf2], and methyl-tributylammonium bis(trifluoromethanesulfonyl)imide, [N1444][NTf2]. The peak in the diffraction data characteristic of charge ordering in [N1444][NTf2] is shifted to longer distances in comparison to [N1114][NTf2], but the peak characteristic of short-range correlations is shifted in [N1444][NTf2] to shorter distances. Molecular dynamics (MD) simulations were performed for these ionic liquids using force fields available from the literature, although with new sets of partial charges for [N1114]+ and [N1444]+ proposed in this work. The shifting of charge and adjacency peaks to opposite directions in these ionic liquids was found in the static structure factor, S(k), calculated by MD simulations. Despite differences in cation sizes, the MD simulations unravel that anions are allowed as close to [N1444]+ as to [N1114]+ because anions are located in between the angle formed by the butyl chains. The more asymmetric molecular structure of the [N1114]+ cation implies differences in partial structure factors calculated for atoms belonging to polar or non-polar parts of [N1114][NTf2], whereas polar and non-polar structure factors are essentially the same in [N1444][NTf2]. Results of this work shed light on controversies in the literature on the liquid structure of tetraalkylammonium based ionic liquids.

  3. Comparing two tetraalkylammonium ionic liquids. I. Liquid phase structure.

    PubMed

    Lima, Thamires A; Paschoal, Vitor H; Faria, Luiz F O; Ribeiro, Mauro C C; Giles, Carlos

    2016-06-14

    X-ray scattering experiments at room temperature were performed for the ionic liquids n-butyl-trimethylammonium bis(trifluoromethanesulfonyl)imide, [N1114][NTf2], and methyl-tributylammonium bis(trifluoromethanesulfonyl)imide, [N1444][NTf2]. The peak in the diffraction data characteristic of charge ordering in [N1444][NTf2] is shifted to longer distances in comparison to [N1114][NTf2], but the peak characteristic of short-range correlations is shifted in [N1444][NTf2] to shorter distances. Molecular dynamics (MD) simulations were performed for these ionic liquids using force fields available from the literature, although with new sets of partial charges for [N1114]+ and [N1444]+ proposed in this work. The shifting of charge and adjacency peaks to opposite directions in these ionic liquids was found in the static structure factor, S(k), calculated by MD simulations. Despite differences in cation sizes, the MD simulations unravel that anions are allowed as close to [N1444]+ as to [N1114]+ because anions are located in between the angle formed by the butyl chains. The more asymmetric molecular structure of the [N1114]+ cation implies differences in partial structure factors calculated for atoms belonging to polar or non-polar parts of [N1114][NTf2], whereas polar and non-polar structure factors are essentially the same in [N1444][NTf2]. Results of this work shed light on controversies in the literature on the liquid structure of tetraalkylammonium based ionic liquids.

  4. Component-Based Approach for Educating Students in Bioinformatics

    ERIC Educational Resources Information Center

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  5. How do disordered regions achieve comparable functions to structured domains?

    PubMed

    Latysheva, Natasha S; Flock, Tilman; Weatheritt, Robert J; Chavali, Sreenivas; Babu, M Madan

    2015-06-01

    The traditional structure to function paradigm conceives of a protein's function as emerging from its structure. In recent years, it has been established that unstructured, intrinsically disordered regions (IDRs) in proteins are equally crucial elements for protein function, regulation and homeostasis. In this review, we provide a brief overview of how IDRs can perform similar functions to structured proteins, focusing especially on the formation of protein complexes and assemblies and the mediation of regulated conformational changes. In addition to highlighting instances of such functional equivalence, we explain how differences in the biological and physicochemical properties of IDRs allow them to expand the functional and regulatory repertoire of proteins. We also discuss studies that provide insights into how mutations within functional regions of IDRs can lead to human diseases.

  6. Determining protein similarity by comparing hydrophobic core structure.

    PubMed

    Gadzała, M; Kalinowska, B; Banach, M; Konieczny, L; Roterman, I

    2017-02-01

    Formal assessment of structural similarity is - next to protein structure prediction - arguably the most important unsolved problem in proteomics. In this paper we propose a similarity criterion based on commonalities between the proteins' hydrophobic cores. The hydrophobic core emerges as a result of conformational changes through which each residue reaches its intended position in the protein body. A quantitative criterion based on this phenomenon has been proposed in the framework of the CASP challenge. The structure of the hydrophobic core - including the placement and scope of any deviations from the idealized model - may indirectly point to areas of importance from the point of view of the protein's biological function. Our analysis focuses on an arbitrarily selected target from the CASP11 challenge. The proposed measure, while compliant with CASP criteria (70-80% correlation), involves certain adjustments which acknowledge the presence of factors other than simple spatial arrangement of solids.

  7. Comparative Analysis on Time Series with Included Structural Break

    NASA Astrophysics Data System (ADS)

    Andreeski, Cvetko J.; Vasant, Pandian

    2009-08-01

    Time series analysis with ARIMA models is a good approach for identifying time series. However, if the series contains a structural break, a single model cannot describe the whole series. Furthermore, if there are not enough data between two structural breaks, it is impossible to build valid models for the series. This paper explores the possibility of identifying the dynamics of the inflation process from a system-theoretic perspective, by means of both Box-Jenkins ARIMA methodology and artificial neural networks.
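
    The abstract's point that one model cannot span a structural break can be seen in a small sketch. The example below is an illustration under simplifying assumptions, not the authors' method: a first-order autoregression, x[t] = c + phi * x[t-1], stands in for a full ARIMA model, the data are synthetic and noise-free, and the break position is taken as known. All names and numbers are hypothetical.

```python
def fit_ar1(series):
    """Ordinary least squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def simulate(c, phi, start, n):
    """Generate n noise-free values of the recursion x[t] = c + phi * x[t-1]."""
    values = []
    current = start
    for _ in range(n):
        current = c + phi * current
        values.append(current)
    return values


# Synthetic series: the intercept jumps from 1.0 to 5.0 at index 20 (the break).
BREAK = 20
before = simulate(1.0, 0.5, 0.0, BREAK)
after = simulate(5.0, 0.5, before[-1], 20)
series = before + after

pooled_c, pooled_phi = fit_ar1(series)        # one model across the break
c1, phi1 = fit_ar1(series[:BREAK])            # model for the first regime
c2, phi2 = fit_ar1(series[BREAK:])            # model for the second regime

if __name__ == "__main__":
    print("pooled:", pooled_c, pooled_phi)
    print("segment 1:", c1, phi1)
    print("segment 2:", c2, phi2)
```

    Each per-segment fit recovers the coefficients used to generate that regime, while the pooled fit reads the level shift as strong persistence and reports a coefficient far from 0.5. With short segments and noisy data the per-segment fits would also become unreliable, which is the second difficulty the abstract raises.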

  8. Comparative Effectiveness of Contextual and Structural Method of Teaching Vocabulary

    ERIC Educational Resources Information Center

    Behlol, Malik; Kaini, Mohammad Munir

    2011-01-01

    The study was conducted to determine the effectiveness of the contextual and structural methods of teaching vocabulary in English at the secondary level. It was an experimental study in which the pretest-posttest design was used. The population of the study was the students of secondary classes studying in Government secondary schools of Rawalpindi District.…

  9. Structural and Social Psychological Correlates of Prisonization: A Comparative Analysis.

    ERIC Educational Resources Information Center

    Thomas, Charles W.; And Others

    This study considers some aspects of "prisonization," or the process by which inmates adapt to confinement. Specifically, it further examines two ideas suggested by earlier studies. One is the belief that the structural characteristics of many prisons promote rather than inhibit assimilation into an inmate normative system that is opposed to the…

  10. Comparative structural biology of eubacterial and archaeal oligosaccharyltransferases.

    PubMed

    Maita, Nobuo; Nyirenda, James; Igura, Mayumi; Kamishikiryo, Jun; Kohda, Daisuke

    2010-02-12

    Oligosaccharyltransferase (OST) catalyzes the transfer of an oligosaccharide from a lipid donor to an asparagine residue in nascent polypeptide chains. In the bacterium Campylobacter jejuni, a single-subunit membrane protein, PglB, catalyzes N-glycosylation. We report the 2.8 Å resolution crystal structure of the C-terminal globular domain of PglB and its comparison with the previously determined structure from the archaeon Pyrococcus AglB. The two distantly related oligosaccharyltransferases share unexpected structural similarity beyond that expected from the sequence comparison. The common architecture of the putative catalytic sites revealed a new catalytic motif in PglB. Site-directed mutagenesis analyses confirmed the contribution of this motif to the catalytic function. Bacterial PglB and archaeal AglB constitute a protein family of the catalytic subunit of OST along with STT3 from eukaryotes. A structure-aided multiple sequence alignment of the STT3/PglB/AglB protein family revealed three types of OST catalytic centers. This novel classification will provide a useful framework for understanding the enzymatic properties of the OST enzymes from Eukarya, Archaea, and Bacteria.

  11. Comparative static curing versus dynamic curing on tablet coating structures.

    PubMed

    Gendre, Claire; Genty, Muriel; Fayard, Barbara; Tfayli, Ali; Boiret, Mathieu; Lecoq, Olivier; Baron, Michel; Chaminade, Pierre; Péan, Jean Manuel

    2013-09-10

    Curing is generally required to stabilize film coating from aqueous polymer dispersion. This post-coating drying step is traditionally carried out in static conditions, requiring the transfer of solid dosage forms to an oven. However, a curing operation performed directly inside the coating equipment represents an attractive industrial application. Recently, the use of various advanced physico-chemical characterization techniques, i.e., X-ray micro-computed tomography, vibrational spectroscopies (near infrared and Raman) and X-ray microdiffraction, allowed new insights into the film-coating structures of dynamically cured tablets. Dynamic curing end-point was efficiently determined after 4 h. The aim of the present work was to elucidate the influence of curing conditions on film-coating structures. Results demonstrated that 24 h of static curing and 4 h of dynamic curing, both performed at 60°C and ambient relative humidity, led to similar coating layers in terms of drug release properties, porosity, water content, structural rearrangement of polymer chains and crystalline distribution. Furthermore, X-ray microdiffraction measurements pointed out different crystalline coating compositions depending on sample storage time. An aging mechanism may have occurred during storage, resulting in the crystallization and the upward migration of cetyl alcohol, coupled to the downward migration of crystalline sodium lauryl sulfate within the coating layer. Interestingly, this new study clearly provided further knowledge into film-coating structures after a curing step and confirmed that the curing operation could be performed in dynamic conditions.

  12. Can bioinformatics help in the identification of moonlighting proteins?

    PubMed

    Hernández, Sergio; Calvo, Alejandra; Ferragut, Gabriela; Franco, Luís; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2014-12-01

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large amounts of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and it is a good complement for the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place.
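
    The co-evolutionary (mutation correlation) analysis mentioned in the abstract above looks for alignment columns whose residues change together. The abstract does not say which statistic the authors used; mutual information between two columns is one common choice and is used here purely as an illustration. The four-sequence alignment is a hypothetical toy example.

```python
from collections import Counter
from math import log2


def column(alignment, index):
    """Return the residues found at one position of an aligned set of sequences."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    total = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        total += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return total


# Toy alignment (hypothetical): positions 0 and 1 always change together,
# position 2 is fully conserved.
ALIGNMENT = ["AKL", "AKL", "GEL", "GEL"]

covarying = mutual_information(column(ALIGNMENT, 0), column(ALIGNMENT, 1))
conserved = mutual_information(column(ALIGNMENT, 0), column(ALIGNMENT, 2))

if __name__ == "__main__":
    print("positions 0 and 1:", covarying)
    print("positions 0 and 2:", conserved)
```

    A fully conserved column carries no information about any other column, so it scores zero. Real analyses need many diverse sequences before such scores are meaningful, which is consistent with the abstract's remark that the approach applies only in restricted situations.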

  13. The MPI Bioinformatics Toolkit for protein sequence analysis

    PubMed Central

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N.

    2006-01-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at . PMID:16845021

  14. The MPI Bioinformatics Toolkit for protein sequence analysis.

    PubMed

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N

    2006-07-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at http://toolkit.tuebingen.mpg.de.

  15. Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks.

    PubMed

    Pauling, Josch; Klipp, Edda

    2016-12-22

    Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a lot of structural detail. However, doing so may produce thousands of mass spectra in a single experiment which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics but there are many (combinatorial) challenges when it comes to structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside analytic, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.

  16. Virulence factor activity relationships (VFARs): a bioinformatics perspective.

    PubMed

    Waseem, Hassan; Williams, Maggie R; Stedtfeld, Tiffany; Chai, Benli; Stedtfeld, Robert D; Cole, James R; Tiedje, James M; Hashsham, Syed A

    2017-03-06

    Virulence factor activity relationships (VFARs), a concept loosely based on quantitative structure-activity relationships (QSARs) for chemicals, was proposed as a predictive tool for ranking risks due to microorganisms relevant to water safety. A rapid increase in sequencing capabilities and bioinformatics tools has significantly increased the potential for VFAR-based analyses. This review summarizes more than 20 bioinformatics databases and tools, developed over the last decade, along with their virulence and antimicrobial resistance prediction capabilities. With the number of bacterial whole genome sequences exceeding 241 000 and metagenomic analysis projects exceeding 13 000 and the ability to add additional genome sequences for a few hundred dollars, it is evident that further development of VFARs is not limited by the availability of information at least at the genomic level. However, additional information related to co-occurrence, treatment response, modulation of virulence due to environmental and other factors, and economic impact must be gathered and incorporated in a manner that also addresses the associated uncertainties. Of the bioinformatics tools, a majority are either designed exclusively for virulence/resistance determination or equipped with a dedicated module. The remaining have the potential to be employed for evaluating virulence. This review focusing broadly on omics technologies and tools supports the notion that these tools are now sufficiently developed to allow the application of VFAR approaches combined with additional engineering and economic analyses to rank and prioritize organisms important to a given niche. Knowledge gaps do exist but can be filled with focused experimental and theoretical analyses that were unimaginable a decade ago. Further developments should consider the integration of the measurement of activity, risk, and uncertainty to improve the current capabilities.

  17. Bioinformatic characterization of plant networks

    SciTech Connect

    McDermott, Jason E.; Samudrala, Ram

    2008-06-30

    Cells and organisms are governed by networks of interactions, genetic, physical and metabolic. Large-scale experimental studies of interactions between components of biological systems have been performed for a variety of eukaryotic organisms. However, there is a dearth of such data for plants. Computational methods for prediction of relationships between proteins, primarily based on comparative genomics, provide a useful systems-level view of cellular functioning and can be used to extend information about other eukaryotes to plants. We have predicted networks for Arabidopsis thaliana, Oryza sativa indica and japonica and several plant pathogens using the Bioverse (http://bioverse.compbio.washington.edu) and show that they are similar to experimentally-derived interaction networks. Predicted interaction networks for plants can be used to provide novel functional annotations and predictions about plant phenotypes and aid in rational engineering of biosynthesis pathways.

  18. Why Polyphenols have Promiscuous Actions? An Investigation by Chemical Bioinformatics.

    PubMed

    Tang, Guang-Yan

    2016-05-01

    Despite their diverse pharmacological effects, polyphenols are poor for use as drugs, which has been traditionally ascribed to their low bioavailability. However, Baell and co-workers recently proposed that the redox potential of polyphenols also plays an important role in this, because redox reactions bring promiscuous actions on various protein targets and thus produce non-specific pharmacological effects. To investigate whether the redox reactivity behaves as a critical factor in polyphenol promiscuity, we performed a chemical bioinformatics analysis on the structure-activity relationships of twenty polyphenols. It was found that the gene expression profiles of human cell lines induced by polyphenols were not correlated with the presence or absence of redox moieties in the polyphenols, but significantly correlated with their molecular structures. Therefore, it is concluded that the promiscuous actions of polyphenols are likely to result from their inherent structural features rather than their redox potential.

  19. Machine learning: an indispensable tool in bioinformatics.

    PubMed

    Inza, Iñaki; Calvo, Borja; Armañanzas, Rubén; Bengoetxea, Endika; Larrañaga, Pedro; Lozano, José A

    2010-01-01

    The increase in the number and complexity of biological databases has raised the need for modern and powerful data analysis tools and techniques. In order to fulfill these requirements, the machine learning discipline has become an everyday tool in bio-laboratories. The use of machine learning techniques has been extended to a wide spectrum of bioinformatics applications. It is broadly used to investigate the underlying mechanisms and interactions between biological molecules in many diseases, and it is an essential tool in any biomarker discovery process. In this chapter, we provide a basic taxonomy of machine learning algorithms, and the characteristics of main data preprocessing, supervised classification, and clustering techniques are shown. Feature selection, classifier evaluation, and two supervised classification topics that have a deep impact on current bioinformatics are presented. We make the interested reader aware of a set of popular web resources, open source software tools, and benchmarking data repositories that are frequently used by the machine learning community.

  20. A toolbox for developing bioinformatics software

    PubMed Central

    Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.

    2012-01-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  1. Discovery and Classification of Bioinformatics Web Services

    SciTech Connect

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  2. A toolbox for developing bioinformatics software.

    PubMed

    Rother, Kristian; Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M

    2012-03-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers.

  3. Translational bioinformatics applications in genome medicine

    PubMed Central

    2009-01-01

    Although investigators using methodologies in bioinformatics have always been useful in genomic experimentation in analytic, engineering, and infrastructure support roles, only recently have bioinformaticians been able to have a primary scientific role in asking and answering questions on human health and disease. Here, I argue that this shift in role towards asking questions in medicine is now the next step needed for the field of bioinformatics. I outline four reasons why bioinformaticians are newly enabled to drive the questions in primary medical discovery: public availability of data, intersection of data across experiments, commoditization of methods, and streamlined validation. I also list four recommendations for bioinformaticians wishing to get more involved in translational research. PMID:19566916

  4. Genomics and Bioinformatics of Parkinson's Disease

    PubMed Central

    Scholz, Sonja W.; Mhyre, Tim; Ressom, Habtom; Shah, Salim; Federoff, Howard J.

    2012-01-01

    Within the last two decades, genomics and bioinformatics have profoundly impacted our understanding of the molecular mechanisms of Parkinson's disease (PD). From the description of the first PD gene in 1997 until today, we have witnessed the emergence of new technologies that have revolutionized our concepts to identify genetic mechanisms implicated in human health and disease. Driven by the publication of the human genome sequence and followed by the description of detailed maps for common genetic variability, novel applications to rapidly scrutinize the entire genome in a systematic, cost-effective manner have become a reality. As a consequence, about 30 genetic loci have been unequivocally linked to the pathogenesis of PD highlighting essential molecular pathways underlying this common disorder. Herein we discuss how neurogenomics and bioinformatics are applied to dissect the nature of this complex disease with the overall aim of developing rational therapeutic interventions. PMID:22762024

  5. [Applied problems of mathematical biology and bioinformatics].

    PubMed

    Lakhno, V D

    2011-01-01

    Mathematical biology and bioinformatics represent a new and rapidly progressing line of investigations which emerged in the course of work on the project "Human genome". The main applied problems of these sciences are drug design, patient-specific medicine and nanobioelectronics. It is shown that progress in the technology of mass sequencing of the human genome has set the stage for starting the national program on patient-specific medicine.

  6. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  7. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  8. Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.

    1996-01-01

    Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess the performance of eight different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with Sequential Unconstrained Minimizations Technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.

  9. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    PubMed

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes including the behavioral-based profiles will contribute to the transition from the disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors are needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talks among pathways in various brain regions involved in disorders such as Alzheimer's disease.

  10. Bringing Web 2.0 to bioinformatics.

    PubMed

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  11. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.
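
    The first step named in this abstract, quality control of raw reads, can be illustrated with a minimal sketch. It is not taken from the paper or from any tool it reviews; it assumes FASTQ-style quality strings in Phred+33 encoding, and the function names, the threshold of 20 and the example reads are all hypothetical.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line, offset=33):
    """Average Phred score of one read; an empty read scores 0.0."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores) if scores else 0.0


def filter_reads(records, min_mean_quality=20.0):
    """Keep (name, sequence, quality) records whose mean Phred score meets the threshold."""
    return [r for r in records if mean_quality(r[2]) >= min_mean_quality]


# Hypothetical reads, for illustration only.
reads = [
    ("read1", "ACGT", "IIII"),  # 'I' encodes Phred 40
    ("read2", "ACGT", "!!!!"),  # '!' encodes Phred 0
    ("read3", "ACGT", "5555"),  # '5' encodes Phred 20
]
kept = filter_reads(reads)
print([name for name, _, _ in kept])
```

    Real pipelines usually also trim adapters and low-quality read ends; this sketch only shows the whole-read filter.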

  12. Comparing connected structures in ensemble of random fields

    NASA Astrophysics Data System (ADS)

    Rongier, Guillaume; Collon, Pauline; Renard, Philippe; Straubhaar, Julien; Sausse, Judith

    2016-10-01

    Very different connectivity patterns may arise from using different simulation methods or sets of parameters, and therefore different flow properties. This paper proposes a systematic method to compare ensembles of categorical simulations from a static connectivity point of view. The differences of static connectivity cannot always be distinguished using two-point statistics. In addition, multiple-point histograms only provide a statistical comparison of patterns regardless of the connectivity. Thus, we propose to characterize the static connectivity from a set of 12 indicators based on the connected components of the realizations. Some indicators describe the spatial repartition of the connected components, others their global shape or their topology through the component skeletons. We also gather all the indicators into dissimilarity values to easily compare hundreds of realizations. Heat maps and multidimensional scaling then facilitate the dissimilarity analysis. The application to a synthetic case highlights the impact of the grid size on the connectivity and the indicators. Such impact disappears when comparing samples of the realizations with the same sizes. The method is then able to rank realizations from a referring model based on their static connectivity. This application also gives rise to more practical advice. The multidimensional scaling appears as a powerful visualization tool, but it also induces dissimilarity misrepresentations: it should always be interpreted cautiously with a look at the point position confidence. The heat map displays the real dissimilarities and is more appropriate for a detailed analysis. The comparison with a multiple-point histogram method shows the benefit of the connected components: the large-scale connectivity seems better characterized by our indicators, especially the skeleton indicators.
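
    The general idea of describing a categorical realization by indicators of its connected components, then comparing realizations through a dissimilarity value, can be sketched as follows. This is an illustrative assumption, not the authors' method: it uses only two made-up indicators (number of components and fraction of cells in the largest component) instead of the 12 in the paper, 4-neighbour connectivity on a 2D grid, and an unnormalized Euclidean distance. All names and grids are hypothetical.

```python
from collections import deque


def connected_components(grid, facies):
    """Return the 4-connected components of cells equal to `facies`.

    Each component is a list of (row, col) cells, found by breadth-first search.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != facies or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                i, j = queue.popleft()
                cells.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not seen[ni][nj] and grid[ni][nj] == facies):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            components.append(cells)
    return components


def indicators(grid, facies):
    """Two simple connectivity indicators (hypothetical stand-ins for the paper's 12)."""
    sizes = [len(comp) for comp in connected_components(grid, facies)]
    total = sum(sizes)
    return {
        "n_components": len(sizes),
        "largest_fraction": max(sizes) / total if total else 0.0,
    }


def dissimilarity(ind_a, ind_b):
    """Euclidean distance between two indicator dictionaries sharing the same keys."""
    return sum((ind_a[k] - ind_b[k]) ** 2 for k in ind_a) ** 0.5


# Two hypothetical binary realizations; 1 marks the facies of interest.
realization_a = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
realization_b = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
]
ind_a = indicators(realization_a, facies=1)  # 2 components, largest holds 3 of 5 cells
ind_b = indicators(realization_b, facies=1)  # 4 isolated cells
print(ind_a, ind_b, dissimilarity(ind_a, ind_b))
```

    In practice the indicators would need to be put on comparable scales before being combined, since here the component count dominates the distance.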

  13. Bioinformatics for transporter pharmacogenomics and systems biology: data integration and modeling with UML.

    PubMed

    Yan, Qing

    2010-01-01

    Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter . The methodology developed here can also be used for other biomedical fields.

  14. Quantitative Analysis of the Trends Exhibited by the Three Interdisciplinary Biological Sciences: Biophysics, Bioinformatics, and Systems Biology.

    PubMed

    Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh

    2015-12-01

    New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.

  15. A Review of Bioinformatics Tools for Bio-Prospecting from Metagenomic Sequence Data.

    PubMed

    Roumpeka, Despoina D; Wallace, R John; Escalettes, Frank; Fotheringham, Ian; Watson, Mick

    2017-01-01

    The microbiome can be defined as the community of microorganisms that live in a particular environment. Metagenomics is the practice of sequencing DNA from the genomes of all organisms present in a particular sample, and has become a common method for the study of microbiome population structure and function. Increasingly, researchers are finding novel genes encoded within metagenomes, many of which may be of interest to the biotechnology and pharmaceutical industries. However, such "bioprospecting" requires a suite of sophisticated bioinformatics tools to make sense of the data. This review summarizes the most commonly used bioinformatics tools for the assembly and annotation of metagenomic sequence data with the aim of discovering novel genes.

  16. A Brief Review of Bioinformatics Tools for Glycosylation Analysis by Mass Spectrometry

    PubMed Central

    Tsai, Pei-Lun; Chen, Sung-Fang

    2017-01-01

    The purpose of this review is to provide updated information regarding bioinformatic software for use in the characterization of glycosylated structures since 2013. A comprehensive review by Woodin et al. Analyst 138: 2793–2803, 2013 (ref. 1) described two main approaches that are introduced for starting researchers in this area: analysis of released glycans and the identification of glycopeptides in enzymatic digests. Complementary to that report, this review focuses on mass spectrometry related bioinformatics tools for the characterization of N-linked and O-linked glycopeptides. Specifically, it also provides information regarding automated tools that can be used for glycan profiling using mass spectrometry. PMID:28337402

  17. Quantum Bio-Informatics IV

    NASA Astrophysics Data System (ADS)

    Accardi, Luigi; Freudenberg, Wolfgang; Ohya, Masanori

    2011-01-01

    Use of cryptographic ideas to interpret biological phenomena (and vice versa) / M. Regoli -- Discrete approximation to operators in white noise analysis / Si Si -- Bogoliubov type equations via infinite-dimensional equations for measures / V. V. Kozlov and O. G. Smolyanov -- Analysis of several categorical data using measure of proportional reduction in variation / K. Yamamoto ... [et al.] -- The electron reservoir hypothesis for two-dimensional electron systems / K. Yamada ... [et al.] -- On the correspondence between Newtonian and functional mechanics / E. V. Piskovskiy and I. V. Volovich -- Quantile-quantile plots: An approach for the inter-species comparison of promoter architecture in eukaryotes / K. Feldmeier ... [et al.] -- Entropy type complexities in quantum dynamical processes / N. Watanabe -- A fair sampling test for Ekert protocol / G. Adenier, A. Yu. Khrennikov and N. Watanabe -- Brownian dynamics simulation of macromolecule diffusion in a protocell / T. Ando and J. Skolnick -- Signaling network of environmental sensing and adaptation in plants: Key roles of calcium ion / K. Kuchitsu and T. Kurusu -- NetzCope: A tool for displaying and analyzing complex networks / M. J. Barber, L. Streit and O. Strogan -- Study of HIV-1 evolution by coding theory and entropic chaos degree / K. Sato -- The prediction of botulinum toxin structure based on in silico and in vitro analysis / T. Suzuki and S. Miyazaki -- On the mechanism of D-wave high T_c superconductivity by the interplay of Jahn-Teller physics and Mott physics / H. Ushio, S. Matsuno and H. Kamimura.

  18. Associations between Input and Outcome Variables in an Online High School Bioinformatics Instructional Program

    NASA Astrophysics Data System (ADS)

    Lownsbery, Douglas S.

    Quantitative data from a completed year of an innovative online high school bioinformatics instructional program were analyzed as part of a descriptive research study. The online instructional program provided the opportunity for high school students to develop content understandings of molecular genetics and to use sophisticated bioinformatics tools and methodologies to conduct authentic research. Quantitative data were analyzed to identify potential associations between independent program variables including implementation setting, gender, and student educational backgrounds and dependent variables indicating success in the program including completion rates for analyzing DNA clones and performance gains from pre-to-post assessments of bioinformatics knowledge. Study results indicate that understanding associations between student educational backgrounds and level of success may be useful for structuring collaborative learning groups and enhancing scaffolding and support during the program to promote higher levels of success for participating students.

  19. Experimental and bioinformatic approaches for interrogating protein-protein interactions to determine protein function.

    PubMed

    Droit, Arnaud; Poirier, Guy G; Hunter, Joanna M

    2005-04-01

    An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. One strategy to determine protein function is to identify the protein-protein interactions. The increasing use of high-throughput and large-scale bioinformatics-based studies has generated a massive amount of data stored in a number of different databases. A challenge for bioinformatics is to explore this disparate data and to uncover biologically relevant interactions and pathways. In parallel, there is clearly a need for the development of approaches that can predict novel protein-protein interaction networks in silico. Here, we present an overview of different experimental and bioinformatic methods to elucidate protein-protein interactions.

  20. Evolving Strategies for the Incorporation of Bioinformatics within the Undergraduate Cell Biology Curriculum

    ERIC Educational Resources Information Center

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…

  1. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  2. A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

    ERIC Educational Resources Information Center

    Medin, Carey L.; Nolin, Katie L.

    2011-01-01

    Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…

  3. Comparative jet wake structure and swimming performance of salps.

    PubMed

    Sutherland, Kelly R; Madin, Laurence P

    2010-09-01

    Salps are barrel-shaped marine invertebrates that swim by jet propulsion. Morphological variations among species and life-cycle stages are accompanied by differences in swimming mode. The goal of this investigation was to compare propulsive jet wakes and swimming performance variables among morphologically distinct salp species (Pegea confoederata, Weelia (Salpa) cylindrica, Cyclosalpa sp.) and relate swimming patterns to ecological function. Using a combination of in situ dye visualization and particle image velocimetry (PIV) measurements, we describe properties of the jet wake and swimming performance variables including thrust, drag and propulsive efficiency. Locomotion by all species investigated was achieved via vortex ring propulsion. The slow-swimming P. confoederata produced the highest weight-specific thrust (T=53 N kg(-1)) and swam with the highest whole-cycle propulsive efficiency (eta(wc)=55%). The fast-swimming W. cylindrica had the most streamlined body shape but produced an intermediate weight-specific thrust (T=30 N kg(-1)) and swam with an intermediate whole-cycle propulsive efficiency (eta(wc)=52%). Weak swimming performance variables in the slow-swimming C. affinis, including the lowest weight-specific thrust (T=25 N kg(-1)) and lowest whole-cycle propulsive efficiency (eta(wc)=47%), may be compensated by low energetic requirements. Swimming performance variables are considered in the context of ecological roles and evolutionary relationships.

  4. Hospital profitability and capital structure: a comparative analysis.

    PubMed Central

    Valvona, J; Sloan, F A

    1988-01-01

    This article compares the financial performance of hospitals by ownership type and of five publicly traded hospital companies with other industries, using such indicators as profit margins, return on equity (ROE) and total capitalization, and debt-to-equity ratios. We also examine stock returns to investors for the five hospital companies versus other industries, as well as the relative roles of debt and equity in new financing. Investor-owned hospitals had substantially greater margins and ROE than did other hospital types. In 1982, investor-owned chain hospitals had a ROE of 26 percent, 18 points above the average for all hospitals. Stock returns on the five selected hospital companies were more than twice as large as returns on other industries between 1972 and 1983. However, after 1983, returns for these companies fell dramatically in absolute terms and relative to other industries. We also found investor-owned hospitals to be much more highly levered than their government and voluntary counterparts, and more highly levered than other industries as well. PMID:3403274
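
    The indicators named in this abstract follow standard textbook ratios, sketched below. The article's exact accounting definitions are not given in the abstract, so the formulas here are the common ones, assumed for illustration, and every figure in the example is hypothetical.

```python
def profit_margin(net_income, revenue):
    """Net income as a fraction of revenue."""
    return net_income / revenue


def return_on_equity(net_income, equity):
    """Net income as a fraction of owners' equity (ROE)."""
    return net_income / equity


def debt_to_equity(total_debt, equity):
    """Leverage: total debt divided by owners' equity."""
    return total_debt / equity


# Hypothetical hospital balance-sheet figures in arbitrary currency units.
hospital = {"revenue": 100.0, "net_income": 5.0, "equity": 40.0, "debt": 60.0}

margin = profit_margin(hospital["net_income"], hospital["revenue"])
roe = return_on_equity(hospital["net_income"], hospital["equity"])
leverage = debt_to_equity(hospital["debt"], hospital["equity"])
print(f"margin={margin:.1%} ROE={roe:.1%} debt/equity={leverage:.2f}")
```

    A more highly levered firm has a larger debt-to-equity ratio, which raises ROE for a given margin while increasing financial risk.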

  5. mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

    PubMed

    Bokulich, Nicholas A; Rideout, Jai Ram; Mercurio, William G; Shiffer, Arron; Wolfe, Benjamin; Maurice, Corinne F; Dutton, Rachel J; Turnbaugh, Peter J; Knight, Rob; Caporaso, J Gregory

    2016-01-01

    Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.
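
    Because a mock community has a known composition, a method's output can be scored against it. The sketch below shows one common way to do that with precision, recall and F-measure over sets of taxon names. It is an assumption for illustration, not the mockrobiota API or file format, and the taxon labels are hypothetical.

```python
def evaluate_taxa(expected, observed):
    """Score observed taxa against the expected mock community composition.

    Returns (precision, recall, f_measure) computed on sets of taxon names.
    """
    expected, observed = set(expected), set(observed)
    true_positives = len(expected & observed)
    precision = true_positives / len(observed) if observed else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical taxon labels, for illustration only.
expected_taxa = ["Taxon_A", "Taxon_B", "Taxon_C", "Taxon_D"]
observed_taxa = ["Taxon_A", "Taxon_B", "Taxon_C", "Taxon_E", "Taxon_F"]

precision, recall, f_measure = evaluate_taxa(expected_taxa, observed_taxa)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

    This scores presence or absence only; comparing expected and observed relative abundances would need a separate measure.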

  6. mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking

    PubMed Central

    Bokulich, Nicholas A.; Rideout, Jai Ram; Mercurio, William G.; Shiffer, Arron; Wolfe, Benjamin; Maurice, Corinne F.; Dutton, Rachel J.; Turnbaugh, Peter J.; Knight, Rob

    2016-01-01

    ABSTRACT Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community. PMID:27822553

  7. Translational Bioinformatics Approaches to Drug Development

    PubMed Central

    Readhead, Ben; Dudley, Joel

    2013-01-01

    Significance A majority of therapeutic interventions occur late in the pathological process, when treatment outcome can be less predictable and effective, highlighting the need for new precise and preventive therapeutic development strategies that consider genomic and environmental context. Translational bioinformatics is well positioned to contribute to the many challenges inherent in bridging this gap between our current reactive methods of healthcare delivery and the intent of precision medicine, particularly in the areas of drug development, which forms the focus of this review. Recent Advances A variety of powerful informatics methods for organizing and leveraging the vast wealth of molecular measurements available for a broad range of disease contexts have recently emerged. These include methods for data driven disease classification, drug repositioning, identification of disease biomarkers, and the creation of disease network models, each with significant impacts on drug development approaches. Critical Issues An important bottleneck in the application of bioinformatics methods in translational research is the lack of investigators who are versed in both biomedical domains and informatics. Efforts to nurture both sets of competencies within individuals and to increase interfield visibility will help to accelerate the adoption and increased application of bioinformatics in translational research. Future Directions It is possible to construct predictive, multiscale network models of disease by integrating genotype, gene expression, clinical traits, and other multiscale measures using causal network inference methods. This can enable the identification of the “key drivers” of pathology, which may represent novel therapeutic targets or biomarker candidates that play a more direct role in the etiology of disease. PMID:24527359

  8. Critical Issues in Bioinformatics and Computing

    PubMed Central

    Kesh, Someswa; Raghupathi, Wullianallur

    2004-01-01

    This article provides an overview of the field of bioinformatics and its implications for the various participants. Next-generation issues facing developers (programmers), users (molecular biologists), and the general public (patients) who would benefit from the potential applications are identified. The goal is to create awareness and debate on the opportunities (such as career paths) and the challenges such as privacy that arise. A triad model of the participants' roles and responsibilities is presented along with the identification of the challenges and possible solutions. PMID:18066389

  9. Microbial bioinformatics for food safety and production.

    PubMed

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel; van Hijum, Sacha A F T

    2016-03-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput 'omics' technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety.

  10. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    NASA Technical Reports Server (NTRS)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, which uses a new sigmoid-logarithmic transfer function and is based on the error backpropagation (EBP) algorithm, has been devised. Our results show that the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  11. Multiobjective optimization in bioinformatics and computational biology.

    PubMed

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.
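
    The central idea in multiobjective optimization, that one candidate solution dominates another, can be sketched in a few lines of Python. This is only an illustration and is not taken from the review: the two objectives, the candidate scores, and the function names are hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one (minimization assumed)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates scored on two objectives to minimize,
# e.g. (alignment cost, model complexity). The values are made up.
candidates = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (5.0, 2.0)]
front = pareto_front(candidates)
```

    Here (3.0, 5.0) is dominated by (2.0, 4.0) and (5.0, 2.0) by (4.0, 1.0), so three trade-off solutions remain. The quadratic scan is adequate for small candidate sets; larger problems would normally use a dedicated algorithm.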

  12. Translational Bioinformatics: Past, Present, and Future

    PubMed Central

    Tenenbaum, Jessica D.

    2016-01-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field. PMID:26876718

  13. Microbial bioinformatics for food safety and production

    PubMed Central

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel

    2016-01-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput ‘omics’ technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety. PMID:26082168

  14. Teaching the ABCs of bioinformatics: a brief introduction to the Applied Bioinformatics Course

    PubMed Central

    2014-01-01

    With the development of the Internet and the growth of online resources, bioinformatics training for wet-lab biologists became necessary as a part of their education. This article describes a one-semester course ‘Applied Bioinformatics Course’ (ABC, http://abc.cbi.pku.edu.cn/) that the author has been teaching to biological graduate students at Peking University and the Chinese Academy of Agricultural Sciences for the past 13 years. ABC is a hands-on practical course to teach students to use online bioinformatics resources to solve biological problems related to their ongoing research projects in molecular biology. With a brief introduction to the background of the course, detailed information about the teaching strategies of the course is outlined in the ‘How to teach’ section. The contents of the course are briefly described in the ‘What to teach’ section with some real examples. The author wishes to share his teaching experiences and the online teaching materials with colleagues working in bioinformatics education both in local and international universities. PMID:24008274

  15. OpenHelix: bioinformatics education outside of a different box

    PubMed Central

    Mangan, Mary E.; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C.

    2010-01-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181

  16. The potential of translational bioinformatics approaches for pharmacology research.

    PubMed

    Li, Lang

    2015-10-01

    The field of bioinformatics has allowed the interpretation of massive amounts of biological data, ushering in the era of 'omics' to biomedical research. Its potential impact on pharmacology research is enormous and it has shown some emerging successes. A full realization of this potential, however, requires standardized data annotation for large health record databases and molecular data resources. Improved standardization will further stimulate the development of system pharmacology models, using translational bioinformatics methods. This new translational bioinformatics paradigm is highly complementary to current pharmacological research fields, such as personalized medicine, pharmacoepidemiology and drug discovery. In this review, I illustrate the application of translational bioinformatics to research in numerous pharmacology subdisciplines.

  17. Data Compression Concepts and Algorithms and their Applications to Bioinformatics

    PubMed Central

    Nalbantoğlu, Ö. U.; Russell, D.J.; Sayood, K.

    2009-01-01

    Data compression at its base is concerned with how information is organized in data. Understanding this organization can lead to efficient ways of representing the information and hence data compression. In this paper we review the ways in which ideas and approaches fundamental to the theory and practice of data compression have been used in the area of bioinformatics. We look at how basic theoretical ideas from data compression, such as the notions of entropy, mutual information, and complexity have been used for analyzing biological sequences in order to discover hidden patterns, infer phylogenetic relationships between organisms and study viral populations. Finally, we look at how inferred grammars for biological sequences have been used to uncover structure in biological sequences. PMID:20157640
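
    As a small illustration of the entropy notion mentioned above, the following Python sketch computes the Shannon entropy of a nucleotide sequence in bits per symbol. It is not the authors' method: it assumes a zeroth-order model in which each symbol is independent of its neighbours, and the example sequences and function name are made up.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Shannon entropy of a sequence in bits per symbol,
    estimated from the observed symbol frequencies."""
    if not sequence:
        return 0.0
    total = len(sequence)
    counts = Counter(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up sequences: four equally frequent bases versus a single repeated base.
uniform = shannon_entropy("ACGTACGTACGT")
repeat = shannon_entropy("AAAAAAAAAAAA")
```

    The first sequence reaches the 2 bits per base maximum for a four-letter alphabet, while the homopolymer carries no information per symbol. Under this simple model, low entropy marks highly compressible, repetitive regions.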

  18. Bioinformatics for cancer immunology and immunotherapy.

    PubMed

    Charoentong, Pornpimol; Angelova, Mihaela; Efremova, Mirjana; Gallasch, Ralf; Hackl, Hubert; Galon, Jerome; Trajanoski, Zlatko

    2012-11-01

    Recent mechanistic insights obtained from preclinical studies and the approval of the first immunotherapies have motivated an increasing number of academic investigators and pharmaceutical/biotech companies to further elucidate the role of immunity in tumor pathogenesis and to reconsider the role of immunotherapy. Additionally, technological advances (e.g., next-generation sequencing) are providing unprecedented opportunities to draw a comprehensive picture of the tumor genomics landscape and ultimately enable individualized treatment. However, the increasing complexity of the generated data and the plethora of bioinformatics methods and tools pose considerable challenges to both tumor immunologists and clinical oncologists. In this review, we describe current concepts and future challenges for the management and analysis of data for cancer immunology and immunotherapy. We first highlight publicly available databases with specific focus on cancer immunology including databases for somatic mutations and epitope databases. We then give an overview of the bioinformatics methods for the analysis of next-generation sequencing data (whole-genome and exome sequencing), epitope prediction tools as well as methods for integrative data analysis and network modeling. Mathematical models are powerful tools that can predict and explain important patterns in the genetic and clinical progression of cancer. Therefore, a survey of mathematical models for tumor evolution and tumor-immune cell interaction is included. Finally, we discuss future challenges for individualized immunotherapy and suggest how combined computational/experimental approaches can lead to new insights into the molecular mechanisms of cancer, improved diagnosis and prognosis of the disease, and novel therapeutic targets.

  19. Tools and collaborative environments for bioinformatics research

    PubMed Central

    Giugno, Rosalba; Pulvirenti, Alfredo

    2011-01-01

    Advanced research requires intensive interaction among a multitude of actors, often possessing different expertise and usually working at a distance from each other. The field of collaborative research aims to establish suitable models and technologies to properly support these interactions. In this article, we first present the reasons for an interest of Bioinformatics in this context by also suggesting some research domains that could benefit from collaborative research. We then review the principles and some of the most relevant applications of social networking, with a special attention to networks supporting scientific collaboration, by also highlighting some critical issues, such as identification of users and standardization of formats. We then introduce some systems for collaborative document creation, including wiki systems and tools for ontology development, and review some of the most interesting biological wikis. We also review the principles of Collaborative Development Environments for software and show some examples in Bioinformatics. Finally, we present the principles and some examples of Learning Management Systems. In conclusion, we try to devise some of the goals to be achieved in the short term for the exploitation of these technologies. PMID:21984743

  20. Comparative analysis of mt LSU rRNA secondary structures of Odonates: structural variability and phylogenetic signal.

    PubMed

    Misof, B; Fleck, G

    2003-12-01

    Secondary structures of the most conserved part of the mt 16S rRNA gene, domains IV and V, have been recently analysed in a comparative study. However, full secondary structures of the mt LSU rRNA molecule are published for only a few insect species. The present study presents full secondary structures of domains I, II, IV and V of Odonates and one representative of mayflies, Ephemera sp. The reconstructions are based on a comparative approach and minimal consensus structures derived from sequence alignments. The inferred structures exhibit remarkable similarities to the published Drosophila melanogaster model, which increases confidence in these structures. Structural variance within Odonates is homoplastic, and neighbour-joining trees based on tree edit distances do not correspond to any of the phylogenetically expected patterns. However, despite homoplastic quantitative structural variation, many similarities between Odonates and Ephemera sp. suggest promising character sets for higher order insect systematics that merit further investigations.

  1. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

    PubMed

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

    2015-04-01

    This study aimed to explore the features of clustered regularly interspaced short palindromic repeat (CRISPR) structures in Shigella using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and that the upstream flanking sequences of the CRISPRs could be classified into the same groups as the downstream ones. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and that the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.

  2. Developing sustainable software solutions for bioinformatics by the "Butterfly" paradigm.

    PubMed

    Ahmed, Zeeshan; Zeeshan, Saman; Dandekar, Thomas

    2014-01-01

    Software design and sustainable software engineering are essential for the long-term development of bioinformatics software. Typical challenges in an academic environment are short-term contracts, island solutions, pragmatic approaches and loose documentation. Upcoming new challenges are big data, complex data sets, software compatibility and rapid changes in data representation. Our approach to cope with these challenges consists of iterative intertwined cycles of development ("Butterfly" paradigm) for key steps in scientific software engineering. User feedback is valued as well as software planning in a sustainable and interoperable way. Tool usage should be easy and intuitive. A middleware supports a user-friendly Graphical User Interface (GUI) as well as a database/tool development independently. We validated the approach of our own software development and compared the different design paradigms in various software solutions.

  3. Comparative modeling: the state of the art and protein drug target structure prediction.

    PubMed

    Liu, Tianyun; Tang, Grace W; Capriotti, Emidio

    2011-07-01

    The goal of computational protein structure prediction is to provide three-dimensional (3D) structures with resolution comparable to experimental results. Comparative modeling, which predicts the 3D structure of a protein based on its sequence similarity to homologous structures, is the most accurate computational method for structure prediction. In the last two decades, significant progress has been made on comparative modeling methods. Using the large number of protein structures deposited in the Protein Data Bank (~65,000), automatic prediction pipelines are generating a tremendous number of models (~1.9 million) for sequences whose structures have not been experimentally determined. Accurate models are suitable for a wide range of applications, such as prediction of protein binding sites, prediction of the effect of protein mutations, and structure-guided virtual screening. In particular, comparative modeling has enabled structure-based drug design against protein targets with unknown structures. In this review, we describe the theoretical basis of comparative modeling, the available automatic methods and databases, and the algorithms to evaluate the accuracy of predicted structures. Finally, we discuss relevant applications in the prediction of important drug target proteins, focusing on the G protein-coupled receptor (GPCR) and protein kinase families.

  4. Assessment of Common and Emerging Bioinformatics Pipelines for Targeted Metagenomics

    PubMed Central

    Siegwald, Léa; Touzet, Hélène; Lemoine, Yves; Hot, David

    2017-01-01

    Targeted metagenomics, also known as metagenetics, is a high-throughput sequencing application focusing on a nucleotide target in a microbiome to describe its taxonomic content. A wide range of bioinformatics pipelines are available to analyze sequencing outputs, and the choice of an appropriate tool is crucial and not trivial. No standard evaluation method exists for estimating the accuracy of a pipeline for targeted metagenomics analyses. This article proposes an evaluation protocol containing real and simulated targeted metagenomics datasets, and adequate metrics allowing us to study the impact of different variables on the biological interpretation of results. This protocol was used to compare six different bioinformatics pipelines in the basic user context: Three common ones (mothur, QIIME and BMP) based on a clustering-first approach and three emerging ones (Kraken, CLARK and One Codex) using an assignment-first approach. This study surprisingly reveals that sequencing errors have a bigger impact on the results than the choice of amplified region. Moreover, increasing sequencing throughput increases richness overestimation, even more so for microbiota of high complexity. Finally, the choice of the reference database has a bigger impact on richness estimation for clustering-first pipelines, and on correct taxa identification for assignment-first pipelines. Using emerging assignment-first pipelines is a valid approach for targeted metagenomics analyses, with a quality of results comparable to popular clustering-first pipelines, even with an error-prone sequencing technology like Ion Torrent. However, those pipelines are highly sensitive to the quality of databases and their annotations, which makes clustering-first pipelines still the only reliable approach for studying microbiomes that are not well described. PMID:28052134

  5. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    ERIC Educational Resources Information Center

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  6. Assessment of a Bioinformatics across Life Science Curricula Initiative

    ERIC Educational Resources Information Center

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  7. Is there room for ethics within bioinformatics education?

    PubMed

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  8. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    ERIC Educational Resources Information Center

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  9. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    PubMed

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.

  10. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    PubMed Central

    2011-01-01

    Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and includes extensive
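
    The batching idea described in this abstract can be illustrated with the Python standard library alone. The sketch below does not use the PaPy API, which is not shown in this record: `batched`, `run_pipeline`, the two stage functions and the example reads are all hypothetical names, and the pipeline is a simple linear chain rather than a general directed acyclic graph.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List


def batched(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield successive lists of at most `size` items without reading ahead."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_pipeline(items: Iterable[Any],
                 stages: List[Callable[[Any], Any]],
                 batch_size: int = 2,
                 workers: int = 2) -> Iterator[Any]:
    """Lazily push items through a linear chain of stages, one batch at a time."""
    def apply_all(item: Any) -> Any:
        for stage in stages:
            item = stage(item)
        return item

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(items, batch_size):
            # Items within a batch run in parallel; only one batch is held in memory.
            yield from pool.map(apply_all, batch)


def reverse_complement(seq: str) -> str:
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


reads = ["ACGT", "GGCC", "AATT"]  # made-up input reads
results = list(run_pipeline(reads, [reverse_complement, gc_fraction]))
```

    A larger `batch_size` exposes more items to the worker pool at once, while a smaller one keeps fewer items in memory, which is the trade-off between parallelism and lazy evaluation that the abstract refers to.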

  11. Wrapping and interoperating bioinformatics resources using CORBA.

    PubMed

    Stevens, R; Miller, C

    2000-02-01

    Bioinformaticians seeking to provide services to working biologists are faced with the twin problems of distribution and diversity of resources. Bioinformatics databases are distributed around the world and exist in many kinds of storage forms, platforms and access paradigms. To provide adequate services to biologists, these distributed and diverse resources have to interoperate seamlessly within single applications. The Common Object Request Broker Architecture (CORBA) offers one technical solution to these problems. The key component of CORBA is its use of object orientation as an intermediate form to translate between different representations. This paper concentrates on an explanation of object orientation and how it can be used to overcome the problems of distribution and diversity by describing the interfaces between objects.

  12. Bioinformatics Resources for MicroRNA Discovery

    PubMed Central

    Moore, Alyssa C.; Winkjer, Jonathan S.; Tseng, Tsai-Tien

    2015-01-01

    Biomarker identification is often associated with the diagnosis and evaluation of various diseases. Recently, the role of microRNA (miRNA) has been implicated in the development of diseases, particularly cancer. With the advent of next-generation sequencing, the amount of data on miRNA has increased tremendously in the last decade, requiring new bioinformatics approaches for processing and storing new information. New strategies have been developed in mining these sequencing datasets to allow better understanding toward the actions of miRNAs. As a result, many databases have also been established to disseminate these findings. This review focuses on several curated databases of miRNAs and their targets from both predicted and validated sources. PMID:26819547

  13. Bioinformatics approaches to single-cell analysis in developmental biology.

    PubMed

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology.

  14. E-MSD: an integrated data resource for bioinformatics.

    PubMed

    Golovin, A; Oldfield, T J; Tate, J G; Velankar, S; Barton, G J; Boutselakis, H; Dimitropoulos, D; Fillon, J; Hussain, A; Ionides, J M C; John, M; Keller, P A; Krissinel, E; McNeil, P; Naim, A; Newman, R; Pajon, A; Pineda, J; Rachedi, A; Copeland, J; Sitnov, A; Sobhany, S; Suarez-Uruena, A; Swaminathan, G J; Tagari, M; Tromm, S; Vranken, W; Henrick, K

    2004-01-01

    The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the Protein Data Bank (PDB) and to work towards the integration of various bioinformatics data resources. We have implemented a simple form-based interface that allows users to query the MSD directly. The MSD 'atlas pages' show all of the information in the MSD for a particular PDB entry. The group has designed new search interfaces aimed at specific areas of interest, such as the environment of ligands and the secondary structures of proteins. We have also implemented a novel search interface that begins to integrate separate MSD search services in a single graphical tool. We have worked closely with collaborators to build a new visualization tool that can present both structure and sequence data in a unified interface, and this data viewer is now used throughout the MSD services for the visualization and presentation of search results. Examples showcasing the functionality and power of these tools are available from tutorial webpages (http://www.ebi.ac.uk/msd-srv/docs/roadshow_tutorial/).

  15. E-MSD: an integrated data resource for bioinformatics

    PubMed Central

    Golovin, A.; Oldfield, T. J.; Tate, J. G.; Velankar, S.; Barton, G. J.; Boutselakis, H.; Dimitropoulos, D.; Fillon, J.; Hussain, A.; Ionides, J. M. C.; John, M.; Keller, P. A.; Krissinel, E.; McNeil, P.; Naim, A.; Newman, R.; Pajon, A.; Pineda, J.; Rachedi, A.; Copeland, J.; Sitnov, A.; Sobhany, S.; Suarez-Uruena, A.; Swaminathan, G. J.; Tagari, M.; Tromm, S.; Vranken, W.; Henrick, K.

    2004-01-01

    The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the Protein Data Bank (PDB) and to work towards the integration of various bioinformatics data resources. We have implemented a simple form-based interface that allows users to query the MSD directly. The MSD ‘atlas pages’ show all of the information in the MSD for a particular PDB entry. The group has designed new search interfaces aimed at specific areas of interest, such as the environment of ligands and the secondary structures of proteins. We have also implemented a novel search interface that begins to integrate separate MSD search services in a single graphical tool. We have worked closely with collaborators to build a new visualization tool that can present both structure and sequence data in a unified interface, and this data viewer is now used throughout the MSD services for the visualization and presentation of search results. Examples showcasing the functionality and power of these tools are available from tutorial webpages (http://www.ebi.ac.uk/msd-srv/docs/roadshow_tutorial/). PMID:14681397

  16. Fisher: a program for the detection of H/ACA snoRNAs using MFE secondary structure prediction and comparative genomics – assessment and update

    PubMed Central

    Freyhult, Eva; Edvardsson, Sverker; Tamas, Ivica; Moulton, Vincent; Poole, Anthony M

    2008-01-01

    Background: The H/ACA family of small nucleolar RNAs (snoRNAs) plays a central role in guiding the pseudouridylation of ribosomal RNA (rRNA). In an effort to systematically identify the complete set of rRNA-modifying H/ACA snoRNAs from the genome sequence of the budding yeast, Saccharomyces cerevisiae, we developed a program – Fisher – and previously presented several candidate snoRNAs based on our analysis [1]. Findings: In this report, we provide a brief update of this work, which was aborted after the publication of experimentally-identified snoRNAs [2] identical to candidates we had identified bioinformatically using Fisher. Our motivation for revisiting this work is to report on the status of the candidate snoRNAs described in [1], and secondly, to report that a modified version of Fisher together with the available multiple yeast genome sequences was able to correctly identify several H/ACA snoRNAs for modification sites not identified by the snoGPS program [3]. While we are no longer developing Fisher, we briefly consider the merits of the Fisher algorithm relative to snoGPS, which may be of use for workers considering pursuing a similar search strategy for the identification of small RNAs. The modified source code for Fisher is made available as supplementary material. Conclusion: Our results confirm the validity of using minimum free energy (MFE) secondary structure prediction to guide comparative genomic screening for RNA families with few sequence constraints. PMID:18710502

  17. Toward a Model of Knowledge Structure and a Comparative Analysis of Knowledge Structure Measurement Techniques

    DTIC Science & Technology

    1991-09-01

    POLYCON, INDSCAL/SINDSCAL, KYST, and MULTISCALE). Polzella and Reid (1989) employed MDS techniques to discover differences in performance characteristics...Reasoning and the structure of knowledge in biochemistry. Instructional Science, 17, 57-76. Polzella, D. J., and Reid, G. B. (1989

  18. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    PubMed

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  19. A Comparative Structural Equation Modeling Investigation of the Relationships among Teaching, Cognitive and Social Presence

    ERIC Educational Resources Information Center

    Kozan, Kadir

    2016-01-01

    The present study investigated the relationships among teaching, cognitive, and social presence through several structural equation models to see which model would better fit the data. To this end, the present study employed and compared several different structural equation models because different models could fit the data equally well. Among…

  20. Survey of MapReduce frame operation in bioinformatics.

    PubMed

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce framework-based applications that can be employed in next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as future work on parallel computing in bioinformatics.
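
    As an illustration of the MapReduce pattern mentioned in this record, the following is a minimal single-process sketch in plain Python that counts k-mers across sequencing reads. It is not taken from the surveyed article and does not use Hadoop; the function names (map_kmers, shuffle, reduce_counts, count_kmers) and the choice of k-mer counting as the task are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-length window of a read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the values collected for one key."""
    return key, sum(values)


def count_kmers(reads, k=3):
    """Run map, shuffle and reduce in sequence (hypothetical driver function)."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], k=3))
```

    In a real cluster setting the map and reduce functions would run on separate nodes and the shuffle would be handled by the framework; here all three run in one process so the data flow is easy to follow.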

  1. Thriving in multidisciplinary research: advice for new bioinformatics students.

    PubMed

    Auerbach, Raymond K

    2012-09-01

    The sciences have seen a large increase in demand for students in bioinformatics and multidisciplinary fields in general. Many new educational programs have been created to satisfy this demand, but navigating these programs requires a non-traditional outlook and emphasizes working in teams of individuals with distinct yet complementary skill sets. Written from the perspective of a current bioinformatics student, this article seeks to offer advice to prospective and current students in bioinformatics regarding what to expect in their educational program, how multidisciplinary fields differ from more traditional paths, and decisions that they will face on the road to becoming successful, productive bioinformaticists.

  2. Monte Carlo modelling of photodynamic therapy treatments comparing clustered three dimensional tumour structures with homogeneous tissue structures

    NASA Astrophysics Data System (ADS)

    Campbell, C. L.; Wood, K.; Brown, C. T. A.; Moseley, H.

    2016-07-01

    We explore the effects of three dimensional (3D) tumour structures on depth dependent fluence rates, photodynamic doses (PDD) and fluorescence images through Monte Carlo radiation transfer modelling of photodynamic therapy. The aim with this work was to compare the commonly used uniform tumour densities with non-uniform densities to determine the importance of including 3D models in theoretical investigations. It was found that fractal 3D models resulted in deeper penetration on average of therapeutic radiation and higher PDD. An increase in effective treatment depth of 1 mm was observed for one of the investigated fractal structures, when comparing to the equivalent smooth model. Wide field fluorescence images were simulated, revealing information about the relationship between tumour structure and the appearance of the fluorescence intensity. Our models indicate that the 3D tumour structure strongly affects the spatial distribution of therapeutic light, the PDD and the wide field appearance of surface fluorescence images.

  3. Systems Biology: The Next Frontier for Bioinformatics

    PubMed Central

    Likić, Vladimir A.; McConville, Malcolm J.; Lithgow, Trevor; Bacic, Antony

    2010-01-01

    Biochemical systems biology augments more traditional disciplines, such as genomics, biochemistry and molecular biology, by championing (i) mathematical and computational modeling; (ii) the application of traditional engineering practices in the analysis of biochemical systems; and in the past decade increasingly (iii) the use of near-comprehensive data sets derived from ‘omics platform technologies, in particular “downstream” technologies relative to genome sequencing, including transcriptomics, proteomics and metabolomics. The future progress in understanding biological principles will increasingly depend on the development of temporal and spatial analytical techniques that will provide high-resolution data for systems analyses. To date, particularly successful were strategies involving (a) quantitative measurements of cellular components at the mRNA, protein and metabolite levels, as well as in vivo metabolic reaction rates, (b) development of mathematical models that integrate biochemical knowledge with the information generated by high-throughput experiments, and (c) applications to microbial organisms. The inevitable role bioinformatics plays in modern systems biology puts mathematical and computational sciences as an equal partner to analytical and experimental biology. Furthermore, mathematical and computational models are expected to become increasingly prevalent representations of our knowledge about specific biochemical systems. PMID:21331364

  4. Bioinformatic tools for microRNA dissection

    PubMed Central

    Akhtar, Most Mauluda; Micolucci, Luigina; Islam, Md Soriful; Olivieri, Fabiola; Procopio, Antonio Domenico

    2016-01-01

    Recently, microRNAs (miRNAs) have emerged as important elements of gene regulatory networks. MiRNAs are endogenous single-stranded non-coding RNAs (∼22-nt long) that regulate gene expression at the post-transcriptional level. Through pairing with mRNA, miRNAs can down-regulate gene expression by inhibiting translation or stimulating mRNA degradation. In some cases they can also up-regulate the expression of a target gene. MiRNAs influence a variety of cellular pathways that range from development to carcinogenesis. The involvement of miRNAs in several human diseases, particularly cancer, makes them potential diagnostic and prognostic biomarkers. Recent technological advances, especially high-throughput sequencing, have led to an exponential growth in the generation of miRNA-related data. A number of bioinformatic tools and databases have been devised to manage this growing body of data. We analyze 129 miRNA tools that are being used in diverse areas of miRNA research, to assist investigators in choosing the most appropriate tools for their needs. PMID:26578605

  5. Impacts of bioinformatics to medicinal chemistry.

    PubMed

    Chou, Kuo-Chen

    2015-01-01

    Facing the explosive growth of biological sequence data, such as protein/peptide and DNA/RNA sequences, generated in the post-genomic age, many bioinformatical and mathematical approaches as well as physicochemical concepts have been introduced to derive useful information from these biological sequences in a timely manner, in order to stimulate the development of medical science and drug design. Meanwhile, because of the rapid penetration of these disciplines, medicinal chemistry is currently undergoing an unprecedented revolution. In this minireview, we summarize the progress by focusing on the following six aspects. (1) Use of the pseudo amino acid composition, or PseAAC, to predict various attributes of protein/peptide sequences that are useful for drug development. (2) Use of the pseudo oligonucleotide composition, or PseKNC, to do the same for DNA/RNA sequences. (3) Introduction of the multi-label approach to study those systems where the constituent elements bear multiple characters and functions. (4) Use of graphical rules and "wenxiang" diagrams to analyze complicated biomedical systems. (5) Recent developments in identifying the interactions of drugs with their various types of target proteins in cellular networking. (6) The distorted key theory and its application in developing peptide drugs.

  6. Advantages and disadvantages in usage of bioinformatic programs in promoter region analysis

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena E.; Skarzyńska, Agnieszka; Posyniak, Kacper; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew

    2015-09-01

    An important computational challenge is finding the regulatory elements across the promoter region. In this work we present the advantages and disadvantages of applying different bioinformatics programs to localize transcription factor binding sites in the upstream region of genes connected with sex determination in cucumber. We use PlantCARE, PlantPAN and SignalScan to find motifs in the promoter regions. The results have been compared and the possible function of chosen motifs has been described.

  7. Creating Bioinformatic Workflows within the BioExtract Server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows generally require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community...

  8. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    PubMed Central

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  9. Bioinformatics opportunities for identification and study of medicinal plants

    PubMed Central

    Sharma, Vivekanand

    2013-01-01

    Plants have been used as a source of medicine since historic times and several commercially important drugs are of plant-based origin. The traditional approach towards discovery of plant-based drugs often involves a significant amount of time and expenditure. These labor-intensive approaches have struggled to keep pace with the rapid development of high-throughput technologies. In the era of high volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role. This has generally been the case in the context of drug designing and discovery. However, there has been limited attention to date to the potential application of bioinformatics approaches that can leverage plant-based knowledge. Here, we review bioinformatics studies that have contributed to medicinal plants research. In particular, we highlight areas in medicinal plant research where the application of bioinformatics methodologies may result in quicker and potentially cost-effective leads toward finding plant-based remedies. PMID:22589384

  10. A comparative study of theoretical graph models for characterizing structural networks of human brain.

    PubMed

    Li, Xiaojin; Hu, Xintao; Jin, Changfeng; Han, Junwei; Liu, Tianming; Guo, Lei; Hao, Wei; Li, Lingjiang

    2013-01-01

    Previous studies have investigated both structural and functional brain networks via graph-theoretical methods. However, there is an important issue that has not been adequately discussed before: what is the optimal theoretical graph model for describing the structural networks of the human brain? In this paper, we perform a comparative study to address this problem. Firstly, large-scale cortical regions of interest (ROIs) are localized by a recently developed and validated brain reference system named Dense Individualized Common Connectivity-based Cortical Landmarks (DICCCOL) to address the limitations in the identification of the brain network ROIs in previous studies. Then, we construct structural brain networks based on diffusion tensor imaging (DTI) data. Afterwards, the global and local graph properties of the constructed structural brain networks are measured using state-of-the-art graph analysis algorithms and tools and are further compared with seven popular theoretical graph models. In addition, we compare the topological properties between two graph models, namely, the stickiness-index-based model (STICKY) and the scale-free gene duplication model (SF-GD), that have higher similarity with the real structural brain networks in terms of global and local graph properties. Our experimental results suggest that among the seven theoretical graph models compared in this study, the STICKY and SF-GD models perform better in characterizing the structural human brain network.

  11. A web services choreography scenario for interoperating bioinformatics applications

    PubMed Central

    de Knikker, Remko; Guo, Youjun; Li, Jin-long; Kwan, Albert KH; Yip, Kevin Y; Cheung, David W; Cheung, Kei-Hoi

    2004-01-01

    Background: Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results: To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD) that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH) category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates with these web

  12. Proceedings: the Applications of Bioinformatics in Cancer Detection Workshop.

    PubMed

    Kapetanovic, Izet M; Umar, Asad; Khan, Javed

    2004-05-01

    The Division of Cancer Prevention of the National Cancer Institute sponsored and organized the Applications of Bioinformatics in Cancer Detection Workshop on August 6-7, 2002. The goal of the workshop was to evaluate the state of the science of bioinformatics and determine how it may be used to assist early cancer detection, risk identification, risk assessment, and risk reduction. This paper summarizes the proceedings of this conference and points out future directions for research.

  13. Whale song analyses using bioinformatics sequence analysis approaches

    NASA Astrophysics Data System (ADS)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools, namely DNA/protein sequence alignment and alignment-free methods, are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, master's thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included the scores calculated using the Smith-Waterman algorithm with the repeated procedure, and the standardized Euclidean distance and the angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree qualitatively with earlier classifications derived by human observation. These methods further quantify the similarities among themes and could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission by quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant, and Ilan County Government, Taiwan.]
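
    The two kinds of metric named in this record can be sketched as follows, treating each theme as a string in which every character stands for one song unit. This is a minimal illustration, not the authors' implementation: the scoring values (match=2, mismatch=-1, gap=-1), the word length k=2, and the function names are assumptions, and the "repeated procedure" and the standardization of the Euclidean distance are not reproduced here.

```python
from collections import Counter
from math import acos, sqrt


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def word_counts(seq, k=2):
    """Count overlapping words of length k (alignment-free representation)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def angle_between(seq1, seq2, k=2):
    """Angle in radians between the word-count vectors of two sequences.

    Both sequences must contain at least k units, otherwise the vector
    norms are zero and the division below fails.
    """
    c1, c2 = word_counts(seq1, k), word_counts(seq2, k)
    dot = sum(c1[w] * c2[w] for w in c1)
    norm1 = sqrt(sum(v * v for v in c1.values()))
    norm2 = sqrt(sum(v * v for v in c2.values()))
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))  # guard rounding error
    return acos(cosine)


if __name__ == "__main__":
    print(smith_waterman_score("XABCY", "ZABCW"))
    print(angle_between("ABAB", "BABA"))
```

    A higher alignment score and a smaller angle both indicate more similar themes; a pairwise matrix of either quantity could then be passed to a clustering routine to draw a dendrogram.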

  14. A Review of Bioinformatics Tools for Bio-Prospecting from Metagenomic Sequence Data

    PubMed Central

    Roumpeka, Despoina D.; Wallace, R. John; Escalettes, Frank; Fotheringham, Ian; Watson, Mick

    2017-01-01

    The microbiome can be defined as the community of microorganisms that live in a particular environment. Metagenomics is the practice of sequencing DNA from the genomes of all organisms present in a particular sample, and has become a common method for the study of microbiome population structure and function. Increasingly, researchers are finding novel genes encoded within metagenomes, many of which may be of interest to the biotechnology and pharmaceutical industries. However, such “bioprospecting” requires a suite of sophisticated bioinformatics tools to make sense of the data. This review summarizes the most commonly used bioinformatics tools for the assembly and annotation of metagenomic sequence data with the aim of discovering novel genes. PMID:28321234

  15. Expanding our understanding of sequence-function relationships of type II polyketide biosynthetic gene clusters: bioinformatics-guided identification of Frankiamicin A from Frankia sp. EAN1pec.

    PubMed

    Ogasawara, Yasushi; Yackley, Benjamin J; Greenberg, Jacob A; Rogelj, Snezna; Melançon, Charles E

    2015-01-01

    A large and rapidly increasing number of unstudied "orphan" natural product biosynthetic gene clusters are being uncovered in sequenced microbial genomes. An important goal of modern natural products research is to be able to accurately predict natural product structures and biosynthetic pathways from these gene cluster sequences. This requires both development of bioinformatic methods for global analysis of these gene clusters and experimental characterization of select products produced by gene clusters with divergent sequence characteristics. Here, we conduct global bioinformatic analysis of all available type II polyketide gene cluster sequences and identify a conserved set of gene clusters with unique ketosynthase α/β sequence characteristics in the genomes of Frankia species, a group of Actinobacteria with underexploited natural product biosynthetic potential. Through LC-MS profiling of extracts from several Frankia species grown under various conditions, we identified Frankia sp. EAN1pec as producing a compound with spectral characteristics consistent with the type II polyketide produced by this gene cluster. We isolated the compound, a pentangular polyketide which we named frankiamicin A, and elucidated its structure by NMR and labeled precursor feeding. We also propose biosynthetic and regulatory pathways for frankiamicin A based on comparative genomic analysis and literature precedent, and conduct bioactivity assays of the compound. Our findings provide new information linking this set of Frankia gene clusters with the compound they produce, and our approach has implications for accurate functional prediction of the many other type II polyketide clusters present in bacterial genomes.

  16. Life comparative analysis of energy consumption and CO₂ emissions of different building structural frame types.

    PubMed

    Kim, Sangyong; Moon, Joon-Ho; Shin, Yoonseok; Kim, Gwang-Hee; Seo, Deok-Seok

    2013-01-01

    The objective of this research is to quantitatively measure and compare the environmental load and construction cost of different structural frame types. Construction cost also accounts for the costs of CO₂ emissions of input materials. The choice of structural frame type is a major consideration in construction, as this element represents about 33% of total building construction costs. In this research, four constructed buildings were analyzed, with these having either reinforced concrete (RC) or steel (S) structures. An input-output framework analysis was used to measure energy consumption and CO₂ emissions of input materials for each structural frame type. In addition, the CO₂ emissions cost was measured using the trading price of CO₂ emissions on the International Commodity Exchange. This research revealed that both energy consumption and CO₂ emissions were, on average, 26% lower with the RC structure than with the S structure, and the construction costs (including the CO₂ emissions cost) of the RC structure were about 9.8% lower, compared to the S structure. This research provides insights through which the construction industry will be able to respond to the carbon market, which is expected to continue to grow in the future.

  17. Life Comparative Analysis of Energy Consumption and CO2 Emissions of Different Building Structural Frame Types

    PubMed Central

    Kim, Sangyong; Moon, Joon-Ho; Shin, Yoonseok; Kim, Gwang-Hee; Seo, Deok-Seok

    2013-01-01

    The objective of this research is to quantitatively measure and compare the environmental load and construction cost of different structural frame types. Construction cost also accounts for the costs of CO2 emissions of input materials. The choice of structural frame type is a major consideration in construction, as this element represents about 33% of total building construction costs. In this research, four constructed buildings were analyzed, with these having either reinforced concrete (RC) or steel (S) structures. An input-output framework analysis was used to measure energy consumption and CO2 emissions of input materials for each structural frame type. In addition, the CO2 emissions cost was measured using the trading price of CO2 emissions on the International Commodity Exchange. This research revealed that both energy consumption and CO2 emissions were, on average, 26% lower with the RC structure than with the S structure, and the construction costs (including the CO2 emissions cost) of the RC structure were about 9.8% lower, compared to the S structure. This research provides insights through which the construction industry will be able to respond to the carbon market, which is expected to continue to grow in the future. PMID:24227998

  18. Bioinformatic prediction of the epitopes of Echinococcus granulosus antigen 5

    PubMed Central

    Pan, Wei; Chen, De-Sheng; Lu, Yun-Juan; Sun, Fen-Fen; Xu, Hui-Wen; Zhang, Ya-Wen; Yan, Chao; Fu, Lin-Lin; Zheng, Kui-Yang; Tang, Ren-Xian

    2017-01-01

    The aim of the present study was to predict and analyze the secondary structure, and B and T cell epitopes of Echinococcus granulosus antigen 5 (Ag5) using online software in order to investigate its immunogenicity and preliminarily evaluate its potential as an effective antigen peptide vaccine for cystic echinococcosis. The ProtParam program was used to analyze molecular weight, the theoretical isoelectric point, instability index and other physicochemical properties. The secondary structure of the Ag5 protein was predicted using the Self-Optimized Prediction Method with Alignment and the tertiary structure of the Ag5 protein was predicted using 3DLigandSite together with Center for Biological Sequence Analysis Prediction Servers. Furthermore, the Immune Epitope Database software was used to predict B cell epitopes, and T cell epitopes were predicted with the BioInformatics and Molecular Analysis Section and SYFPEITHI programs. The results demonstrated that α-helices, β-turns, random coils and extended strands account for 23.35, 10.95, 41.32, and 24.38% of the secondary structure of the Ag5 protein, respectively. Ten potential B cell epitopes of Ag5 were identified as the amino acid sequences 27–39, 70–80, 117–130, 146–168, 250–262, 284–293, 339–349, 359–371, 403–412 and 454–462, and seven potential T cell epitopes were identified as the amino acid sequences 52–60, 57–65, 182–190, 231–239, 273–281, 318–326 and 467–475. Thus, ten B cell epitopes and seven T cell epitopes were identified on Ag5, suggesting the strong immunogenicity of this protein, which could be applied to design antigen peptide vaccines for echinococcosis.

  19. A comparative overview of modal testing and system identification for control of structures

    NASA Technical Reports Server (NTRS)

    Juang, J.-N.; Pappa, R. S.

    1988-01-01

    A comparative overview is presented of the disciplines of modal testing used in structural engineering and system identification used in control theory. A list of representative references from both areas is given, and the basic methods are described briefly. Recent progress on the interaction of modal testing and control disciplines is discussed. It is concluded that combined efforts of researchers in both disciplines are required for unification of modal testing and system identification methods for control of flexible structures.

  20. Proteomic and bioinformatic analyses of spinal cord injury-induced skeletal muscle atrophy in rats

    PubMed Central

    WEI, ZHI-JIAN; ZHOU, XIAN-HU; FAN, BAO-YOU; LIN, WEI; REN, YI-MING; FENG, SHI-QING

    2016-01-01

    Spinal cord injury (SCI) may result in skeletal muscle atrophy. Identifying diagnostic biomarkers and effective targets for treatment is an important challenge in clinical work. The aim of the present study was to elucidate potential biomarkers and therapeutic targets for SCI-induced muscle atrophy (SIMA) using proteomic and bioinformatic analyses. Protein samples from rat soleus muscle were collected at different time points following SCI, separated by two-dimensional gel electrophoresis and compared with the sham group. The identities of these protein spots were analyzed by mass spectrometry (MS). MS demonstrated that 20 proteins associated with muscle atrophy were differentially expressed. Bioinformatic analyses indicated that SIMA changed the expression of proteins associated with cellular, developmental, immune system and metabolic processes, biological adhesion and localization. The results of the present study may be beneficial in understanding the molecular mechanisms of SIMA and elucidating potential biomarkers and targets for the treatment of muscle atrophy. PMID:27177391

  1. Optimizing selection of microsatellite loci from 454 pyrosequencing via post-sequencing bioinformatic analyses.

    PubMed

    Fernandez-Silva, Iria; Toonen, Robert J

    2013-01-01

    The comparatively low cost of massively parallel sequencing technology, also known as next-generation sequencing (NGS), has transformed the isolation of microsatellite loci. The most common NGS approach consists of obtaining large amounts of sequence data from genomic DNA or enriched microsatellite libraries, which are then mined for microsatellite repeats using bioinformatics analyses. Here, we describe a bioinformatics approach to isolate microsatellite loci, starting from the raw sequence data and ending with a subset of microsatellite primer pairs. The primary difference from previously published approaches is the inclusion of analyses to select the most accurate sequence data and to eliminate repetitive elements prior to the design of primers. These analyses aim to minimize the testing of primer pairs by identifying the most promising microsatellite loci.
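
    The repeat-mining step mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `find_microsatellites` and the default thresholds (motifs of 2 to 6 bases, at least 5 copies) are assumptions chosen for illustration, and only perfect tandem repeats are detected.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, motif, copies) for perfect tandem repeats in a DNA string.

    Hypothetical helper for illustration; thresholds are assumed defaults.
    """
    seq = seq.upper()
    # Lazy motif capture followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip motifs that are themselves repeats of a shorter motif
        # (for example "AA"), so homopolymer runs are not reported here.
        if (motif + motif).find(motif, 1) != len(motif):
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


example = "GGT" + "AC" * 6 + "TTG" + "A" * 12 + "CC" + "GAT" * 5 + "C"
print(find_microsatellites(example))  # [(3, 'AC', 6), (32, 'GAT', 5)]
```

    In a real workflow the candidate loci would still need quality filtering and flanking-sequence checks before primer design, as the abstract emphasizes.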

  2. Bioinformatic Characterization of Glycyl Radical Enzyme-Associated Bacterial Microcompartments

    PubMed Central

    Zarzycki, Jan; Erbilgin, Onur

    2015-01-01

    Bacterial microcompartments (BMCs) are proteinaceous organelles encapsulating enzymes that catalyze sequential reactions of metabolic pathways. BMCs are phylogenetically widespread; however, only a few BMCs have been experimentally characterized. Among them are the carboxysomes and the propanediol- and ethanolamine-utilizing microcompartments, which play diverse metabolic and ecological roles. The substrate of a BMC is defined by its signature enzyme. In catabolic BMCs, this enzyme typically generates an aldehyde. Recently, it was shown that the most prevalent signature enzymes encoded by BMC loci are glycyl radical enzymes, yet little is known about the function of these BMCs. Here we characterize the glycyl radical enzyme-associated microcompartment (GRM) loci using a combination of bioinformatic analyses and active-site and structural modeling to show that the GRMs comprise five subtypes. We predict distinct functions for the GRMs, including the degradation of choline, propanediol, and fuculose phosphate. This is the first family of BMCs for which identification of the signature enzyme is insufficient for predicting function. The distinct GRM functions are also reflected in differences in shell composition and apparently different assembly pathways. The GRMs are the counterparts of the vitamin B12-dependent propanediol- and ethanolamine-utilizing BMCs, which are frequently associated with virulence. This study provides a comprehensive foundation for experimental investigations of the diverse roles of GRMs. Understanding this plasticity of function within a single BMC family, including characterization of differences in permeability and assembly, can inform approaches to BMC bioengineering and the design of therapeutics. PMID:26407889

  3. Structural Complexity of DNA Sequence

    PubMed Central

    Liou, Cheng-Yuan; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from statistical methods. Furthermore, it is compared with a topological entropy-based method to assess the consistency of, and differences between, the complexity results. PMID:23662161

  4. Integrating bioinformatics into senior high school: design principles and implications.

    PubMed

    Machluf, Yossy; Yarden, Anat

    2013-09-01

    Bioinformatics is an integral part of modern life sciences. It has revolutionized and redefined how research is carried out and has had an enormous impact on biotechnology, medicine, agriculture and related areas. Yet, it is only rarely integrated into high school teaching and learning programs, playing almost no role in preparing the next generation of information-oriented citizens. Here, we describe the design principles of bioinformatics learning environments, including our own, that are aimed at introducing bioinformatics into senior high school curricula through engaging learners in scientifically authentic inquiry activities. We discuss the bioinformatics-related benefits and challenges that high school teachers and students face in the course of the implementation process, in light of previous studies and our own experience. Based on these lessons, we present a new approach for characterizing the questions embedded in bioinformatics teaching and learning units, based on three criteria: the type of domain-specific knowledge required to answer each question (declarative knowledge, procedural knowledge, strategic knowledge, situational knowledge), the scientific approach from which each question stems (biological, bioinformatics, a combination of the two) and the associated cognitive process dimension (remember, understand, apply, analyze, evaluate, create). We demonstrate the feasibility of this approach using a learning environment, which we developed for the high school level, and suggest some of its implications. This review sheds light on unique and critical characteristics related to broader integration of bioinformatics in secondary education, which are also relevant to the undergraduate level, and especially on curriculum design, development of suitable learning environments and teaching and learning processes.

  5. Gender Differences in Structured Risk Assessment: Comparing the Accuracy of Five Instruments

    ERIC Educational Resources Information Center

    Coid, Jeremy; Yang, Min; Ullrich, Simone; Zhang, Tianqiang; Sizmur, Steve; Roberts, Colin; Farrington, David P.; Rogers, Robert D.

    2009-01-01

    Structured risk assessment should guide clinical risk management, but it is uncertain which instrument has the highest predictive accuracy among men and women. In the present study, the authors compared the Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991, 2003); the Historical, Clinical, Risk Management-20 (HCR-20; C. D. Webster, K. S.…

  6. Belief Structure and Foreign Policy: Comparing Dimensions of Elite and Mass Opinion.

    ERIC Educational Resources Information Center

    Oldendick, Robert, And Others

    1981-01-01

    This study compares the dimensions of elite (political leaders) and mass (general public) opinions towards foreign policy. A factor analysis revealed the elite's five major concerns (internationalism, international organization, Americanism, defense spending, and interventionism) and an overall preference for structured approaches. The masses also…

  7. Estimating, Testing, and Comparing Specific Effects in Structural Equation Models: The Phantom Model Approach

    ERIC Educational Resources Information Center

    Macho, Siegfried; Ledermann, Thomas

    2011-01-01

    The phantom model approach for estimating, testing, and comparing specific effects within structural equation models (SEMs) is presented. The rationale underlying this novel method consists in representing the specific effect to be assessed as a total effect within a separate latent variable model, the phantom model that is added to the main…

  8. Associational Structure and Community Development: A Comparative Study of Two Communities

    ERIC Educational Resources Information Center

    Dasgupta, Satadal

    1974-01-01

    The two communities compared tended to support the proposition that communities following an integrative style of development are characterized by coordinative structures including associational, while the contrary is true for communities following the autonomous style. Available from: Editorial and Business Offices, Piazza Cavalieri di Malta, 2,…

  9. Comparing Religious Education in Canadian and Australian Catholic High Schools: Identifying Some Key Structural Issues

    ERIC Educational Resources Information Center

    Rymarz, Richard

    2013-01-01

    Religious education (RE) in Catholic high schools in Australia and Canada is compared by examining some of the underlying structural factors that shape the delivery of RE. It is argued that in Canadian Catholic schools RE is diminished by three factors that distinguish it from the Australian experience. These are: the level and history of…

  10. Linguistic Structure and Non-linguistic Cognition: English and Russian Blues Compared.

    ERIC Educational Resources Information Center

    Laws, Glynis; And Others

    1995-01-01

    Investigates the influence of linguistic structure on non-linguistic cognition by comparing Russian and English behavior on tasks involving the color blue. Russians, who differentiate this region into "dark blue" and "light blue," were expected to separate blues more often than English subjects for whom the colors belong to one lexical category.…

  11. The Company That Words Keep: Comparing the Statistical Structure of Child- versus Adult-Directed Language

    ERIC Educational Resources Information Center

    Hills, Thomas

    2013-01-01

    Does child-directed language differ from adult-directed language in ways that might facilitate word learning? Associative structure (the probability that a word appears with its free associates), contextual diversity, word repetitions and frequency were compared longitudinally across six language corpora, with four corpora of language directed at…

  12. Molecular docking of Glycine max and Medicago truncatula ureases with urea; bioinformatics approaches.

    PubMed

    Filiz, Ertugrul; Vatansever, Recep; Ozyigit, Ibrahim Ilker

    2016-03-01

    Urease (EC 3.5.1.5) is a nickel-dependent metalloenzyme catalyzing the hydrolysis of urea into ammonia and carbon dioxide. It is present in many bacteria, fungi, yeasts and plants. Most species, with few exceptions, use the nickel metalloenzyme urease to hydrolyze urea, which is one of the most commonly used nitrogen fertilizers in plant growth; its enzymatic hydrolysis is therefore of vital importance in agricultural practices. Considering the essentiality and importance of urea and urease activity in most plants, this study aimed to comparatively investigate the ureases of two important legume species from the Fabaceae family, Glycine max (soybean) and Medicago truncatula (barrel medic). Together with additional plant species, the primary and secondary structures of 37 plant ureases were comparatively analyzed using various bioinformatics tools. A structure-based phylogeny was constructed using predicted 3D models of G. max and M. truncatula, whose crystallographic structures are not available, along with three additional solved urease structures from Canavalia ensiformis (PDB: 4GY7), Bacillus pasteurii (PDB: 4UBP) and Klebsiella aerogenes (PDB: 1FWJ). In addition, urease structures of these species were docked with urea to analyze the binding affinities, interacting amino acids and atom distances in urease-urea complexes. Furthermore, mutable amino acids which could potentially affect the protein active site, stability and flexibility as well as overall protein stability were analyzed in urease structures of G. max and M. truncatula. Plant ureases demonstrated similar physico-chemical properties, with 833-878 amino acid residues, a molecular weight of 89.39-90.91 kDa and a mainly acidic (5.15-6.10 pI) nature. Four protein domain structures, urease gamma, urease beta, urease alpha and amidohydro 1, characterized the plant ureases. The secondary structure of plant ureases also demonstrated a conserved protein architecture, with predominantly α-helix and random coil structures. In

  13. Comparative studies on solar cell structures using zinc phthalocyanine and fullerenes

    NASA Astrophysics Data System (ADS)

    Egginger, M.; Koeppe, R.; Meghdadi, F.; Troshin, P. A.; Lyubovskaya, R. N.; Meissner, D.; Sariciftci, N. S.

    2006-04-01

    We compare different structures of organic solar cells based on zinc phthalocyanine (ZnPc) and fullerene derivatives as electron donor and acceptor materials, respectively. Bilayer devices are fabricated and characterized by current-voltage and spectrally resolved photocurrent measurements. In a novel approach, the ZnPc was combined with soluble fullerene derivatives. With a pyrrolidinofullerene bearing chelating pyridyl groups we observed a complexation between donor and acceptor molecules. Due to a favorable structuring of the donor-acceptor interface this leads to a significant enhancement of the solar cell performance compared to similar devices where no complexation takes place. Coevaporated bulk heterojunction mixed layers are introduced between the pristine layers. In these optimized structures short circuit currents up to 13 mA/cm² are observed. We investigate the voltage dependence of the spectrally resolved photocurrent of ZnPc/Buckminsterfullerene bilayer solar cells and interpret the results in terms of the Gärtner model.

  14. Comparative Reliability of Structured Versus Unstructured Interviews in the Admission Process of a Residency Program

    PubMed Central

    Blouin, Danielle; Day, Andrew G.; Pavlov, Andrey

    2011-01-01

    Background: Although never directly compared, structured interviews are reported as being more reliable than unstructured interviews. This study compared the reliability of both types of interview when applied to a common pool of applicants for positions in an emergency medicine residency program. Methods: In 2008, one structured interview was added to the two unstructured interviews traditionally used in our resident selection process. A formal job analysis using the critical incident technique guided the development of the structured interview tool. This tool consisted of 7 scenarios assessing 4 of the domains deemed essential for success as a resident in this program. The traditional interview tool assessed 5 general criteria. In addition to these criteria, the unstructured panel members were asked to rate each candidate on the same 4 essential domains rated by the structured panel members. All 3 panels interviewed all candidates. Main outcomes were the overall, interitem, and interrater reliabilities, the correlations between interview panels, and the dimensionality of each interview tool. Results: Thirty candidates were interviewed. The overall reliability reached 0.43 for the structured interview, and 0.81 and 0.71 for the unstructured interviews. Analyses of the variance components showed a high interrater, low interitem reliability for the structured interview, and a high interrater, high interitem reliability for the unstructured interviews. The summary measures from the 2 unstructured interviews were significantly correlated, but neither was correlated with the structured interview. Only the structured interview was multidimensional. Conclusions: A structured interview did not yield a higher overall reliability than both unstructured interviews. The lower reliability is explained by a lower interitem reliability, which in turn is due to the multidimensionality of the interview tool. Both unstructured panels consistently rated a single dimension, even when

  15. Function and structure of inherently disordered proteins.

    PubMed

    Dunker, A Keith; Silman, Israel; Uversky, Vladimir N; Sussman, Joel L

    2008-12-01

    The application of bioinformatics methodologies to proteins inherently lacking 3D structure has brought increased attention to these macromolecules. Here topics concerning these proteins are discussed, including their prediction from amino acid sequence, their enrichment in eukaryotes compared to prokaryotes, their more rapid evolution compared to structured proteins, their organization into specific groups, their structural preferences, their half-lives in cells, their contributions to signaling diversity (via high contents of multiple-partner binding sites, post-translational modifications, and alternative splicing), their distinct functional repertoire compared to that of structured proteins, and their involvement in diseases.

  16. Comparative evaluation of structured oil systems: Shellac oleogel, HPMC oleogel, and HIPE gel.

    PubMed

    Patel, Ashok R; Dewettinck, Koen

    2015-11-01

    In lipid-based food products, fat crystals are used as building blocks for creating a crystalline network that can trap liquid oil into a 3D gel-like structure which in turn is responsible for the desirable mouth feel and texture properties of the food products. However, the recent ban on the use of trans-fat in the US, coupled with the increasing concerns about the negative health effects of saturated fat consumption, has resulted in an increased interest in the area of identifying alternative ways of structuring edible oils using non-fat-based building blocks. In this paper, we give a brief account of three alternative approaches where oil structuring was carried out using wax crystals (shellac), polymer strands (hydrophilic cellulose derivative), and emulsion droplets as structurants. These building blocks resulted in three different types of oleogels that showed distinct rheological properties and temperature functionalities. The three approaches are compared in terms of the preparation process (ease of processing), properties of the formed systems (microstructure, rheological gel strength, temperature response, effect of water incorporation, and thixotropic recovery), functionality, and associated limitations of the structured systems. The comparative evaluation is made such that the new researchers starting their work in the area of oil structuring can use this discussion as a general guideline.

  17. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    PubMed

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  18. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics.

    PubMed

    de Matos, Paula; Cham, Jennifer A; Cao, Hong; Alcántara, Rafael; Rowland, Francis; Lopez, Rodrigo; Steinbeck, Christoph

    2013-03-20

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users' requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, 'canvas sort' card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience.

  19. Statistical Power of Alternative Structural Models for Comparative Effectiveness Research: Advantages of Modeling Unreliability

    PubMed Central

    Iordache, Eugen; Dierker, Lisa; Fifield, Judith; Schensul, Jean J.; Suggs, Suzanne; Barbour, Russell

    2015-01-01

    The advantages of modeling the unreliability of outcomes when evaluating the comparative effectiveness of health interventions are illustrated. Adding an action-research intervention component to a regular summer job program for youth was expected to help in preventing risk behaviors. A series of simple two-group alternative structural equation models are compared to test the effect of the intervention on one key attitudinal outcome in terms of model fit and statistical power with Monte Carlo simulations. Some models presuming parameters equal across the intervention and comparison groups were underpowered to detect the intervention effect, yet modeling the unreliability of the outcome measure increased their statistical power and helped in the detection of the hypothesized effect. Comparative Effectiveness Research (CER) could benefit from flexible multi-group alternative structural models organized in decision trees, and modeling unreliability of measures can be of tremendous help for both the fit of statistical models to the data and their statistical power. PMID:26640421
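
    As a rough illustration of why an unreliable outcome measure costs statistical power, the following Monte Carlo sketch compares two groups under different reliabilities. It is a deliberately simplified stand-in, not the structural equation models of the study: the function name `simulated_power`, the sample size, the effect size and the normal-approximation test with a 1.96 cutoff are all assumptions made for illustration.

```python
import random
import statistics


def simulated_power(effect, reliability, n=50, sims=500, seed=1):
    """Share of simulated two-group studies that reject the null hypothesis.

    True scores have variance 1; measurement error is scaled so that
    reliability = true variance / observed variance (assumed definition).
    """
    rng = random.Random(seed)
    error_sd = ((1 - reliability) / reliability) ** 0.5
    rejections = 0
    for _ in range(sims):
        control = [rng.gauss(0, 1) + rng.gauss(0, error_sd) for _ in range(n)]
        treated = [rng.gauss(effect, 1) + rng.gauss(0, error_sd) for _ in range(n)]
        # Standard error of the difference in means (unequal variances allowed).
        se = (statistics.variance(control) / n + statistics.variance(treated) / n) ** 0.5
        z = (statistics.fmean(treated) - statistics.fmean(control)) / se
        if abs(z) > 1.96:  # two-sided test, normal approximation
            rejections += 1
    return rejections / sims


print(simulated_power(effect=0.5, reliability=1.0))
print(simulated_power(effect=0.5, reliability=0.5))
```

    With the same true effect, the run with reliability 0.5 should reject less often than the run with reliability 1.0, which is the attenuation that explicitly modeling unreliability is meant to recover.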

  20. Structural and vibrational spectroscopic analysis of anticancer drug mitotane using DFT method; a comparative study of its parent structure

    NASA Astrophysics Data System (ADS)

    Mariappan, G.; Sundaraganesan, N.

    2015-04-01

    A comprehensive screening of the density functional theoretical approach to structural analysis is presented in this section. DFT calculations at the B3LYP/6-311++G(d,p) level of theory were found to yield results that are very comparable to experimental IR and Raman spectra. Computed geometrical parameters and harmonic vibrational wavenumbers of the fundamentals were found to be in satisfactory agreement with the experimental data and with those of its parent structure. The vibrational assignments of the normal modes were performed on the basis of the potential energy distribution (PED) calculations. The comparative results for mitotane and its parent structure dichlorodiphenyldichloroethane (DDD) indicate that the intramolecular nonbonding interaction (C1–H19⋯Cl18) in the ortho position, calculated at 2.583 Å, and the position of the substitution red-shift the vibrational wavenumber by 47 cm⁻¹. In addition, natural bond orbital (NBO) analysis has been performed for analyzing charge delocalization throughout the molecule. Stability of the molecule arising from hyperconjugative interactions leading to its bioactivity and charge delocalization has been analyzed. 13C and 1H nuclear magnetic resonance chemical shifts of the molecule have been calculated using the gauge independent atomic orbital (GIAO) method and compared with published results.

  1. Structures, properties, and functions of the stings of honey bees and paper wasps: a comparative study

    PubMed Central

    Zhao, Zi-Long; Zhao, Hong-Ping; Ma, Guo-Jun; Wu, Cheng-Wei; Yang, Kai; Feng, Xi-Qiao

    2015-01-01

    Through natural selection, many animal organs with similar functions have evolved different macroscopic morphologies and microscopic structures. Here, we comparatively investigate the structures, properties and functions of honey bee stings and paper wasp stings. Their elegant structures were systematically observed. To examine how they penetrate different materials, we performed penetration–extraction tests and slow motion analyses of their insertion process. The barbed stings of honey bees are relatively difficult to withdraw from fibrous tissues (e.g. skin), while the removal of paper wasp stings is easier due to their different structures and insertion skills. The similarities and differences of the two kinds of stings are summarized on the basis of the experiments and observations. PMID:26002929

  2. Vignettes: diverse library staff offering diverse bioinformatics services*

    PubMed Central

    Osterbur, David L.; Alpi, Kristine; Canevari, Catharine; Corley, Pamela M.; Devare, Medha; Gaedeke, Nicola; Jacobs, Donna K.; Kirlew, Peter; Ohles, Janet A.; Vaughan, K.T.L.; Wang, Lili; Wu, Yongchun; Geer, Renata C.

    2006-01-01

    Objectives: The paper gives examples of the bioinformatics services provided in a variety of different libraries by librarians with a broad range of educational background and training. Methods: Two investigators sent an email inquiry to attendees of the “National Center for Biotechnology Information's (NCBI) Introduction to Molecular Biology Information Resources” or “NCBI Advanced Workshop for Bioinformatics Information Specialists (NAWBIS)” courses. The thirty-five-item questionnaire addressed areas such as educational background, library setting, types and numbers of users served, and bioinformatics training and support services provided. Answers were compiled into program vignettes. Discussion: The bioinformatics support services addressed in the paper are based in libraries with academic and clinical settings. Services have been established through different means: in collaboration with biology faculty as part of formal courses, through teaching workshops in the library, through one-on-one consultations, and by other methods. Librarians with backgrounds from art history to doctoral degrees in genetics have worked to establish these programs. Conclusion: Successful bioinformatics support programs can be established in libraries in a variety of different settings and by staff with a variety of different backgrounds and approaches. PMID:16888664

  3. Robust High-dimensional Bioinformatics Data Streams Mining by ODR-ioVFDT

    PubMed Central

    Wang, Dantong; Fong, Simon; Wong, Raymond K.; Mohammed, Sabah; Fiaidhi, Jinan; Wong, Kelvin K. L.

    2017-01-01

    Outlier detection in bioinformatics data stream mining has received significant attention from research communities in recent years. How to distinguish noise from an exception, and whether to discard it or to devise an extra decision path to accommodate it, poses a dilemma. In this paper, we propose a novel algorithm called ODR with incrementally Optimized Very Fast Decision Tree (ODR-ioVFDT) for handling outliers during continuous data learning. A tolerance threshold is set using an adaptive interquartile-range-based identification method and is then used to judge whether a data point with an exceptional value should be included for training. This differs from traditional outlier detection/removal approaches, in which detection and removal are two separate steps in processing the data. The proposed algorithm is tested on datasets from five bioinformatics scenarios, comparing the performance of our model with that of other models without ODR. The results show that ODR-ioVFDT performs better in classification accuracy, kappa statistics, and time consumption. Applying ODR-ioVFDT to bioinformatics streaming data processing for detecting and quantifying information on the life phenomena, states, characters, variables and components of an organism can help to diagnose and treat disease more effectively. PMID:28230161
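
    The idea of an interquartile-range tolerance threshold applied to a stream can be sketched as follows. This is a generic sliding-window IQR filter, not the published ODR-ioVFDT algorithm: the class name `IQRStreamFilter`, the window size, the warm-up length and the 1.5 multiplier are assumptions for illustration, and no decision tree is trained here.

```python
from collections import deque
import statistics


class IQRStreamFilter:
    """Flag stream values outside [Q1 - k*IQR, Q3 + k*IQR] of a recent window."""

    def __init__(self, window=50, k=1.5, warmup=8):
        self.values = deque(maxlen=window)  # recent accepted values only
        self.k = k
        self.warmup = warmup  # values to collect before any flagging

    def is_outlier(self, x):
        flagged = False
        if len(self.values) >= self.warmup:
            q1, _, q3 = statistics.quantiles(self.values, n=4)
            iqr = q3 - q1
            flagged = x < q1 - self.k * iqr or x > q3 + self.k * iqr
        if not flagged:
            # Accepted values update the window, so the threshold adapts.
            self.values.append(x)
        return flagged


stream = [10, 11, 9, 10, 12, 10, 11, 9, 100, 10, 11]
detector = IQRStreamFilter()
flags = [detector.is_outlier(x) for x in stream]
print(flags)  # only the value 100 is flagged
```

    Values that pass the check would be handed to the incremental learner; flagged values are withheld, which merges detection and removal into a single pass over the stream.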

  4. Pipeliner: software to evaluate the performance of bioinformatics pipelines for next-generation resequencing.

    PubMed

    Nevado, B; Perez-Enciso, M

    2015-01-01

    The choice of technology and bioinformatics approach is critical in obtaining accurate and reliable information from next-generation sequencing (NGS) experiments. An increasing number of software tools and methodological guidelines are being published, but deciding upon which approach and experimental design to use can depend on the particularities of the species and on the aims of the study. This leaves researchers unable to make informed decisions on these central questions. To address these issues, we developed pipeliner - a tool to evaluate, by simulation, the performance of NGS pipelines in resequencing studies. Pipeliner provides a graphical interface allowing the users to write and test their own bioinformatics pipelines with publicly available or custom software. It computes a number of statistics summarizing the performance in SNP calling, including the recovery, sensitivity and false discovery rate for heterozygous and homozygous SNP genotypes. Pipeliner can be used to answer many practical questions, for example, for a limited amount of NGS effort, how many more reliable SNPs can be detected by doubling coverage and halving sample size or what is the false discovery rate provided by different SNP calling algorithms and options. Pipeliner thus allows researchers to carefully plan their study's sampling design and compare the suitability of alternative bioinformatics approaches for their specific study systems. Pipeliner is written in C++ and is freely available from http://github.com/brunonevado/Pipeliner.
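
    Two of the summary statistics named above, sensitivity and false discovery rate, can be computed from simulated truth and pipeline calls as in the sketch below. This is not Pipeliner's code or interface: the function name `snp_calling_metrics` and the representation of SNPs as sets of positions are assumptions, and the textbook definitions used here may differ in detail from the tool's own.

```python
def snp_calling_metrics(true_snps, called_snps):
    """Compare called SNP positions with the simulated truth.

    sensitivity = TP / (TP + FN); false discovery rate = FP / (TP + FP).
    """
    true_snps, called_snps = set(true_snps), set(called_snps)
    tp = len(true_snps & called_snps)   # true SNPs that were called
    fp = len(called_snps - true_snps)   # calls with no true SNP behind them
    fn = len(true_snps - called_snps)   # true SNPs that were missed
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    fdr = fp / (tp + fp) if tp + fp else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "sensitivity": sensitivity, "fdr": fdr}


truth = {101, 250, 377, 512}   # positions of simulated SNPs
calls = {101, 250, 600}        # positions reported by a pipeline
print(snp_calling_metrics(truth, calls))
```

    For the toy input, two of four true SNPs are recovered (sensitivity 0.5) and one of three calls is false (false discovery rate of about 0.33). Running the same comparison across coverage levels or callers gives the kind of design comparison the abstract describes.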

  5. A comparative study of Whi5 and retinoblastoma proteins: from sequence and structure analysis to intracellular networks

    PubMed Central

    Hasan, Md Mehedi; Brocca, Stefania; Sacco, Elena; Spinelli, Michela; Papaleo, Elena; Lambrughi, Matteo; Alberghina, Lilia; Vanoni, Marco

    2014-01-01

    Cell growth and proliferation require a complex series of tightly regulated and well-orchestrated events. Accordingly, proteins governing such events are evolutionarily conserved, even among distant organisms. By contrast, the case of “core functions” exerted by functionally analogous proteins that are not homologous and do not share any kind of structural similarity is more unusual. This is the case of the proteins regulating the G1/S transition in higher eukaryotes (the retinoblastoma tumor suppressor, Rb) and in budding yeast (Whi5). The interaction landscape of Rb and Whi5 is quite large, with more than one hundred proteins interacting either genetically or physically with each protein. The Whi5 interactome has been used to construct a concept map of Whi5 function and regulation. Comparison of physical and genetic interactors of Rb and Whi5 highlights a significant core of conserved, common functionalities associated with the interactors, indicating that the structure and function of the network, rather than individual proteins, are conserved during evolution. A combined bioinformatics and biochemical approach has shown that the whole Whi5 protein is highly disordered, except for a small region containing the protein family signature. The comparison with Whi5 homologs from Saccharomycetales has prompted the hypothesis of a modular organization of structural disorder, with the most evolutionarily conserved regions alternating with highly variable ones. The finding of a consensus sequence points to the conservation of a specific phosphorylation rhythm along with two disordered sequence motifs, probably acting as phosphorylation-dependent seeds in Whi5 folding/unfolding. Thus, the widely disordered Whi5 appears to act as a hierarchical “date hub” that has evolutionarily assayed an original way of modular organization before being supplanted by the globular, multi-domain structured Rb, more suitable to cover the role of a “party hub”. PMID

  6. Comparative genomic analysis of equilibrative nucleoside transporters suggests conserved protein structure despite limited sequence identity.

    PubMed

    Sankar, Narendra; Machado, Jerry; Abdulla, Parween; Hilliker, Arthur J; Coe, Imogen R

    2002-10-15

    Equilibrative nucleoside transporters (ENTs) are a recently characterized and poorly understood group of membrane proteins that are important in the uptake of endogenous nucleosides required for nucleic acid and nucleoside triphosphate synthesis. Despite their central importance in cellular metabolism and nucleoside analog chemotherapy, no human ENT gene has been described and nothing is known about gene structure and function. To gain insight into the ENT gene family, we used experimental and in silico comparative genomic approaches to identify ENT genes in three evolutionarily diverse organisms with completely (or almost completely) sequenced genomes, Homo sapiens, Caenorhabditis elegans and Drosophila melanogaster. We describe the chromosomal location, the predicted ENT gene structure and putative structural topologies of predicted ENT proteins derived from the open reading frames. Despite variations in genomic layout and limited ortholog protein sequence identity (≤27.45%), predicted topologies of ENT proteins are strikingly similar, suggesting an evolutionary conservation of a prototypic structure. In addition, a similar distribution of protein domains on exons is apparent in all three taxa. These data demonstrate that comparative sequence analyses should be combined with other approaches (such as genomic and proteomic analyses) to fully understand structure, function and evolution of protein families.

  7. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis

    PubMed Central

    Noar, Roslyn D.; Daub, Margaret E.

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters, including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that they may encode

  8. Bioinformatics projects supporting life-sciences learning in high schools.

    PubMed

    Marques, Isabel; Almeida, Paulo; Alves, Renato; Dias, Maria João; Godinho, Ana; Pereira-Leal, José B

    2014-01-01

    The interdisciplinary nature of bioinformatics makes it an ideal framework to develop activities enabling enquiry-based learning. We describe here the development and implementation of a pilot project to use bioinformatics-based research activities in high schools, called "Bioinformatics@school." It includes web-based research projects that students can pursue alone or under teacher supervision and a teacher training program. The project is organized so as to enable discussion of key results between students and teachers. After successful trials in two high schools, as measured by questionnaires, interviews, and assessment of knowledge acquisition, the project is expanding by the action of the teachers involved, who are helping us develop more content and are recruiting more teachers and schools.

  9. CryptoDB: a Cryptosporidium bioinformatics resource update.

    PubMed

    Heiges, Mark; Wang, Haiming; Robinson, Edward; Aurrecoechea, Cristina; Gao, Xin; Kaluskar, Nivedita; Rhodes, Philippa; Wang, Sammy; He, Cong-Zhou; Su, Yanqi; Miller, John; Kraemer, Eileen; Kissinger, Jessica C

    2006-01-01

    The database, CryptoDB (http://CryptoDB.org), is a community bioinformatics resource for the AIDS-related apicomplexan-parasite, Cryptosporidium. CryptoDB integrates whole genome sequence and annotation with expressed sequence tag and genome survey sequence data and provides supplemental bioinformatics analyses and data-mining tools. A simple, yet comprehensive web interface is available for mining and visualizing the data. CryptoDB is allied with the databases PlasmoDB and ToxoDB via ApiDB, an NIH/NIAID-funded Bioinformatics Resource Center. Recent updates to CryptoDB include the deposition of annotated genome sequences for Cryptosporidium parvum and Cryptosporidium hominis, migration to a relational database (GUS), a new query and visualization interface and the introduction of Web services.

  10. Building a bioinformatics community of practice through library education programs.

    PubMed

    Moore, Margaret E; Vaughan, K T L; Hayes, Barrie E

    2004-01-01

    This paper addresses the following questions: What makes the community of practice concept an intriguing framework for developing library services for bioinformatics? What is the campus context and setting? What has been the Health Sciences Library's role in bioinformatics at the University of North Carolina (UNC) Chapel Hill? What are the Health Sciences Library's goals? What services are currently offered? How will these services be evaluated and developed? How can libraries demonstrate their value? Providing library services for an emerging community such as bioinformatics and computational biology presents special challenges for libraries including understanding needs, defining and communicating the library's role, building relationships within the community, preparing staff, and securing funding. Like many academic health sciences libraries, the University of North Carolina (UNC) at Chapel Hill Health Sciences Library is addressing these challenges in the context of its overall mission and goals.

  11. Cellular automata and its applications in protein bioinformatics.

    PubMed

    Xiao, Xuan; Wang, Pu; Chou, Kuo-Chen

    2011-09-01

    With the explosion of protein sequences generated in the postgenomic era, it is highly desirable to develop high-throughput tools for rapidly and reliably identifying various attributes of uncharacterized proteins based on their sequence information alone. The knowledge thus obtained can help us make timely use of these newly found protein sequences for both basic research and drug discovery. Many bioinformatics tools have been developed by means of machine learning methods. This review is focused on the applications of a new kind of science (cellular automata) in protein bioinformatics. A cellular automaton (CA) is an open, flexible and discrete dynamic model that holds enormous potential in modeling complex systems, in spite of the simplicity of the model itself. Researchers, scientists and practitioners from different fields have utilized cellular automata for visualizing protein sequences, investigating their evolution processes, and predicting their various attributes. Owing to its impressive power, intuitiveness and relative simplicity, the CA approach has great potential for use as a tool for bioinformatics.
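
    To make the notion of a discrete dynamic model concrete, here is a minimal sketch of a generic one-dimensional, two-state cellular automaton using the standard Wolfram rule numbering. It is not the specific protein-sequence encoding used in the reviewed work, which the abstract does not describe; the periodic boundary, the choice of rule 90 and the starting row are assumptions for illustration.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    cells: list of 0/1 states; rule: integer 0..255 (Wolfram numbering).
    Each new state depends only on the cell and its two neighbours,
    with the row treated as a ring (periodic boundary, an assumption).
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        idx = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit number
        out.append((rule >> idx) & 1)              # look up that bit of the rule
    return out


def run(cells, rule, steps):
    """Return the initial row followed by `steps` successive generations."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(step(history[-1], rule))
    return history


start = [0, 0, 0, 1, 0, 0, 0]
history = run(start, rule=90, steps=2)
for row in history:
    print("".join("#" if c else "." for c in row))
```

    Stacking successive rows gives the two-dimensional image that makes such models useful for visualization: a short binary rule can produce rich patterns from a simple starting row.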

  12. Embracing the Future: Bioinformatics for High School Women

    NASA Astrophysics Data System (ADS)

    Zales, Charlotte Rappe; Cronin, Susan J.

    Sixteen high school women participated in a 5-week residential summer program designed to encourage female and minority students to choose careers in scientific fields. Students gained expertise in bioinformatics through problem-based learning in a complex learning environment of content instruction, speakers, labs, and trips. Innovative hands-on activities filled the program. Students learned biological principles in context and sophisticated bioinformatics tools for processing data. Students additionally mastered a variety of information-searching techniques. Students completed creative individual and group projects, demonstrating the successful integration of biology, information technology, and bioinformatics. Discussions with female scientists allowed students to see themselves in similar roles. Summer residential aspects fostered an atmosphere in which students matured in interacting with others and in their views of diversity.

  13. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    PubMed

    Attwood, Teresa K; Atwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  14. Drug Discovery and Structural Bioinformatics in Breast Cancer

    DTIC Science & Technology

    1999-12-01

    drug design and development. The biological focus of our research is on estrogen biosynthesis and on estrogen-induced gene expression in hormone-dependent breast cancer. Identification of critical small molecule-protein and protein-protein interactions during gene expression and signal transduction in the areas of steroidogenesis and estrogen-induced responses will result in new molecular targets for drug discovery and design for the treatment of hormone-dependent breast cancer. The Sabbatical Training Grant provided an enhancement of our research endeavors by

  15. The leader peptide of mutacin 1140 has distinct structural components compared to related class I lantibiotics.

    PubMed

    Escano, Jerome; Stauffer, Byron; Brennan, Jacob; Bullock, Monica; Smith, Leif

    2014-12-01

    Lantibiotics are ribosomally synthesized peptide antibiotics composed of an N-terminal leader peptide that promotes the core peptide's interaction with the post-translational modification (PTM) enzymes. Following PTMs, mutacin 1140 is transported out of the cell and the leader peptide is cleaved to yield the antibacterial peptide. Mutacin 1140 leader peptide is structurally unique compared to other class I lantibiotic leader peptides. Herein, we further our understanding of the structural differences of mutacin 1140 leader peptide with regard to other class I leader peptides. We have determined that the length of the leader peptide is important for the biosynthesis of mutacin 1140. We have also determined that mutacin 1140 leader peptide contains a novel four amino acid motif compared to related lantibiotics. PTM enzyme recognition of the leader peptide appears to be evolutionarily distinct from related class I lantibiotics. Our study on mutacin 1140 leader peptide provides a basis for future studies aimed at understanding its interaction with the PTM enzymes.

  16. Compare of the electronic structures of F- and Ir-doped SmFeAsO

    NASA Astrophysics Data System (ADS)

    Zhang, Y.; Cheng, C. H.; Chen, Y. L.; Cui, Y. J.; You, W. G.; Zhang, H.; Zhao, Y.

    2010-11-01

    The electronic structures of Fe-based superconductor SmFeAsO1-xFx and SmFe1-yIryAsO are compared through X-ray photoemission spectroscopy in this study. With fluorine or iridium doping, the electronic structure and chemical environment of the SmFeAsO system were changed. The fluorine was doped at an oxygen site which introduced electrons to a reservoir Sm-O layer. The iridium was doped at an Fe site which introduced electrons to a conduction Fe-As layer directly. In a parent material SmFeAsO, the magnetic ordering corresponding to Fe3d in the low-spin state is suppressed by both fluorine and iridium doping through suppressing the magnetism of 3d itinerant electrons. Compared to fluorine doping, iridium doping affected superconductivity more significantly due to an iridium-induced disorder in FeAs layers.

  17. Comparative internal structure of dorsal lips and radiolar appendages in Sabellidae (Polychaeta) and phylogenetic implications.

    PubMed

    Capa, María; Nogueira, João Miguel de Matos; Rossi, Maíra Cappellani Silva

    2011-03-01

    Fan worms (Sabellidae) possess paired modified prostomial structures at the base of the radiolar crown, dorso-lateral to the mouth, called dorsal lips. The dorsal lips are involved in the sorting of particles collected by the radiolar crown. The range of variation in the morphology of dorsal lips is extensive, and probably this is not only due to adaptations to different environments and feeding preferences but also due to phylogenetic constraints. In this study, we describe and compare the morphology of dorsal lips in a range of sabellid taxa based on histological cross-sections of these structures, and compare our data and terminology with those of previous studies. Dorsal lips are maintained erect in most taxa by a modified radiole fused to them known as dorsal radiolar appendage. We suggest that dorsal radiolar appendages with an internal supporting axis (cellular or acellular) and probably also the ventral lips are synapomorphies of the family.

  18. Comparative structural and energetic analysis of WW domain-peptide interactions.

    PubMed

    Schleinkofer, Karin; Wiedemann, Urs; Otte, Livia; Wang, Ting; Krause, Gerd; Oschkinat, Hartmut; Wade, Rebecca C

    2004-11-26

    WW domains are small globular protein interaction modules found in a wide spectrum of proteins. They recognize their target proteins by binding specifically to short linear peptide motifs that are often proline-rich. To infer the determinants of the ligand binding propensities of WW domains, we analyzed 42 WW domains. We built models of the 3D structures of the WW domains and their peptide complexes by comparative modeling supplemented with experimental data from peptide library screens. The models provide new insights into the orientation and position of the peptide in structures of WW domain-peptide complexes that have not yet been determined experimentally. From a protein interaction property similarity analysis (PIPSA) of the WW domain structures, we show that electrostatic potential is a distinguishing feature of WW domains and we propose a structure-based classification of WW domains that expands the existent ligand-based classification scheme. Application of the comparative molecular field analysis (CoMFA), GRID/GOLPE and comparative binding energy (COMBINE) analysis methods permitted the derivation of quantitative structure-activity relationships (QSARs) that aid in identifying the specificity-determining residues within WW domains and their ligand-recognition motifs. Using these QSARs, a new group-specific sequence feature of WW domains that target arginine-containing peptides was identified. Finally, the QSAR models were applied to the design of a peptide to bind with greater affinity than the known binding peptide sequences of the yRSP5-1 WW domain. The prediction was verified experimentally, providing validation of the QSAR models and demonstrating the possibility of rationally improving peptide affinity for WW domains. The QSAR models may also be applied to the prediction of the specificity of WW domains with uncharacterized ligand-binding properties.

  19. Agency and Structure as Determinants of Female Suicide Terrorism: A Comparative Study of Three Conflict Regions

    DTIC Science & Technology

    2009-12-01

    Agency and Structure as Determinants of Female Suicide Terrorism: A Comparative Study of Three Conflict Regions. Author: Matthew P. Dearing. Performing organization: Naval Postgraduate School, Monterey, CA 93943-5000. Sponsoring/monitoring agency: N/A.

  20. Naturally selecting solutions: the use of genetic algorithms in bioinformatics.

    PubMed

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2013-01-01

    For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems, ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques to solve a variety of biological problems. Among the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics based problems.
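
    The selection-driven search that this abstract refers to can be sketched with a toy genetic algorithm. The example below is a generic textbook construction, not taken from the review: it maximizes the number of 1s in a bit string (the "OneMax" toy problem), and the tournament selection, one-point crossover, elitism and all parameter values are assumptions chosen for brevity.

```python
import random


def fitness(bits):
    """Toy objective: the number of 1s in the bit string."""
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in pop)

    def pick(population):
        # Tournament selection: the fitter of two random individuals survives.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism: carry the current best forward unchanged.
        children = [max(pop, key=fitness)[:]]
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children

    return initial_best, max(pop, key=fitness)


initial_best, best = evolve()
print(initial_best, fitness(best), best)
```

    Because the best individual is always copied into the next generation, the best fitness can never decrease; real bioinformatics applications replace the toy objective with a problem-specific score such as an alignment or docking score.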

  1. Bioinformatic scaling of allosteric interactions in biomedical isozymes

    NASA Astrophysics Data System (ADS)

    Phillips, J. C.

    2016-09-01

    Allosteric (long-range) interactions can be surprisingly strong in proteins of biomedical interest. Here we use bioinformatic scaling to connect prior results on nonsteroidal anti-inflammatory drugs to promising new drugs that inhibit cancer cell metabolism. Many parallel features are apparent, which explain how even one amino acid mutation, remote from active sites, can alter medical results. The enzyme twins involved are cyclooxygenase (aspirin) and isocitrate dehydrogenase (IDH). The IDH results are accurate to 1% and are overdetermined by adjusting a single bioinformatic scaling parameter. It appears that the final stage in optimizing protein functionality may involve leveling of the hydrophobic limits of the arms of conformational hydrophilic hinges.

  2. Comparative Analysis of the Macroscale Structural Connectivity in the Macaque and Human Brain

    PubMed Central

    Bezgin, Gleb; Uylings, Harry B. M.; Roebroeck, Alard; Stiers, Peter

    2014-01-01

    The macaque brain serves as a model for the human brain, but its suitability is challenged by unique human features, including connectivity reconfigurations, which emerged during primate evolution. We perform a quantitative comparative analysis of the whole brain macroscale structural connectivity of the two species. Our findings suggest that the human and macaque brain as a whole are similarly wired. A region-wise analysis reveals many interspecies similarities of connectivity patterns, but also lack thereof, primarily involving cingulate regions. We unravel a common structural backbone in both species involving a highly overlapping set of regions. This structural backbone, important for mediating information across the brain, seems to constitute a feature of the primate brain persevering evolution. Our findings illustrate novel evolutionary aspects at the macroscale connectivity level and offer a quantitative translational bridge between macaque and human research. PMID:24676052

  3. Comparative study of vibrations in submonolayer structures of potassium on Pt(111).

    PubMed

    Rusina, G G; Eremeev, S V; Borisova, S D; Chulkov, E V

    2012-03-14

    We present results of a comparative study of the vibrational spectrum and local density of phonon states in ordered p(2 × 2) and (√3 × √3)R30° structures formed by potassium atoms on the Pt(111) surface. The calculations were performed with tight-binding interatomic interaction potentials. It was found that the mode associated with vertical displacements of K adatoms has an energy of about 20 meV in both K structures. The strength and energy of this mode decrease slightly with increasing coverage. This result is in good agreement with available experimental data. As in time-resolved second harmonic generation measurements, we observed low frequency modes for both structures considered, which are caused by the interaction of potassium with the second layer of the substrate.

  4. Comparative study of vibrations in submonolayer structures of potassium on Pt(111)

    NASA Astrophysics Data System (ADS)

    Rusina, G. G.; Eremeev, S. V.; Borisova, S. D.; Chulkov, E. V.

    2012-03-01

    We present results of a comparative study of the vibrational spectrum and local density of phonon states in ordered p(2 × 2) and (√3 × √3)R30° structures formed by potassium atoms on the Pt(111) surface. The calculations were performed with tight-binding interatomic interaction potentials. It was found that the mode associated with vertical displacements of K adatoms has an energy of about 20 meV in both K structures. The strength and energy of this mode decrease slightly with increasing coverage. This result is in good agreement with available experimental data. As in time-resolved second harmonic generation measurements, we observed low frequency modes for both structures considered, which are caused by the interaction of potassium with the second layer of the substrate.

  5. Comparative 3D genome structure analysis of the fission and the budding yeast.

    PubMed

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species.

  6. Incorporating bioinformatics into biological science education in Nigeria: prospects and challenges.

    PubMed

    Ojo, O O; Omabe, M

    2011-06-01

    The urgency to process and analyze the deluge of data created by proteomics and genomics studies worldwide has caused bioinformatics to gain prominence and importance. However, its multidisciplinary nature has created a unique demand for specialists trained in both biology and computing. Several countries, in response to this challenge, have developed a number of manpower training programmes. This review presents a description of the meaning, scope, history and development of bioinformatics with a focus on the prospects and challenges facing bioinformatics education worldwide. The paper also provides an overview of attempts at the introduction of bioinformatics in Nigeria, describes the existing bioinformatics scenario in Nigeria and suggests strategies for effective bioinformatics education in Nigeria.

  7. Cloning, expression, and bioinformatics analysis of the sheep CARP gene.

    PubMed

    Ma, Guoda; Wang, Haiyang; Li, You; Cui, Lili; Cui, Yudong; Li, Qingzhang; Li, Keshen; Zhao, Bin

    2013-06-01

    The cardiac ankyrin repeat protein (CARP) is a multifunctional protein that is expressed specifically in mammalian cardiac muscle and plays important roles in stress responses, transcriptional regulation, myofibrillar assembly, and the development of cardiac and skeletal muscle. In this study, the sheep homolog of the CARP gene was cloned and characterized. The coding region of the gene consists of 960 bp and encodes 319 amino acids with a molecular weight of 36.2 kDa. Bioinformatics analysis demonstrated that the 3' untranslated region (3'-UTR) of the gene contains many AU-rich elements that are associated with mRNA stability and a potential regulatory site for miRNA binding. The protein was predicted to contain 14 potential phosphorylation sites and an O-GlcNAc glycosylation site and to be expressed in both the nucleus and cytoplasm. The evolutionary analysis revealed that the sheep CARP exhibited a high level of homology with the mammalian counterparts; however, the protein exhibited an increased evolutionary distance from the chicken, frog, and fish homologs. RT-PCR revealed that in addition to its high mRNA expression level in cardiac muscle, trace amounts of the sheep CARP mRNA were expressed in the skeletal muscle, stomach, and small intestine. However, western blot analysis demonstrated that the CARP protein was expressed only in cardiac muscle. The coding sequence was cloned into the pET30a-TEV-LIC vector, and the soluble CARP-MBP (maltose-binding protein) fusion protein was expressed in a prokaryotic host and purified by affinity chromatography. Our data provide the basis for future studies of the structure and function of sheep CARP.

  8. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    PubMed Central

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that have been implicated in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five proline (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linking. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand the evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
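
    The two sequence signatures named in this abstract, serine followed by three to five prolines and Tyr-X-Tyr, lend themselves to a simple pattern search. The sketch below is only an illustration of that idea with regular expressions; the authors' actual identification pipeline and thresholds are not given in the abstract, and the function name and toy sequence are invented.

```python
import re

# Ser followed by three to five Pro (a longer proline run is matched up to five).
SP_REPEAT = re.compile(r"SP{3,5}")
# Tyr-X-Tyr, wrapped in a lookahead so overlapping motifs (e.g. YAYAY) are all found.
YXY = re.compile(r"(?=(Y.Y))")


def scan_ext_motifs(seq):
    """Return (start, motif) pairs for both signatures in a protein sequence."""
    sp_hits = [(m.start(), m.group()) for m in SP_REPEAT.finditer(seq)]
    yxy_hits = [(m.start(), m.group(1)) for m in YXY.finditer(seq)]
    return sp_hits, yxy_hits


# Toy sequence, invented for illustration only.
toy = "MASPPPPKYVYSPPPAASPPPPPYKY"
sp_hits, yxy_hits = scan_ext_motifs(toy)
print(sp_hits)
print(yxy_hits)
```

    A genome-scale screen would apply such patterns to every predicted protein and then filter by motif count; any cutoff for that step would be a further assumption, since the abstract does not state one.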

  9. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    SciTech Connect

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics into

  10. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    PubMed

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc.

  11. Comparing the Topological and Electrical Structure of the North American Electric Power Infrastructure

    NASA Astrophysics Data System (ADS)

    Cotilla-Sanchez, Eduardo; Hines, Paul D. H.; Barrows, Clayton; Blumsack, Seth

    2012-12-01

    The topological (graph) structure of complex networks often provides valuable information about the performance and vulnerability of the network. However, there are multiple ways to represent a given network as a graph. Electric power transmission and distribution networks have a topological structure that is straightforward to represent and analyze as a graph. However, simple graph models neglect the comprehensive connections between components that result from Ohm's and Kirchhoff's laws. This paper describes the structure of the three North American electric power interconnections, from the perspective of both topological and electrical connectivity. We compare the simple topology of these networks with that of random (Erdos and Renyi, 1959), preferential-attachment (Barabasi and Albert, 1999) and small-world (Watts and Strogatz, 1998) networks of equivalent sizes and find that power grids differ substantially from these abstract models in degree distribution, clustering, diameter and assortativity, and thus conclude that these topological forms may be misleading as models of power systems. To study the electrical connectivity of power systems, we propose a new method for representing electrical structure using electrical distances rather than geographic connections. Comparisons of these two representations of the North American power networks reveal notable differences between the electrical and topological structure of electric power networks.
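    The topological comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy edge list, and the use of a uniform random graph with a fixed number of edges as a stand-in for the random-graph baseline are all assumptions made for illustration. The sketch computes a degree histogram and the average local clustering coefficient, two of the metrics the abstract names.

```python
import random
from collections import Counter
from itertools import combinations


def build_adjacency(edges, nodes=()):
    """Build an undirected adjacency map; `nodes` keeps isolated nodes visible."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_histogram(adj):
    """Map each degree value to the number of nodes that have it."""
    return Counter(len(nbrs) for nbrs in adj.values())


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count how many pairs of neighbours are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def random_graph_edges(n, m, seed=0):
    """Hypothetical baseline: m distinct edges drawn uniformly among n nodes."""
    rng = random.Random(seed)
    return rng.sample(list(combinations(range(n), 2)), m)


# Toy network (assumed data): a triangle 0-1-2 with a pendant node 3.
toy = build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3)])
baseline = build_adjacency(random_graph_edges(4, 4, seed=1), nodes=range(4))
print(dict(degree_histogram(toy)), average_clustering(toy))
print(dict(degree_histogram(baseline)), average_clustering(baseline))
```

    Comparing the two printed summaries for a real grid and a size-matched random graph is the kind of check the abstract describes; electrical distance is a separate measure and is not covered by this sketch.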

  12. A bioinformatics search pipeline, RNA2DSearch, identifies RNA localization elements in Drosophila retrotransposons.

    PubMed

    Hamilton, Russell S; Hartswood, Eve; Vendra, Georgia; Jones, Cheryl; Van De Bor, Veronique; Finnegan, David; Davis, Ilan

    2009-02-01

    mRNA localization is a widespread mode of delivering proteins to their site of function. The embryonic axes in Drosophila are determined in the oocyte, through Dynein-dependent transport of gurken/TGF-alpha mRNA, containing a small localization signal that assigns its destination. A signal with a similar secondary structure, but lacking significant sequence similarity, is present in the I factor retrotransposon mRNA, also transported by Dynein. It is currently unclear whether other mRNAs exist that are localized to the same site using similar signals. Moreover, searches for other genes containing similar elements have not been possible due to a lack of suitable bioinformatics methods for searches of secondary structure elements and the difficulty of experimentally testing all the possible candidates. We have developed a bioinformatics approach for searching across the genome for small RNA elements that are similar to the secondary structures of particular localization signals. We have uncovered 48 candidates, of which we were able to test 22 for their localization potential using injection assays for Dynein mediated RNA localization. We found that G2 and Jockey transposons each contain a gurken/I factor-like RNA stem-loop required for Dynein-dependent localization to the anterior and dorso-anterior corner of the oocyte. We conclude that I factor, G2, and Jockey are members of a "family" of transposable elements sharing a gurken-like mRNA localization signal and Dynein-dependent mechanism of transport. The bioinformatics pipeline we have developed will have broader utility in fields where small RNA signals play important roles.

  13. Comparative evaluation of structured oil systems: Shellac oleogel, HPMC oleogel, and HIPE gel

    PubMed Central

    Patel, Ashok R; Dewettinck, Koen

    2015-01-01

    In lipid-based food products, fat crystals are used as building blocks for creating a crystalline network that can trap liquid oil into a 3D gel-like structure which in turn is responsible for the desirable mouth feel and texture properties of the food products. However, the recent ban on the use of trans-fat in the US, coupled with the increasing concerns about the negative health effects of saturated fat consumption, has resulted in an increased interest in the area of identifying alternative ways of structuring edible oils using non-fat-based building blocks. In this paper, we give a brief account of three alternative approaches where oil structuring was carried out using wax crystals (shellac), polymer strands (hydrophilic cellulose derivative), and emulsion droplets as structurants. These building blocks resulted in three different types of oleogels that showed distinct rheological properties and temperature functionalities. The three approaches are compared in terms of the preparation process (ease of processing), properties of the formed systems (microstructure, rheological gel strength, temperature response, effect of water incorporation, and thixotropic recovery), functionality, and associated limitations of the structured systems. The comparative evaluation is made such that new researchers starting their work in the area of oil structuring can use this discussion as a general guideline. Practical applications: Various aspects of oil binding for three different building blocks were studied in this work. The practical significance of this study includes (i) information on the preparation process and the concentrations of structuring agents required for efficient gelation and (ii) information on the behavior of oleogels in response to temperature, applied shear, and presence of water. This information can be very useful for selecting the type of structuring agents keeping the final applications in mind. For detailed information on the actual edible applications

  14. ModBase, a database of annotated comparative protein structure models and associated resources

    PubMed Central

    Pieper, Ursula; Webb, Benjamin M.; Dong, Guang Qiang; Schneidman-Duhovny, Dina; Fan, Hao; Kim, Seung Joong; Khuri, Natalia; Spill, Yannick G.; Weinkam, Patrick; Hammel, Michal; Tainer, John A.; Nilges, Michael; Sali, Andrej

    2014-01-01

    ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains almost 30 million reliable models for domains in 4.7 million unique protein sequences. ModBase allows users to compute or update comparative models on demand, through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the AllosMod server for modeling ligand-induced protein dynamics (http://salilab.org/allosmod), the AllosMod-FoXS server for predicting a structural ensemble that fits an SAXS profile (http://salilab.org/allosmod-foxs), the FoXSDock server for protein–protein docking filtered by an SAXS profile (http://salilab.org/foxsdock), the SAXS Merge server for automatic merging of SAXS profiles (http://salilab.org/saxsmerge) and the Pose & Rank server for scoring protein–ligand complexes (http://salilab.org/poseandrank). In this update, we also highlight two applications of ModBase: a PSI:Biology initiative to maximize the structural coverage of the human alpha-helical transmembrane proteome and a determination of structural determinants of human immunodeficiency virus-1 protease specificity. PMID:24271400

  15. Solvation shell structure of cyclooctylpyranone in water solvent and its comparative structure, dynamics and dipole moment in HIV protease.

    PubMed

    Arul Murugan, N; Chandra Jha, Prakash; Agren, Hans

    2009-08-14

    We have investigated the solvation structure for cyclooctylpyranone (COP) in water solvent using force-field molecular dynamics (MD) and Car-Parrinello mixed quantum mechanics-molecular mechanics (CPMD) calculations. The MD calculations show that in water solvent COP can exist in two conformational states which differ with respect to the relative orientations of the three rings, namely phenyl, pyranone and cyclooctane. We report the existence of a strong orientational preference for the water molecules in the first solvation shell; this orientational preference disappears for solvent molecules beyond the first solvation shell. In order to investigate the confinement effect on the structure, dynamics, charge distribution and dipole moment of COP, we have carried out MD and CPMD calculations for COP within HIV type-1 protease (PR). Interestingly, we do not see any conformational transitions for COP within the protein cavity and it remains as a single conformer. We do see a remarkable effect of confinement on a few other torsional degrees of freedom, such as a gg to tg conformational shift for the propyl group of COP. However, the methyl group rotational dynamics remains similar in the water solvent and in the protein environment. Also, within the protein cavity, the COP molecule is more polarized than in water solvent. Static ab initio electronic structure calculations were performed on the COP molecule with varying torsional angle in order to investigate the angle dependence of the molecular volume and energy.

  16. An Analytic Comparison of Educational Systems: Overview of Purposes, Policies, Structures and Outcomes. Comparative Overview/Comparative Assessment.

    ERIC Educational Resources Information Center

    Hurn, Christopher J.; Burn, Barbara B.

    This comparative evaluation of the differing educational systems in North America, Europe, the USSR, and Japan examines the goals and values of these systems. It is pointed out that Americans value equality, practicality, and utility and that they are both individualistic and suspicious of government authority. Contrasts between these values and…

  17. Prediction and Analysis of Key Genes in Glioblastoma Based on Bioinformatics

    PubMed Central

    Long, Hao; Liang, Chaofeng; Zhang, Xi'an; Fang, Luxiong; Wang, Gang; Qi, Songtao

    2017-01-01

    Understanding the mechanisms of glioblastoma at the molecular and structural level is not only interesting for basic science but also valuable for biotechnological applications, such as clinical treatment. In the present study, bioinformatics analysis was performed to reveal and identify the key genes of glioblastoma multiforme (GBM). The results obtained in the present study signified the importance of some genes, such as COL3A1, FN1, and MMP9, for glioblastoma. Based on the selected genes, a prediction model was built, which achieved 94.4% prediction accuracy. These findings might provide more insights into the genetic basis of glioblastoma. PMID:28191466

  18. Comparability of a Three-Dimensional Structure in Biopharmaceuticals Using Spectroscopic Methods

    PubMed Central

    Abad-Javier, Mario E.; Romero-Díaz, Alexis J.; Villaseñor-Ortega, Francisco; Pérez, Néstor O.; Flores-Ortiz, Luis F.

    2014-01-01

    Protein structure depends on weak interactions and covalent bonds, like disulfide bridges, established according to the environmental conditions. Here, we present the validation of two spectroscopic methodologies for the measurement of free and unoxidized thiols, as an attribute of structural integrity, using 5,5′-dithionitrobenzoic acid (DTNB) and DyLight Maleimide (DLM) as derivatizing agents. These methods were used to compare Rituximab and Etanercept products from different manufacturers. Physicochemical comparability was demonstrated for Rituximab products as DTNB showed no statistical differences under native, denaturing, and denaturing-reducing conditions, with Student's t-test P values of 0.6233, 0.4022, and 0.1475, respectively. For Etanercept products, no statistical differences were observed under native (P = 0.0758) and denaturing conditions (P = 0.2450), while denaturing-reducing conditions revealed cysteine contents of 98% and 101%, relative to the theoretical value of 58, for the evaluated products from different Etanercept manufacturers. DLM supported equality between Rituximab products under native (P = 0.7499) and denaturing conditions (P = 0.8027), but showed statistical differences among Etanercept products under native conditions (P < 0.001). DLM suggested that Infinitam has fewer exposed thiols than Enbrel, although the DTNB method, circular dichroism (CD), fluorescence (TCSPC), and activity (TNFα neutralization) showed no differences. Overall, these data revealed the capabilities and drawbacks of each thiol quantification technique and their correlation with protein structure. PMID:24963443

  19. Comparability of a three-dimensional structure in biopharmaceuticals using spectroscopic methods.

    PubMed

    Pérez Medina Martínez, Víctor; Abad-Javier, Mario E; Romero-Díaz, Alexis J; Villaseñor-Ortega, Francisco; Pérez, Néstor O; Flores-Ortiz, Luis F; Medina-Rivero, Emilio

    2014-01-01

    Protein structure depends on weak interactions and covalent bonds, like disulfide bridges, established according to the environmental conditions. Here, we present the validation of two spectroscopic methodologies for the measurement of free and unoxidized thiols, as an attribute of structural integrity, using 5,5'-dithionitrobenzoic acid (DTNB) and DyLight Maleimide (DLM) as derivatizing agents. These methods were used to compare Rituximab and Etanercept products from different manufacturers. Physicochemical comparability was demonstrated for Rituximab products as DTNB showed no statistical differences under native, denaturing, and denaturing-reducing conditions, with Student's t-test P values of 0.6233, 0.4022, and 0.1475, respectively. For Etanercept products, no statistical differences were observed under native (P = 0.0758) and denaturing conditions (P = 0.2450), while denaturing-reducing conditions revealed cysteine contents of 98% and 101%, relative to the theoretical value of 58, for the evaluated products from different Etanercept manufacturers. DLM supported equality between Rituximab products under native (P = 0.7499) and denaturing conditions (P = 0.8027), but showed statistical differences among Etanercept products under native conditions (P < 0.001). DLM suggested that Infinitam has fewer exposed thiols than Enbrel, although the DTNB method, circular dichroism (CD), fluorescence (TCSPC), and activity (TNFα neutralization) showed no differences. Overall, these data revealed the capabilities and drawbacks of each thiol quantification technique and their correlation with protein structure.

  20. Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations

    PubMed Central

    Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W.

    2016-01-01

    In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by, e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a certain technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures. PMID:26904094
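    The k-mer use case mentioned in this abstract can be sketched in Python to show why such an operation lends itself to parallelization. This is an illustrative assumption, not the paper's implementation (which concerns OpenMP, PluTo-SICA, PPCG, and OpenACC): the function names and the chunking scheme below are hypothetical. The sequence is split into chunks that overlap by k - 1 characters, each chunk is counted independently, and the partial counts are merged.

```python
from collections import Counter
from functools import partial


def count_kmers(seq, k):
    """Serial baseline: count every length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def split_with_overlap(seq, k, n_chunks):
    """Split seq so that each k-mer start position falls in exactly one chunk.

    Chunks overlap by k - 1 characters, so no k-mer is lost at a boundary
    and none is counted twice.
    """
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return []
    step = -(-n_kmers // n_chunks)  # ceiling division
    return [seq[start:start + step + k - 1] for start in range(0, n_kmers, step)]


def count_kmers_chunked(seq, k, n_chunks=4, mapper=map):
    """Count k-mers chunk by chunk and merge the partial results.

    `mapper` defaults to the built-in map (serial); any map-like callable
    with the same signature could be substituted.
    """
    total = Counter()
    for part in mapper(partial(count_kmers, k=k), split_with_overlap(seq, k, n_chunks)):
        total.update(part)
    return total


sequence = "ACGTACGTAC"  # toy input, assumed for illustration
print(count_kmers(sequence, 3) == count_kmers_chunked(sequence, 3, n_chunks=3))
```

    Passing the `map` method of a `concurrent.futures` executor as `mapper` would distribute the chunks across workers. Whether that pays off depends on input size, much as the abstract notes that gains appear only at a certain problem size.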

  1. Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations.

    PubMed

    Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W

    2016-01-01

    In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing where parallelization techniques are gaining in importance due to the pressing issue of large sized datasets generated by, e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a certain technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures.

  2. Hidden in the Middle: Culture, Value and Reward in Bioinformatics

    ERIC Educational Resources Information Center

    Lewis, Jamie; Bartlett, Andrew; Atkinson, Paul

    2016-01-01

    Bioinformatics--the so-called shotgun marriage between biology and computer science--is an interdiscipline. Despite interdisciplinarity being seen as a virtue, for having the capacity to solve complex problems and foster innovation, it has the potential to place projects and people in anomalous categories. For example, valorised…

  3. Bioinformatic approaches to interrogating vitamin D receptor signaling.

    PubMed

    Campbell, Moray J

    2017-03-10

    Bioinformatics applies unbiased approaches to develop statistically-robust insight into health and disease. At the global, or "20,000 foot", view, bioinformatic analyses of vitamin D receptor (NR1I1/VDR) signaling can measure where the VDR gene or protein exerts a genome-wide significant impact on biology; VDR is significantly implicated in bone biology and immune systems, but not in cancer. With a more VDR-centric, or "2000 foot", view, bioinformatic approaches can interrogate events downstream of VDR activity. Integrative approaches can combine VDR ChIP-Seq in cell systems where significant volumes of public data are available. For example, VDR ChIP-Seq studies can be combined with genome-wide association studies to reveal significant associations to immune phenotypes. Similarly, VDR ChIP-Seq can be combined with data from The Cancer Genome Atlas (TCGA) to infer the impact of VDR target genes in cancer progression. Therefore, bioinformatic approaches can reveal what aspects of VDR downstream networks are significantly related to disease or phenotype.

  4. An International Bioinformatics Infrastructure to Underpin the Arabidopsis Community

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The future bioinformatics needs of the Arabidopsis community as well as those of other scientific communities that depend on Arabidopsis resources were discussed at a pair of recent meetings held by the Multinational Arabidopsis Steering Committee (MASC) and the North American Arabidopsis Steering C...

  5. An evaluation of ontology exchange languages for bioinformatics.

    PubMed

    McEntire, R; Karp, P; Abernethy, N; Benton, D; Helt, G; DeJongh, M; Kent, R; Kosky, A; Lewis, S; Hodnett, D; Neumann, E; Olken, F; Pathak, D; Tarczy-Hornoch, P; Toldo, L; Topaloglou, T

    2000-01-01

    Ontologies are specifications of the concepts in a given field, and of the relationships among those concepts. The development of ontologies for molecular-biology information and the sharing of those ontologies within the bioinformatics community are central problems in bioinformatics. If the bioinformatics community is to share ontologies effectively, ontologies must be exchanged in a form that uses standardized syntax and semantics. This paper reports on an effort among the authors to evaluate alternative ontology-exchange languages, and to recommend one or more languages for use within the larger bioinformatics community. The study selected a set of candidate languages, and defined a set of capabilities that the ideal ontology-exchange language should satisfy. The study scored the languages according to the degree to which they satisfied each capability. In addition, the authors performed several ontology-exchange experiments with the two languages that received the highest scores: OML and Ontolingua. The result of those experiments, and the main conclusion of this study, was that the frame-based semantic model of Ontolingua is preferable to the conceptual graph model of OML, but that the XML-based syntax of OML is preferable to the Lisp-based syntax of Ontolingua.

  6. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    ERIC Educational Resources Information Center

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  7. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    ERIC Educational Resources Information Center

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  8. Incorporation of Bioinformatics Exercises into the Undergraduate Biochemistry Curriculum

    ERIC Educational Resources Information Center

    Feig, Andrew L.; Jabri, Evelyn

    2002-01-01

    The field of bioinformatics is developing faster than most biochemistry textbooks can adapt. Supplementing the undergraduate biochemistry curriculum with data-mining exercises is an ideal way to expose the students to the common databases and tools that take advantage of this vast repository of biochemical information. An integrated collection of…

  9. Bioinformatics Education—Perspectives and Challenges out of Africa

    PubMed Central

    Adebiyi, Ezekiel F.; Alzohairy, Ahmed M.; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J.; Panji, Sumir; Patterton, Hugh-G.

    2015-01-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  10. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    EPA Science Inventory

    A Bioinformatic Strategy to Rapidly Characterize cDNA Libraries

    G. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.
    1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  11. Pladipus Enables Universal Distributed Computing in Proteomics Bioinformatics.

    PubMed

    Verheggen, Kenneth; Maddelein, Davy; Hulstaert, Niels; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2016-03-04

    The use of proteomics bioinformatics substantially contributes to an improved understanding of proteomes, but this novel and in-depth knowledge comes at the cost of increased computational complexity. Parallelization across multiple computers, a strategy termed distributed computing, can be used to handle this increased complexity; however, setting up and maintaining a distributed computing infrastructure requires resources and skills that are not readily available to most research groups. Here we propose a free and open-source framework named Pladipus that greatly facilitates the establishment of distributed computing networks for proteomics bioinformatics tools. Pladipus is straightforward to install and operate thanks to its user-friendly graphical interface, allowing complex bioinformatics tasks to be run easily on a network instead of a single computer. As a result, any researcher can benefit from the increased computational efficiency provided by distributed computing, hence empowering them to tackle more complex bioinformatics challenges. Notably, it enables any research group to perform large-scale reprocessing of publicly available proteomics data, thus supporting the scientific community in mining these data for novel discoveries.

  12. Structural violence in long-term, residential care for older people: comparing Canada and Scandinavia.

    PubMed

    Banerjee, Albert; Daly, Tamara; Armstrong, Pat; Szebehely, Marta; Armstrong, Hugh; Lafrance, Stirling

    2012-02-01

    Canadian frontline careworkers are six times more likely to experience daily physical violence than their Scandinavian counterparts. This paper draws on a comparative survey of residential careworkers serving older people across three Canadian provinces (Manitoba, Nova Scotia, Ontario) and four countries that follow a Scandinavian model of social care (Denmark, Finland, Norway, Sweden) conducted between 2005 and 2006. Ninety percent of Canadian frontline careworkers experienced physical violence from residents or their relatives and 43 percent reported physical violence on a daily basis. Canadian focus groups conducted in 2007 reveal violence was often normalized as an inevitable part of elder-care. We use the concept of "structural violence" (Galtung, 1969) to raise questions about the role that systemic and organizational factors play in setting the context for violence. Structural violence refers to indirect forms of violence that are built into social structures and that prevent people from meeting their basic needs or fulfilling their potential. We applied the concept to long-term residential care and found that the poor quality of the working conditions and inadequate levels of support experienced by Canadian careworkers constitute a form of structural violence. Working conditions are detrimental to careworkers' physical and mental health, and prevent careworkers from providing the quality of care they are capable of providing and understand to be part of their job. These conditions may also contribute to the physical violence workers experience, and further investigation is warranted.

  13. Comparative metabolomics and structural characterizations illuminate colibactin pathway-dependent small molecules.

    PubMed

    Vizcaino, Maria I; Engel, Philipp; Trautman, Eric; Crawford, Jason M

    2014-07-02

    The gene cluster responsible for synthesis of the unknown molecule "colibactin" has been identified in mutualistic and pathogenic Escherichia coli. The pathway endows its producer with a long-term persistence phenotype in the human bowel, a probiotic activity used in the treatment of ulcerative colitis, and a carcinogenic activity under host inflammatory conditions. To date, functional small molecules from this pathway have not been reported. Here we implemented a comparative metabolomics and targeted structural network analyses approach to identify a catalog of small molecules dependent on the colibactin pathway from the meningitis isolate E. coli IHE3034 and the probiotic E. coli Nissle 1917. The structures of 10 pathway-dependent small molecules are proposed based on structural characterizations and network relationships. The network will provide a roadmap for the structural and functional elucidation of a variety of other small molecules encoded by the pathway. From the characterized small molecule set, in vitro bacterial growth inhibitory and mammalian CNS receptor antagonist activities are presented.

  14. [Casting faults and structural studies on bonded alloys comparing centrifugal castings and vacuum pressure castings].

    PubMed

    Fuchs, P; Küfmann, W

    1978-07-01

    The casting processes in use today, such as centrifugal casting and vacuum pressure casting, were compared with one another. An effort was made to answer the question of whether the occurrence of shrink cavities and the mean grain diameter of the alloy depend on the method of casting. 80 crowns were made by both processes from the baked alloys Degudent Universal, Degudent N and the trial alloy 4437 of the firm Degussa. Slice sections were examined for macro- and micro-porosity and the structural appearance was evaluated by linear analysis. Statistical analysis showed that casting faults and casting structure are independent of the method used and that their causes must be found in the conditions of casting and the composition of the alloy.

  15. Bioclipse: an open source workbench for chemo- and bioinformatics

    PubMed Central

    Spjuth, Ola; Helmus, Tobias; Willighagen, Egon L; Kuhn, Stefan; Eklund, Martin; Wagener, Johannes; Murray-Rust, Peter; Steinbeck, Christoph; Wikberg, Jarl ES

    2007-01-01

    Background There is a need for software applications that provide users with a complete and extensible toolkit for chemo- and bioinformatics accessible from a single workbench. Commercial packages are expensive and closed source, hence they do not allow end users to modify algorithms and add custom functionality. Existing open source projects are more focused on providing a framework for integrating existing, separately installed bioinformatics packages, rather than providing user-friendly interfaces. No open source chemoinformatics workbench has previously been published, and no successful attempts have been made to integrate chemo- and bioinformatics into a single framework. Results Bioclipse is an advanced workbench for resources in chemo- and bioinformatics, such as molecules, proteins, sequences, spectra, and scripts. It provides 2D-editing, 3D-visualization, file format conversion, calculation of chemical properties, and much more; all fully integrated into a user-friendly desktop application. Editing supports standard functions such as cut and paste, drag and drop, and undo/redo. Bioclipse is written in Java and based on the Eclipse Rich Client Platform with a state-of-the-art plugin architecture. This gives Bioclipse an advantage over other systems as it can easily be extended with functionality in any desired direction. Conclusion Bioclipse is a powerful workbench for bio- and chemoinformatics as well as an advanced integration platform. The rich functionality, intuitive user interface, and powerful plugin architecture make Bioclipse the most advanced and user-friendly open source workbench for chemo- and bioinformatics. Bioclipse is released under Eclipse Public License (EPL), an open source license which sets no constraints on external plugin licensing; it is totally open for both open source plugins as well as commercial ones. Bioclipse is freely available at . PMID:17316423

  16. Hidden in the Middle: Culture, Value and Reward in Bioinformatics.

    PubMed

    Lewis, Jamie; Bartlett, Andrew; Atkinson, Paul

    2016-01-01

    Bioinformatics - the so-called shotgun marriage between biology and computer science - is an interdiscipline. Despite interdisciplinarity being seen as a virtue, for having the capacity to solve complex problems and foster innovation, it has the potential to place projects and people in anomalous categories. For example, valorised 'outputs' in academia are often defined and rewarded by discipline. Bioinformatics, as an interdisciplinary bricolage, incorporates experts from various disciplinary cultures with their own distinct ways of working. Perceived problems of interdisciplinarity include difficulties of making explicit knowledge that is practical, theoretical, or cognitive. But successful interdisciplinary research also depends on an understanding of disciplinary cultures and value systems, often only tacitly understood by members of the communities in question. In bioinformatics, the 'parent' disciplines have different value systems; for example, what is considered worthwhile research by computer scientists can be thought of as trivial by biologists, and vice versa. This paper concentrates on the problems of reward and recognition described by scientists working in academic bioinformatics in the United Kingdom. We highlight problems that are a consequence of its cross-cultural make-up, recognising that the mismatches in knowledge in this borderland take place not just at the level of the practical, theoretical, or epistemological, but also at the cultural level too. The trend in big, interdisciplinary science is towards multiple authors on a single paper; in bioinformatics this has created hybrid or fractional scientists who find they are being positioned not just in-between established disciplines but also in-between as middle authors or, worse still, left off papers altogether.

  17. [Comparative analysis of the genetic structure of Red Polish cattle in Poland and the Ukraine].

    PubMed

    Oblap, R V; Zvezhkhovski, L; Ivanchenko, E V; Glazko, V I

    2002-01-01

    A comparative analysis was made of the genetic structure of two groups of Red Polish cattle, bred in Poland and Ukraine. Six molecular-genetic markers (kappa-casein, beta-lactoglobulin, leptin, myostatin, growth hormone, and pituitary-specific transcription factor Pit-I) were tested by PCR-RFLP. No significant differences between the considered intrabreed groups were found. High frequencies of some alleles (Csn kappa B, Blg B, and Gh L) related to important productivity traits were observed. Rare alleles were found in some genes. The results are evidence of the unique characteristics of the investigated breed.

  18. A comparative study of the inner ear structures of artiodactyls and early cetaceans

    SciTech Connect

    Klingshirn, M.A.; Luo, Z.

    1994-12-31

    It has been suggested that the order Cetacea (whales and porpoises) is closely related to artiodactyls, even-hoofed ungulate mammals such as the pig and cow. Paleontological and molecular data strongly support this concept of phylogenetic relationships. In a study of DNA sequences of two mitochondrial ribosomal gene segments of cetaceans, the artiodactyls were found to be most closely related to cetaceans. These well-accepted studies on the phylogenetic affinities of artiodactyls and cetaceans led us to conduct a comparative study of the bony structure of the inner ear of these two taxa.

  19. Comparing two iteration algorithms of Broyden electron density mixing through an atomic electronic structure computation

    NASA Astrophysics Data System (ADS)

    Man-Hong, Zhang

    2016-05-01

    By performing the electronic structure computation of a Si atom, we compare two iteration algorithms of Broyden electron density mixing in the literature. One was proposed by Johnson and implemented in the well-known VASP code. The other was given by Eyert. We solve the Kohn-Sham equation by using a conventional outward/inward integration of the differential equation and then connect the two parts of the solution at the classical turning points, which is different from the matrix eigenvalue method used in the VASP code. Compared to Johnson’s algorithm, the one proposed by Eyert needs fewer total iterations. Project supported by the National Natural Science Foundation of China (Grant No. 61176080).
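
    For readers unfamiliar with Broyden mixing, the sketch below may help. It is a generic textbook "good Broyden" inverse-Jacobian update, not Johnson's or Eyert's variant, and it is applied to a toy fixed-point map (`np.cos`) that stands in for the self-consistency map of an electronic structure code. The function name `broyden_mix`, the mixing parameter `alpha` and the tolerances are all assumptions made for illustration.

```python
import numpy as np

def broyden_mix(scf_map, x0, alpha=0.5, tol=1e-10, max_iter=100):
    """Solve x = scf_map(x) with Broyden's 'good' inverse-Jacobian update."""
    x = np.asarray(x0, dtype=float)
    residual = scf_map(x) - x            # F(x) = output density - input density
    G = -alpha * np.eye(x.size)          # first step is plain linear mixing
    for iteration in range(max_iter + 1):
        if np.linalg.norm(residual) < tol:
            return x, iteration
        x_new = x - G @ residual         # quasi-Newton step on F(x) = 0
        residual_new = scf_map(x_new) - x_new
        dx, dF = x_new - x, residual_new - residual
        GdF = G @ dF
        denom = dx @ GdF
        if denom != 0.0:
            # Sherman-Morrison form of the rank-one secant update.
            G += np.outer(dx - GdF, dx @ G) / denom
        x, residual = x_new, residual_new
    raise RuntimeError("mixing did not converge")

# Toy stand-in for a self-consistency loop: the fixed point of cos(x).
solution, n_iter = broyden_mix(np.cos, [0.0, 0.5, 1.0])
```

    With `G` frozen at its initial value the loop reduces to simple linear mixing, which makes it easy to compare iteration counts on the same toy map.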

  20. Comparing Treatment Policies with Assistance from the Structural Nested Mean Model

    PubMed Central

    Lu, Xi; Lynch, Kevin G.; Oslin, David W.; Murphy, Susan

    2015-01-01

    Summary Treatment policies, also known as dynamic treatment regimes, are sequences of decision rules that link the observed patient history with treatment recommendations. Multiple, plausible, treatment policies are frequently constructed by researchers using expert opinion, theories and reviews of the literature. Often these different policies represent competing approaches to managing an illness. Here we develop an “assisted estimator” that can be used to compare the mean outcome of competing treatment policies. The term “assisted” refers to the fact that estimators from the Structural Nested Mean Model, a parametric model for the causal effect of treatment at each time point, are used in the process of estimating the mean outcome. This work is motivated by our work on comparing the mean outcome of two competing treatment policies using data from the ExTENd study in alcohol dependence. PMID:26363892

  1. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    ERIC Educational Resources Information Center

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  2. Bioinformatics training: selecting an appropriate learning content management system--an example from the European Bioinformatics Institute.

    PubMed

    Wright, Victoria Ann; Vaughan, Brendan W; Laurent, Thomas; Lopez, Rodrigo; Brooksbank, Cath; Schneider, Maria Victoria

    2010-11-01

    Today's molecular life scientists are well educated in the emerging experimental tools of their trade, but when it comes to training on the myriad of resources and tools for dealing with biological data, a less ideal situation emerges. Often bioinformatics users receive no formal training on how to make the most of the bioinformatics resources and tools available in the public domain. The European Bioinformatics Institute, which is part of the European Molecular Biology Laboratory (EMBL-EBI), holds the world's most comprehensive collection of molecular data, and training the research community to exploit this information is embedded in the EBI's mission. We have evaluated eLearning, in parallel with face-to-face courses, as a means of training users of our data resources and tools. We anticipate that eLearning will become an increasingly important vehicle for delivering training to our growing user base, so we have undertaken an extensive review of Learning Content Management Systems (LCMSs). Here, we describe the process that we used, which considered the requirements of trainees, trainers and systems administrators, as well as taking into account our organizational values and needs. This review describes the literature survey, user discussions and scripted platform testing that we performed to narrow down our choice of platform from 36 to a single platform. We hope that it will serve as guidance for others who are seeking to incorporate eLearning into their bioinformatics training programmes.

  3. Comparative Analysis of Data Structures for Storing Massive TINs in a DBMS

    NASA Astrophysics Data System (ADS)

    Kumar, K.; Ledoux, H.; Stoter, J.

    2016-06-01

    Point cloud data are an important source for 3D geoinformation. Modern 3D data acquisition and processing techniques such as airborne laser scanning and multi-beam echosounding generate billions of 3D points for an area of just a few square kilometers. With the size of the point clouds exceeding the billion mark for even a small area, there is a need for their efficient storage and management. These point clouds are sometimes associated with attributes and constraints as well. Storing billions of 3D points is currently possible, as confirmed by the initial implementations in Oracle Spatial SDO PC and the PostgreSQL Point Cloud extension. But to be able to analyse and extract useful information from point clouds, we need more than just points, i.e. we require the surface defined by these points in space. There are different ways to represent surfaces in GIS, including grids, TINs, boundary representations, etc. In this study, we investigate the database solutions for the storage and management of massive TINs. The classical (face and edge based) and compact (star based) data structures are discussed at length with reference to their structure, advantages and limitations in handling massive triangulations and are compared with the current solution of PostGIS Simple Feature. The main test dataset is the TIN generated from the third national elevation model of the Netherlands (AHN3) with a point density of over 10 points/m². PostgreSQL/PostGIS DBMS is used for storing the generated TIN. The data structures are tested with the generated TIN models to account for their geometry, topology, storage, indexing, and loading time in a database. Our study is useful in identifying the limitations of the existing data structures for storing massive TINs and what is required to optimise these structures for managing massive triangulations in a database.
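
    The contrast between a face-based and a star-based representation can be sketched as follows. The abstract does not spell out either layout, so this assumes that "face based" means a list of counter-clockwise vertex-id triples and that "star based" means storing, for each vertex, the ordered ring of its neighbouring vertices. The function name `stars_from_faces` and the tiny fan-shaped test mesh are hypothetical.

```python
from collections import defaultdict

def stars_from_faces(triangles):
    """Convert CCW triangles (vertex-id triples) to per-vertex ordered neighbour rings."""
    successor = defaultdict(dict)
    for a, b, c in triangles:
        # Seen from each corner, the opposite edge runs CCW from one neighbour to the next.
        successor[a][b] = c
        successor[b][c] = a
        successor[c][a] = b
    stars = {}
    for vertex, nxt in successor.items():
        targets = set(nxt.values())
        # A boundary vertex has an open ring: start at the neighbour nothing points to.
        open_starts = [n for n in nxt if n not in targets]
        start = open_starts[0] if open_starts else min(nxt)
        ring, current = [start], nxt[start]
        while current != start and current in nxt:
            ring.append(current)
            current = nxt[current]
        if current != start:
            ring.append(current)         # last neighbour of an open ring
        stars[vertex] = ring
    return stars

# Face-based input: four triangles forming a fan around vertex 0.
faces = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
stars = stars_from_faces(faces)
```

    In this layout no triangle is stored explicitly: each face is implied by two consecutive entries in a vertex's ring, which is one reason such a structure can be more compact than a face table.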

  4. Structural and compositional changes in erythrocyte membrane of obese compared to normal-weight adolescents.

    PubMed

    Perona, Javier S; González-Jiménez, Emilio; Aguilar-Cordero, María J; Sureda, Antonio; Barceló, Francisca

    2013-12-01

    Unhealthy dietary habits are key determinants of obesity in adolescents. Assuming that dietary fat profile influences membrane lipid composition, the aim of this study was to analyze structural changes in the erythrocyte membrane of obese compared to normal-weight adolescents. The study was conducted in a group of 11 obese and 11 normal-weight adolescent subjects. The lipid profile, lipid peroxidation and acetylcholinesterase enzyme (AChE) activity were analyzed by conventional methods. The structural properties of reconstituted erythrocyte membrane were characterized by X-ray diffraction. Erythrocyte membrane from obese adolescents had a lipid profile characterized by a higher cholesterol/phospholipid ratio, an increase in saturated fatty acid and a decrease in monounsaturated and n-6 polyunsaturated fatty acid concentrations. Differences in lipid content were associated with changes in the structural properties of reconstituted membranes and the oxidative damage of erythrocyte membrane. The lower oxidative level shown in the obese group (0.15 ± 0.04 vs. 0.20 ± 0.06 nmol/mg for conjugated diene concentrations and 2.43 ± 0.25 vs. 2.83 ± 0.31 nmol/mg protein for malondialdehyde levels) was related to a lower unsaturation index. These changes in membrane structural properties were accompanied by a lower AChE activity (1.64 ± 0.13 vs. 1.91 ± 0.24 nmol AChE/[min mg protein]) in the obese group. The consequences of unhealthy dietary habits in adolescents are reflected in the membrane structural properties and may influence membrane-associated protein activities and functions.

  5. Design and Implementation of an Interdepartmental Bioinformatics Program across Life Science Curricula

    ERIC Educational Resources Information Center

    Miskowski, Jennifer A.; Howard, David R.; Abler, Michael L.; Grunwald, Sandra K.

    2007-01-01

    Over the past 10 years, there has been a technical revolution in the life sciences leading to the emergence of a new discipline called bioinformatics. In response, bioinformatics-related topics have been incorporated into various undergraduate courses along with the development of new courses solely focused on bioinformatics. This report describes…

  6. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    ERIC Educational Resources Information Center

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  7. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    ERIC Educational Resources Information Center

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  8. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    ERIC Educational Resources Information Center

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  9. Comparative proteomics reveal fundamental structural and functional differences between the two progeny phenotypes of a baculovirus.

    PubMed

    Hou, Dianhai; Zhang, Leike; Deng, Fei; Fang, Wei; Wang, Ranran; Liu, Xijia; Guo, Lin; Rayner, Simon; Chen, Xinwen; Wang, Hualin; Hu, Zhihong

    2013-01-01

    The replication of lepidopteran baculoviruses is characterized by the production of two progeny phenotypes: the occlusion-derived virus (ODV), which establishes infection in midgut cells, and the budded virus (BV), which disseminates infection to different tissues within a susceptible host. To understand the structural, and hence functional, differences between BV and ODV, we employed multiple proteomic methods to reveal the protein compositions and posttranslational modifications of the two phenotypes of Helicoverpa armigera nucleopolyhedrovirus. In addition, Western blotting and quantitative mass spectrometry were used to identify the localization of proteins in the envelope or nucleocapsid fractions. Comparative protein portfolios of BV and ODV showing the distribution of 54 proteins, encompassing the 21 proteins shared by BV and ODV, the 12 BV-specific proteins, and the 21 ODV-specific proteins, were obtained. Among the 11 ODV-specific envelope proteins, 8 either are essential for or contribute to oral infection. Twenty-three phosphorylated and 6 N-glycosylated viral proteins were also identified. While the proteins that are shared by the two phenotypes appear to be important for nucleocapsid assembly and trafficking, the structural and functional differences between the two phenotypes are evidently characterized by the envelope proteins and posttranslational modifications. This comparative proteomics study provides new insight into how BV and ODV are formed and why they function differently.

  10. Comparative study of local structure of two cyanobiphenyl liquid crystals by molecular dynamics method

    SciTech Connect

    Gerts, Egor D.; Komolkin, Andrei V.; Burmistrov, Vladimir A.; Alexandriysky, Victor V.; Dvinskikh, Sergey V.

    2014-08-21

    Fully-atomistic molecular dynamics simulations were carried out on two similar cyanobiphenyl nematogens, HO-6OCB and 7OCB, in order to study effects of hydrogen bonds on local structure of liquid crystals. Comparable length of these two molecules provides more evident results on the effects of hydrogen bonding. The analysis of radial and cylindrical distribution functions clearly shows the differences in local structure of two mesogens. The simulations showed that anti-parallel alignment is preferable for the HO-6OCB. Hydrogen bonds between OH-groups are observed for 51% of HO-6OCB molecules, while hydrogen bonding between CN- and OH-groups occurs only for 16% of molecules. The lifetimes of H-bonds differ due to different mobility of molecular fragments (50 ps for N⋅⋅⋅H–O and 41 ps for O⋅⋅⋅H–O). Although the standard Optimized Potentials for Liquid Simulations - All-Atom force field cannot reproduce some experimental parameters quantitatively (order parameters are overestimated, diffusion coefficients are not reproduced well), the comparison of relative simulated results for the pair of mesogens is nevertheless consistent with the same relative experimental parameters. Thus, the comparative study of simulated and experimental results for the pair of similar liquid crystals still can be assumed plausible.

  11. Physcomitrella HMGA-type proteins display structural differences compared to their higher plant counterparts

    SciTech Connect

    Lyngaard, Carina; Stemmer, Christian; Stensballe, Allan; Graf, Manuela; Gorr, Gilbert; Decker, Eva; Grasser, Klaus D.

    2008-10-03

    High mobility group (HMG) proteins of the HMGA family are chromatin-associated proteins that act as architectural factors in nucleoprotein structures involved in gene transcription. To date, HMGA-type proteins have been studied in various higher plant species, but not in lower plants. We have identified two HMGA-type proteins, HMGA1 and HMGA2, encoded in the genome of the moss model Physcomitrella patens. Compared to higher plant HMGA proteins, the two Physcomitrella proteins display some structural differences. Thus, the moss HMGA proteins have six (rather than four) AT-hook DNA-binding motifs and their N-terminal domain lacks similarity to linker histone H1. HMGA2 is expressed in moss protonema and it localises to the cell nucleus. Typical of HMGA proteins, HMGA2 interacts preferentially with A/T-rich DNA, when compared with G/C-rich DNA. In cotransformation assays in Physcomitrella protoplasts, HMGA2 stimulated reporter gene expression. In summary, our data show that functional HMGA-type proteins occur in Physcomitrella.

  12. Allelic genome structural variations in maize detected by array comparative genome hybridization.

    PubMed

    Beló, André; Beatty, Mary K; Hondred, David; Fengler, Kevin A; Li, Bailin; Rafalski, Antoni

    2010-01-01

    DNA polymorphisms such as insertion/deletions and duplications affecting genome segments larger than 1 kb are known as copy-number variations (CNVs) or structural variations (SVs). They have been recently studied in animals and humans by using array-comparative genome hybridization (aCGH), and have been associated with several human diseases. Their presence and phenotypic effects in plants have not been investigated on a genomic scale, although individual structural variations affecting traits have been described. We used aCGH to investigate the presence of CNVs in maize by comparing the genome of 13 maize inbred lines to B73. Analysis of hybridization signal ratios of 60,472 60-mer oligonucleotide probes between inbreds in relation to their location in the reference genome (B73) allowed us to identify clusters of probes that deviated from the ratio expected for equal copy-numbers. We found CNVs distributed along the maize genome in all chromosome arms. They occur with appreciable frequency in different germplasm subgroups, suggesting ancient origin. Validation of several CNV regions showed both insertion/deletions and copy-number differences. The nature of CNVs detected suggests CNVs might have a considerable impact on plant phenotypes, including disease response and heterosis.

  13. Gender differences in structured risk assessment: comparing the accuracy of five instruments.

    PubMed

    Coid, Jeremy; Yang, Min; Ullrich, Simone; Zhang, Tianqiang; Sizmur, Steve; Roberts, Colin; Farrington, David P; Rogers, Robert D

    2009-04-01

    Structured risk assessment should guide clinical risk management, but it is uncertain which instrument has the highest predictive accuracy among men and women. In the present study, the authors compared the Psychopathy Checklist-Revised (PCL-R; R. D. Hare, 1991, 2003); the Historical, Clinical, Risk Management-20 (HCR-20; C. D. Webster, K. S. Douglas, D. Eaves, & S. D. Hart, 1997); the Risk Matrix 2000-Violence (RM2000[V]; D. Thornton et al., 2003); the Violence Risk Appraisal Guide (VRAG; V. L. Quinsey, G. T. Harris, M. E. Rice, & C. A. Cormier, 1998); the Offenders Group Reconviction Scale (OGRS; J. B. Copas & P. Marshall, 1998; R. Taylor, 1999); and the total previous convictions among prisoners, prospectively assessed prerelease. The authors compared predischarge measures with subsequent offending and instruments ranked using multivariate regression. Most instruments demonstrated significant but moderate predictive ability. The OGRS ranked highest for violence among men, and the PCL-R and HCR-20 H subscale ranked highest for violence among women. The OGRS and total previous acquisitive convictions demonstrated greatest accuracy in predicting acquisitive offending among men and women. Actuarial instruments requiring no training to administer performed as well as personality assessment and structured risk assessment and were superior among men for violence.

  14. Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool

    PubMed Central

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N.; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. PMID:25742008
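
    The arithmetic behind the HapMap 3 figures can be retraced in a few lines. The sketch below uses only the counts quoted in the abstract (415 SNPs, 201 reported missing, 152 of those actually on the array, no false positives) and assumes that the remaining 49 SNPs were truly absent from the array. The reported 36.6% corresponds to 152 out of all 415 SNPs. The kappa computed here is derived from these counts alone and is not a value given in the abstract; variable names are illustrative.

```python
total_snps = 415
reported_missing = 201        # SNAP output with the HapMap 3 release 2 reference
wrongly_missing = 152         # reported missing but present in the Illumina file

reported_present = total_snps - reported_missing    # no false positives were found
both_absent = reported_missing - wrongly_missing    # assumed truly absent from the array

# Share of all SNPs that SNAP wrongly reported as missing.
false_negative_share = wrongly_missing / total_snps

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
observed = (reported_present + both_absent) / total_snps
truly_present = reported_present + wrongly_missing
truly_absent = both_absent
expected = (reported_present * truly_present
            + reported_missing * truly_absent) / total_snps ** 2
kappa = (observed - expected) / (1 - expected)

print(f"{false_negative_share:.1%}", round(kappa, 3))
```

    Note that kappa is a chance-corrected measure rather than a raw percentage of agreement, which is why it comes out much lower than the observed agreement.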

  15. Synthesis, structural and spectroscopic studies of two new benzimidazole derivatives: A comparative study

    NASA Astrophysics Data System (ADS)

    Saral, Hasan; Özdamar, Özgür; Uçar, İbrahim

    2017-02-01

    In the present work, structural and spectroscopic studies on 1-Methyl-2-(2′-hydroxy-4′-chlorophenyl)benzimidazole (1) and 1-Methyl-2-(2′-hydroxy-4′-methoxyphenyl)benzimidazole (2) have been carried out extensively by X-ray diffraction, HRMS, UV-Vis, FT-IR and 1H and 13C NMR spectroscopy. The crystal structure of both compounds is stabilized by O–H⋯N hydrogen bonds and π-π interactions. In contrast to compound 1, the skeleton of compound 2 deviates considerably from planarity, probably because of intermolecular hydrogen bonding. The experimental results were compared to the theoretical ones, obtained at the DFT level. Ground state geometry, electronic structure, vibrational and NMR spectra were calculated using the B3LYP functional with the 6-31G(d,p) basis set. The calculated bond distances and angles in both compounds were in good agreement with the experimental values. The energetic behavior of both compounds in methanol was examined using the time-dependent DFT (TD-DFT) method with the polarizable continuum model (PCM). Isotropic chemical shifts (13C and 1H NMR) were calculated using the gauge-invariant atomic orbital (GIAO) method. HOMO and LUMO analyses were used to elucidate charge transfer within the molecule.

  16. Comparative Evaluation for Brain Structural Connectivity Approaches: Towards Integrative Neuroinformatics Tool for Epilepsy Clinical Research

    PubMed Central

    Yang, Sheng; Tatsuoka, Curtis; Ghosh, Kaushik; Lacuey-Lecumberri, Nuria; Lhatoo, Samden D.; Sahoo, Satya S.

    2016-01-01

    Recent advances in brain fiber tractography algorithms and diffusion Magnetic Resonance Imaging (MRI) data collection techniques are providing new approaches to study brain white matter connectivity, which plays an important role in complex neurological disorders such as epilepsy. Epilepsy affects approximately 50 million persons worldwide and it is often described as a disorder of the cortical network organization. There is growing recognition of the need to better understand the role of brain structural networks in the onset and propagation of seizures in epilepsy using high resolution non-invasive imaging technologies. In this paper, we perform a comparative evaluation of two techniques to compute structural connectivity, namely probabilistic fiber tractography and statistics derived from fractional anisotropy (FA), using diffusion MRI data from a patient with a rare case of medically intractable insular epilepsy. The results of our evaluation demonstrate that probabilistic fiber tractography provides a more accurate map of structural connectivity and may help address inherent complexities of neural fiber layout in the brain, such as fiber crossings. This work provides an initial result towards building an integrative informatics tool for neuroscience that can be used to accurately characterize the role of fiber tract connectivity in neurological disorders such as epilepsy. PMID:27570685

  17. Comparing nonlinear texture measures for quantifying trabecular bone structures using surrogates

    NASA Astrophysics Data System (ADS)

    Rath, Christoph W.; Monetti, Roberto A.; Muller, Dirk; Bohm, Holger; Rummeny, Ernst J.; Link, Thomas M.

    2004-05-01

    We generalize the methods of constrained randomization in order to assess different nonlinear texture measures for the quantitative characterisation of trabecular bone structures as seen in high resolution MR images of the distal radius for patients with and without osteoporotic bone fractures. We demonstrate that it is feasible to produce surrogates which preserve texture measures sensitive to higher-order correlations. Specifically, we preserve for two-dimensional images the three Minkowski functionals (MF) which can be interpreted as the surface, the perimeter and the Euler characteristic of an excursion set. The surrogates preserving the MFs are generated by using simulated annealing techniques, where the constraints are specified in terms of a cost function which has a global minimum when the constraints are fulfilled. The cost function has to be minimized among all permutations of the image pixels. The surrogates and the original data are quantified by estimating their local scaling properties by means of the calculation of the spectrum of weighted scaling indices (WSI). It is shown that a significant discrimination between original and surrogate data is made possible by comparing the probability distributions of the weighted scaling indices. This proves that the two nonlinear texture measures (MF and WSI) are complementary since they are sensitive to different morphological aspects of the trabecular bone structures. It turns out that the generalized method of constrained randomization is a vital tool for assessing the quality of texture measures in terms of sensitivity to image structures and discrimination power.
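
    As a rough illustration of the three two-dimensional Minkowski functionals named above, the following sketch computes area (the "surface" in the abstract), perimeter and Euler characteristic for a binary excursion set. It is not the authors' implementation. It assumes each foreground pixel is a closed unit square, so diagonally touching pixels count as connected, and the function name minkowski_2d is hypothetical.

```python
def minkowski_2d(mask):
    """Return (area, perimeter, euler_characteristic) of a binary 2D image.

    Foreground pixels are treated as closed unit squares. The Euler
    characteristic is V - E + F over the union of those squares.
    """
    pixels = {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}
    vertices = set()
    edge_count = {}  # edge -> number of foreground pixels that share it
    for r, c in pixels:
        # Four corner vertices of the pixel.
        for dr in (0, 1):
            for dc in (0, 1):
                vertices.add((r + dr, c + dc))
        # Four edges: top, bottom (horizontal) and left, right (vertical).
        for edge in (("h", r, c), ("h", r + 1, c), ("v", r, c), ("v", r, c + 1)):
            edge_count[edge] = edge_count.get(edge, 0) + 1
    area = len(pixels)
    # An edge lies on the boundary when exactly one foreground pixel owns it.
    perimeter = sum(1 for k in edge_count.values() if k == 1)
    euler = len(vertices) - len(edge_count) + area
    return area, perimeter, euler


# A 3x3 ring: one connected component with one hole.
ring = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(minkowski_2d(ring))
```

    For a greyscale image, an excursion set would be obtained by thresholding first, for example `[[v >= t for v in row] for row in image]`, and sweeping the threshold t gives the functionals as curves.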

  18. Comparative study of particle structure evolution during water sorption: skim and whole milk powders.

    PubMed

    Murrieta-Pazos, I; Gaiani, C; Galet, L; Cuq, B; Desobry, S; Scher, J

    2011-10-01

    Surface composition of dairy powders significantly influences a number of functional properties such as rehydration, caking, and agglomeration. Nevertheless, the kinetics of water uptake by the powders have never been directly related to the structure and the composition of the surface. In this work, the effect of relative humidity on the structural reorganization of two types of dairy powder was studied. The water-powder interaction for industrial whole milk powder and skim milk powder was studied using dynamic vapor sorption. The water sorption isotherms were fitted with a Brunauer-Emmett-Teller model and each stage of the sorption curve was analyzed with a Fickian diffusion model. The water content in the monolayer predicted for each powder and the calculated moisture diffusivity were discussed and compared. Concurrently, powder microstructure and powder surface under variable relative humidity were assessed by X-ray photoelectron spectroscopy, scanning electron microscopy coupled with energy dispersive X-ray analysis, and atomic force microscopy. A correlation between the data obtained from the sorption isotherms and the modifications of structure allowed us to conclude that powder microstructure and the chemical state of the components could play an important role in determining the water diffusivity.

  19. Comparative Structural and Functional Analysis of Bunyavirus and Arenavirus Cap-Snatching Endonucleases

    PubMed Central

    Reguera, Juan; Gerlach, Piotr; Rosenthal, Maria; Gaudon, Stephanie; Coscia, Francesca; Günther, Stephan; Cusack, Stephen

    2016-01-01

    Segmented negative strand RNA viruses of the arena-, bunya- and orthomyxovirus families uniquely carry out viral mRNA transcription by the cap-snatching mechanism. This involves cleavage of host mRNAs close to their capped 5′ end by an endonuclease (EN) domain located in the N-terminal region of the viral polymerase. We present the structure of the cap-snatching EN of Hantaan virus, a bunyavirus belonging to the hantavirus genus. Hantaan EN has an active site configuration, including a metal co-ordinating histidine, and nuclease activity similar to the previously reported La Crosse virus and influenza virus ENs (orthobunyavirus and orthomyxovirus respectively), but is more active in cleaving a double stranded RNA substrate. In contrast, Lassa arenavirus EN has only acidic metal co-ordinating residues. We present three high resolution structures of Lassa virus EN with different bound ion configurations and show in comparative biophysical and biochemical experiments with Hantaan, La Crosse and influenza ENs that the isolated Lassa EN is essentially inactive. The results are discussed in the light of EN activation mechanisms revealed by recent structures of full-length influenza virus polymerase. PMID:27304209

  20. Thermodynamic and structural insights into nanocomposites engineering by comparing two materials assembly techniques for graphene.

    PubMed

    Zhu, Jian; Zhang, Huanan; Kotov, Nicholas A

    2013-06-25

    Materials assembled by layer-by-layer (LBL) assembly and vacuum-assisted flocculation (VAF) have similarities, but a systematic study of their comparative advantages and disadvantages is missing. Such a study is needed from both practical and fundamental perspectives aiming at a better understanding of structure-property relationships of nanocomposites and purposeful engineering of materials with unique properties. Layered composites from polyvinyl alcohol (PVA) and reduced graphene (RG) are made by both techniques. We comparatively evaluate their structure, mechanical, and electrical properties. LBL and VAF composites demonstrate clear differences at atomic and nanoscale structural levels but reveal similarities in micrometer and submicrometer organization. Epitaxial crystallization and suppression of phase transition temperatures are more pronounced for PVA in LBL than for VAF composites. Mechanical properties are virtually identical for both assemblies at high RG contents. We conclude that mechanical properties in layered RG assemblies are largely determined by the thermodynamic state of PVA at the polymer/nanosheet interface rather than the nanometer scale differences in RG packing. High and nearly identical values of toughness for LBL and VAF composites reaching 6.1 MJ/m³ observed for thermodynamically optimal composition confirm this conclusion. Their toughness is the highest among all other layered assemblies from RG, cellulose, clay, etc. Electrical conductivity, however, is more than 10× higher for LBL than for VAF composites for the same RG contents. Electrical properties are largely determined by the tunneling barrier between RG sheets and therefore strongly dependent on atomic/nanoscale organization. These findings open the door for application-oriented methods of materials engineering using both types of layered assemblies.

  1. The Roots of Bioinformatics in Theoretical Biology

    PubMed Central

    Hogeweg, Paulien

    2011-01-01

    From the late 1980s onward, the term “bioinformatics” mostly has been used to refer to computational methods for comparative analysis of genome data. However, the term was originally more widely defined as the study of informatic processes in biotic systems. In this essay, I will trace this early history (from a personal point of view) and I will argue that the original meaning of the term is re-emerging. PMID:21483479

  2. Productivity and salinity structuring of the microplankton revealed by comparative freshwater metagenomics.

    PubMed

    Eiler, Alexander; Zaremba-Niedzwiedzka, Katarzyna; Martínez-García, Manuel; McMahon, Katherine D; Stepanauskas, Ramunas; Andersson, Siv G E; Bertilsson, Stefan

    2014-09-01

    Little is known about the diversity and structuring of freshwater microbial communities beyond the patterns revealed by tracing their distribution in the landscape with common taxonomic markers such as the ribosomal RNA. To address this gap in knowledge, metagenomes from temperate lakes were compared to selected marine metagenomes. Taxonomic analyses of rRNA genes in these freshwater metagenomes confirm the previously reported dominance of a limited subset of uncultured lineages of freshwater bacteria, whereas Archaea were rare. Diversification into marine and freshwater microbial lineages was also reflected in phylogenies of functional genes, and there were also significant differences in functional beta-diversity. The pathways and functions that accounted for these differences are involved in osmoregulation, active transport, and carbohydrate and amino acid metabolism. Moreover, predicted genes orthologous to active transporters and recalcitrant organic matter degradation were more common in microbial genomes from oligotrophic versus eutrophic lakes. This comparative metagenomic analysis allowed us to formulate a general hypothesis that oceanic microorganisms, compared with freshwater-dwelling ones, invest more in metabolism of amino acids and that strategies of carbohydrate metabolism differ significantly between marine and freshwater microbial communities.

  3. Comparing standardized coefficients in structural equation modeling: a model reparameterization approach.

    PubMed

    Kwan, Joyce L Y; Chan, Wai

    2011-09-01

    We propose a two-stage method for comparing standardized coefficients in structural equation modeling (SEM). At stage 1, we transform the original model of interest into the standardized model by model reparameterization, so that the model parameters appearing in the standardized model are equivalent to the standardized parameters of the original model. At stage 2, we impose appropriate linear equality constraints on the standardized model and use a likelihood ratio test to make statistical inferences about the equality of standardized coefficients. Unlike other existing methods for comparing standardized coefficients, the proposed method does not require specific modeling features (e.g., specification of nonlinear constraints), which are available only in certain SEM software programs. Moreover, this method allows researchers to compare two or more standardized coefficients simultaneously in a standard and convenient way. Three real examples are given to illustrate the proposed method, using EQS, a popular SEM software program. Results show that the proposed method performs satisfactorily for testing the equality of standardized coefficients.
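
    The stage 2 comparison described above rests on a standard likelihood ratio test. The sketch below shows only that final step for a single equality constraint (one degree of freedom). It does not fit any SEM, and the two log-likelihood values are invented placeholders standing in for output from whatever SEM program is used.

```python
import math


def lrt_one_constraint(loglik_free, loglik_constrained):
    """Likelihood ratio test for one equality constraint (df = 1).

    loglik_free:        maximised log-likelihood of the standardized model
    loglik_constrained: the same model with two coefficients set equal
    """
    stat = max(2.0 * (loglik_free - loglik_constrained), 0.0)
    # Chi-square survival function with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_one_constraint(-1234.10, -1236.02)
print(f"LR statistic = {stat:.2f}, p = {p:.3f}")
```

    Testing several equalities at once would need the chi-square distribution with more degrees of freedom, which the closed form used here does not cover.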

  4. The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics

    PubMed Central

    2013-01-01

    User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users’ requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, ‘canvas sort’ card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience. PMID:23514033

  5. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    NASA Astrophysics Data System (ADS)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  6. Cluster Flow: A user-friendly bioinformatics workflow tool

    PubMed Central

    Ewels, Philip; Krueger, Felix; Käller, Max; Andrews, Simon

    2016-01-01

    Pipeline tools are becoming increasingly important within the field of bioinformatics. Using a pipeline manager to manage and run workflows comprised of multiple tools reduces workload and makes analysis results more reproducible. Existing tools require significant work to install and get running, typically needing pipeline scripts to be written from scratch before running any analysis. We present Cluster Flow, a simple and flexible bioinformatics pipeline tool designed to be quick and easy to install. Cluster Flow comes with 40 modules for common NGS processing steps, ready to work out of the box. Pipelines are assembled using these modules with a simple syntax that can be easily modified as required. Core helper functions automate many common NGS procedures, making running pipelines simple. Cluster Flow is available with a GNU GPLv3 license on GitHub. Documentation, examples and an online demo are available at http://clusterflow.io.

  7. A Bioinformatics approach to designing a Zika virus vaccine.

    PubMed

    Dey, Sumanta; Nandy, Ashesh; Basak, Subhash C; Nandy, Papiya; Das, Sukhen

    2017-03-10

    Zika virus infections have reached epidemic proportions in Latin American countries, causing severe birth defects and neurological disorders. While several organizations have begun research into the design of prophylactic vaccines and therapeutic drugs, computer-assisted methods with adequate data resources can be expected to assist in these measures and to reduce lead times through bioinformatics approaches. Using 60 sequences of the Zika virus envelope protein available in the GenBank database, our analysis with numerical characterization techniques and several web-based bioinformatics servers identified four peptide stretches on the Zika virus envelope protein that are well conserved and surface exposed and are predicted to have reasonable epitope binding efficiency. These peptides can be expected to form the basis for a nascent peptide vaccine which, enhanced by incorporation of suitable adjuvants, can elicit an immune response against Zika virus infections.

  8. A review of estimation of distribution algorithms in bioinformatics

    PubMed Central

    Armañanzas, Rubén; Inza, Iñaki; Santana, Roberto; Saeys, Yvan; Flores, Jose Luis; Lozano, Jose Antonio; Peer, Yves Van de; Blanco, Rosa; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro

    2008-01-01

    Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. PMID:18822112
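
    To make the idea of "a probabilistic model, learnt from the promising solutions" concrete, here is a minimal sketch of one of the simplest EDA variants, a univariate marginal model over bit strings, applied to the toy OneMax problem (maximise the number of 1s). The problem, parameter values and function name are assumptions chosen for illustration and are not drawn from the review.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA: sample, select the best, re-estimate per-bit probabilities."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits          # the probabilistic model: P(bit i = 1)
    best = None
    for _ in range(generations):
        # Sample a population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Fitness for OneMax is simply the number of 1s.
        population.sort(key=sum, reverse=True)
        if best is None or sum(population[0]) > sum(best):
            best = population[0]
        # Learn the model from the promising solutions (truncation selection).
        selected = population[:n_select]
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            # Clamp so no bit value becomes impossible to sample.
            probs[i] = min(0.95, max(0.05, freq))
    return best, probs


best, probs = umda_onemax()
print(sum(best), "ones out of", len(best))
```

    Unlike a genetic algorithm, there is no crossover or mutation operator here: all variation comes from sampling the model. Richer EDA variants replace the independent per-bit probabilities with models that capture dependencies between variables.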

  9. REVIEW-ARTICLE Bioinformatics: an overview and its applications.

    PubMed

    Diniz, W J S; Canduri, F

    2017-03-15

    Technological advancements in recent years have promoted marked progress in understanding the genetic basis of phenotypes. In line with these advances, genomics has shifted the paradigm of biological questions to a genome-wide scale, revealing an explosion of data and opening up many possibilities. On the other hand, the vast amount of information that has been generated points to the challenges that must be overcome for storage (Moore's law) and processing of biological information. In this context, bioinformatics and computational biology have sought to overcome such challenges. This review presents an overview of bioinformatics and its use in the analysis of biological data, exploring approaches, emerging methodologies, and tools that can give biological meaning to the data generated.

  10. Meeting Review: 2002 O'Reilly Bioinformatics Technology Conference

    PubMed Central

    2002-01-01

    At the end of January I travelled to the States to speak at and attend the first O’Reilly Bioinformatics Technology Conference [14]. It was a large, well-organized and diverse meeting with an interesting history. Although the meeting was not a typical academic conference, its style will, I am sure, become more typical of meetings in both biological and computational sciences. Speakers at the event included prominent bioinformatics researchers such as Ewan Birney, Terry Gaasterland and Lincoln Stein; authors and leaders in the open source programming community like Damian Conway and Nat Torkington; and representatives from several publishing companies including the Nature Publishing Group, Current Science Group and the President of O’Reilly himself, Tim O’Reilly. There were presentations, tutorials, debates, quizzes and even a ‘jam session’ for musical bioinformaticists. PMID:18628852

  11. Some statistics in bioinformatics: the fifth Armitage Lecture.

    PubMed

    Solomon, Patricia J

    2009-10-15

    The spirit and content of the 2007 Armitage Lecture are presented in this paper. To begin, two areas of Peter Armitage's early work are distinguished: his pioneering research on sequential methods intended for use in medical trials and the comparison of survival curves. Their influence on much later work is highlighted, and motivates the proposal of several statistical 'truths' that are presented in the paper. The illustration of these truths demonstrates biology's new morphology and its dominance over statistics in this century. An overview of a recent proteomics ovarian cancer study is given as a warning of what can happen when bioinformatics meets epidemiology badly, in particular, when the study design is poor. A statistical bioinformatics success story is outlined, in which gene profiling is helping to identify novel genes and networks involved in mouse embryonic stem cell development. Some concluding thoughts are given.

  12. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, and on-demand anytime and anywhere access to resources and applications, and thus it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data.

  13. Cluster Flow: A user-friendly bioinformatics workflow tool.

    PubMed

    Ewels, Philip; Krueger, Felix; Käller, Max; Andrews, Simon

    2016-01-01

    Pipeline tools are becoming increasingly important within the field of bioinformatics. Using a pipeline manager to manage and run workflows comprised of multiple tools reduces workload and makes analysis results more reproducible. Existing tools require significant work to install and get running, typically needing pipeline scripts to be written from scratch before running any analysis. We present Cluster Flow, a simple and flexible bioinformatics pipeline tool designed to be quick and easy to install. Cluster Flow comes with 40 modules for common NGS processing steps, ready to work out of the box. Pipelines are assembled using these modules with a simple syntax that can be easily modified as required. Core helper functions automate many common NGS procedures, making running pipelines simple. Cluster Flow is available with a GNU GPLv3 license on GitHub. Documentation, examples and an online demo are available at http://clusterflow.io.

  14. A survey on evolutionary algorithm based hybrid intelligence in bioinformatics.

    PubMed

    Li, Shan; Kang, Liying; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction to the applications of hybrid intelligent methods, in particular those based on evolutionary algorithms, in bioinformatics. We focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks.

  15. State of the nation in data integration for bioinformatics.

    PubMed

    Goble, Carole; Stevens, Robert

    2008-10-01

    Data integration is a perennial issue in bioinformatics, with many systems being developed and many technologies offered as a panacea for its resolution. The fact that it is still a problem indicates a persistence of underlying issues. Progress has been made, but we should ask "what lessons have been learnt?", and "what still needs to be done?" Semantic Web and Web 2.0 technologies are the latest to find traction within bioinformatics data integration. Now we can ask whether the Semantic Web, mashups, or their combination, have the potential to help. This paper is based on the opening invited talk by Carole Goble given at the Health Care and Life Sciences Data Integration for the Semantic Web Workshop collocated with WWW2007. The paper expands on that talk. We attempt to place some perspective on past efforts, highlight the reasons for success and failure, and indicate some pointers to the future.

  16. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    PubMed Central

    Li, Shan; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction to the applications of hybrid intelligent methods, in particular those based on evolutionary algorithms, in bioinformatics. We focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks. PMID:24729969

  17. Rise and Demise of Bioinformatics? Promise and Progress

    PubMed Central

    Ouzounis, Christos A.

    2012-01-01

    The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself. PMID:22570600

  18. Towards Structuring Unstructured GenBank Metadata for Enhancing Comparative Biological Studies.

    PubMed

    Chen, Elizabeth S; Sarkar, Indra Neil

    2011-01-01

    Within large sequence repositories such as GenBank there is a wealth of metadata providing contextual information that may enhance search and retrieval of relevant sequences for a range of subsequent analyses. One challenge is the use of free-text in these metadata fields where approaches are needed to extract, structure, and encode essential information. The goal of the present study was to explore the feasibility of using a combination of existing resources for annotating unstructured GenBank metadata, initially focusing on the "host" and "isolation_source" fields. This paper summarizes early results for 10 host organisms that include a characterization of associated isolation sources with respect to biomedical ontologies and semantic types. The findings from this preliminary study provide insights into the rich amount of information captured within these unstructured metadata, offer guidance for addressing the challenges and issues encountered, and highlight the potential value for enriching comparative biological studies towards improving human health.

  19. Estimating, testing, and comparing specific effects in structural equation models: the phantom model approach.

    PubMed

    Macho, Siegfried; Ledermann, Thomas

    2011-03-01

    The phantom model approach for estimating, testing, and comparing specific effects within structural equation models (SEMs) is presented. The rationale underlying this novel method consists in representing the specific effect to be assessed as a total effect within a separate latent variable model, the phantom model that is added to the main model. The following favorable features characterize the method: (a) It enables the estimation, testing, and comparison of arbitrary specific effects for recursive and nonrecursive models with latent and manifest variables; (b) it enables the bootstrapping of confidence intervals; and (c) it can be applied with all standard SEM programs permitting latent variables, the specification of equality constraints, and the bootstrapping of total effects. These features along with the fact that no manipulation of matrices and formulas is required make the approach particularly suitable for applied researchers. The method is illustrated by means of 3 examples with real data sets.

  20. Comparative toxicity and structure-activity in Chlorella and Tetrahymena: Monosubstituted phenols

    SciTech Connect

    Jaworska, J.S.; Schultz, T.W.

    1991-07-01

    The relative toxicity of selected monosubstituted phenols has been assessed by Kramer and Truemper in the Chlorella vulgaris assay. The authors examined population growth inhibition of this simple green alga under short-term static conditions for 33 derivatives. However, efforts to develop a strong predictive quantitative structure-activity relationship (QSAR) met with limited success because they modeled across modes of toxic action or segregated derivatives such as positional isomers (i.e., ortho-, meta-, para-). In an effort to further their understanding of the relationships of ecotoxic effects of phenols, the authors have evaluated the same derivatives reported by Kramer and Truemper in the Tetrahymena pyriformis population growth assay, compared the responses in both systems and developed QSARs for the Chlorella vulgaris data based on mechanisms of action.

  1. Genetic Markers and Quantitative Genetic Variation in Medicago Truncatula (Leguminosae): A Comparative Analysis of Population Structure

    PubMed Central

    Bonnin, I.; Prosperi, J. M.; Olivieri, I.

    1996-01-01

    Two populations of the selfing annual Medicago truncatula Gaertn. (Leguminosae), each subdivided into three subpopulations, were studied for both metric traits (quantitative characters) and genetic markers (random amplified polymorphic DNA and one morphological, single-locus marker). Hierarchical analyses of variance components show that (1) populations are more differentiated for quantitative characters than for marker loci, (2) the contribution of both within and among subpopulations components of variance to overall genetic variance of these characters is reduced as compared to markers, and (3) at the population level, within population structure is slightly but not significantly larger for markers than for quantitative traits. Under the hypothesis that most markers are neutral, such comparisons may be used to make hypotheses about the strength and heterogeneity of natural selection in the face of genetic drift and gene flow. We thus suggest that in these populations, quantitative characters are under strong divergent selection among populations, and that gene flow is restricted among populations and subpopulations. PMID:8844165

  2. Bioinformatics for precision medicine in oncology: principles and application to the SHIVA clinical trial

    PubMed Central

    Servant, Nicolas; Roméjon, Julien; Gestraud, Pierre; La Rosa, Philippe; Lucotte, Georges; Lair, Séverine; Bernard, Virginie; Zeitouni, Bruno; Coffin, Fanny; Jules-Clément, Gérôme; Yvon, Florent; Lermine, Alban; Poullet, Patrick; Liva, Stéphane; Pook, Stuart; Popova, Tatiana; Barette, Camille; Prud’homme, François; Dick, Jean-Gabriel; Kamal, Maud; Le Tourneau, Christophe; Barillot, Emmanuel; Hupé, Philippe

    2014-01-01

    Precision medicine (PM) requires the delivery of individually adapted medical care based on the genetic characteristics of each patient and his/her tumor. The last decade witnessed the development of high-throughput technologies such as microarrays and next-generation sequencing which paved the way to PM in the field of oncology. While the cost of these technologies decreases, we are facing an exponential increase in the amount of data produced. Our ability to use this information in daily practice relies strongly on the availability of an efficient bioinformatics system that assists in the translation of knowledge from the bench towards molecular targeting and diagnosis. Clinical trials and routine diagnoses constitute different approaches, both requiring a strong bioinformatics environment capable of (i) warranting the integration and the traceability of data, (ii) ensuring the correct processing and analyses of genomic data, and (iii) applying well-defined and reproducible procedures for workflow management and decision-making. To address the issues, a seamless information system was developed at Institut Curie which facilitates the data integration and tracks in real-time the processing of individual samples. Moreover, computational pipelines were developed to identify reliably genomic alterations and mutations from the molecular profiles of each patient. After a rigorous quality control, a meaningful report is delivered to the clinicians and biologists for the therapeutic decision. The complete bioinformatics environment and the key points of its implementation are presented in the context of the SHIVA clinical trial, a multicentric randomized phase II trial comparing targeted therapy based on tumor molecular profiling versus conventional therapy in patients with refractory cancer. The numerous challenges faced in practice during the setting up and the conduct of this trial are discussed as an illustration of PM application. PMID:24910641

  3. Comparative Study of 3-Dimensional Woven Joint Architectures for Composite Spacecraft Structures

    NASA Technical Reports Server (NTRS)

    Jones, Justin S.; Polis, Daniel L.; Rowles, Russell R.; Segal, Kenneth N.

    2011-01-01

    The National Aeronautics and Space Administration (NASA) Exploration Systems Mission Directorate initiated an Advanced Composite Technology (ACT) Project through the Exploration Technology Development Program in order to support the polymer composite needs for future heavy lift launch architectures. As an example, the large composite structural applications on Ares V inspired the evaluation of advanced joining technologies, specifically 3D woven composite joints, which could be applied to segmented barrel structures needed for autoclave cured barrel segments due to autoclave size constraints. Implementation of these 3D woven joint technologies may offer enhancements in damage tolerance without sacrificing weight. However, baseline mechanical performance data is needed to properly analyze the joint stresses and subsequently design/down-select a preform architecture. Six different configurations were designed and prepared for this study; each consisting of a different combination of warp/fill fiber volume ratio and preform interlocking method (Z-fiber, fully interlocked, or hybrid). Tensile testing was performed for this study with the enhancement of a dual camera Digital Image Correlation (DIC) system which provides the capability to measure full-field strains and three dimensional displacements of objects under load. As expected, the ratio of warp/fill fiber has a direct influence on strength and modulus, with higher values measured in the direction of higher fiber volume bias. When comparing the Z-fiber weave to a fully interlocked weave with comparable fiber bias, the Z-fiber weave demonstrated the best performance in two different comparisons. We report the measured tensile strengths and moduli for test coupons from the 6 different weave configurations under study.

  4. Deep sequencing of small RNAs in plants: applied bioinformatics.

    PubMed

    Studholme, David J

    2012-01-01

    Small RNAs, including microRNA and short-interfering RNAs, play important roles in plants. In recent years, developments in sequencing technology have enabled the large-scale discovery of sRNAs in various cells, tissues and developmental stages and in response to various stresses. This review describes the bioinformatics challenges to analysing these large datasets of short-RNA sequences and some of the solutions to those challenges.

  5. A Quick Guide for Building a Successful Bioinformatics Community

    PubMed Central

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-01-01

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  6. Integration of bioinformatics into an undergraduate biology curriculum and the impact on development of mathematical skills.

    PubMed

    Wightman, Bruce; Hark, Amy T

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative.

  7. Amination of nitroazoles--a comparative study of structural and energetic properties.

    PubMed

    Zhao, Xiuxiu; Qi, Cai; Zhang, Lubo; Wang, Yuan; Li, Shenghua; Zhao, Fengqi; Pang, Siping

    2014-01-14

    In this work, 3-nitro-1H-1,2,4-triazole (1) and 3,5-dinitro-1H-pyrazole (2) were C-aminated and N-aminated using different amination agents, yielding their respective C-amino and N-amino products. All compounds were fully characterized by NMR (1H, 13C, 15N), IR spectroscopy, and differential scanning calorimetry (DSC). X-ray crystallographic measurements were performed and delivered insight into structural characteristics as well as inter- and intramolecular interactions of the products. Their impact sensitivities were measured by using standard BAM fallhammer techniques and their explosive performances were computed using the EXPLO 5.05 program. A comparative study on the influence of those different amino substituents on the structural and energetic properties (such as density, stability, heat of formation, detonation performance) is presented. The results showed that the incorporation of an N-amino group into a nitroazole ring can improve nitrogen content, heat of formation and impact sensitivity, while the introduction of a C-amino group can enhance density, detonation velocity and pressure. The potential of N-amino and C-amino moieties for the design of next generation energetic materials is explored.

  8. Comparative three-dimensional quantitative structure-activity relationship study of safeners and herbicides.

    PubMed

    Bordás, B; Kömíves, T; Szántó, Z; Lopata, A

    2000-03-01

    The competitive antagonist hypothesis for safeners and herbicides was investigated by studying the 3D similarity between 28 safener and 20 herbicide molecules in their putative biologically active, low-energy conformations using comparative molecular field analysis (CoMFA). In addition, CoMFA provided information about the structural requirements for the interactions of safeners and herbicides with a proteinaceous component (SafBP) isolated from etiolated corn seedlings. Statistically significant CoMFA models have been developed for the united and separate safener and herbicide molecule sets using retrospective binding affinity data of the ligands measured at the SafBP receptor. The predictive power of the models was characterized by squared cross-validated correlation coefficients (q²) of 0.708, 0.564, and 0.4000 for the united safener plus herbicide set, the safener set, and the herbicide set, respectively. The CoMFA results support the competitive antagonist hypothesis between certain types of safeners and herbicides. The findings suggest that structural similarity between these two classes of agrochemicals is a useful guide in the design of new safeners.
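    The abstract does not spell out how the squared cross-validated correlation coefficient is computed. The conventional definition used in QSAR work is sketched below, assuming leave-one-out cross-validation (the record does not state the validation scheme). Here y_i is the measured affinity of ligand i, \hat{y}_{(i)} is its prediction from a model fitted without ligand i, and \bar{y} is the mean measured affinity; all three symbols are introduced here for illustration.

```latex
\[
  q^{2} \;=\; 1 \;-\;
  \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_{(i)}\bigr)^{2}}
       {\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^{2}}
\]
```

    Under this definition a value near 1 indicates that held-out affinities are predicted well, while a value near 0 means the model predicts no better than the mean.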

  9. Comparing Different Model Structures for Carbon Allocation in the Community Land Model (CLM)

    NASA Astrophysics Data System (ADS)

    Montane, F.; Fox, A. M.; Arellano, A. F.; Scaven, V. L.; Alexander, M. R.; Moore, D. J.

    2015-12-01

    Quantifying the intensity of feedback mechanisms between terrestrial ecosystems and climate is a central challenge for understanding the global carbon cycle. Part of this challenge includes understanding how climate affects not only NPP, but also C allocation in different plant tissues (leaves, stem and roots) which determines the C residence time. For instance, C could be sequestered over longer time periods if changes in climate increase allocation to long-lived plant tissue (e.g. woody components) with respect to short-lived tissues (e.g. leaves). Networks of eddy covariance towers like AmeriFlux provide the infrastructure necessary to study relationships between ecosystem processes and climate forcing. We ran the Community Land Model (CLM) for six temperate forests in North America (AmeriFlux sites) using different model structures for the C allocation module: i) standard carbon allocation module in CLM, which allocates C to the stem and leaves as a dynamic function of NPP and with fixed coefficients for the rest of parameters; ii) alternative C allocation module, which allocates C to the root and stem as a dynamic function of NPP and with fixed coefficients for the rest of parameters; and iii) alternative C allocation module with fixed coefficients for all the parameters. We compare C allocation patterns and climate sensitivities between the different model structures and available observations for the sites. We suggest some future approaches to reduce model uncertainty in the current scheme for C allocation in CLM and its climate sensitivity.

  10. Comparative population structure of Cynopterus fruit bats in peninsular Malaysia and southern Thailand.

    PubMed

    Campbell, Polly; Schneider, Christopher J; Adnan, Adura M; Zubaid, Akbar; Kunz, Thomas H

    2006-01-01

    The extent to which response to environmental change is mediated by species-specific ecology is an important aspect of the population histories of tropical taxa. During the Pleistocene glacial cycles and associated sea level fluctuations, the Sunda region in Southeast Asia experienced concurrent changes in landmass area and the ratio of forest to open habitat, providing an ideal setting to test the expectation that habitat associations played an important role in determining species' response to the opportunity for geographic expansion. We used mitochondrial control region sequences and six microsatellite loci to compare the phylogeographic structure and demographic histories of four broadly sympatric species of Old World fruit bats in the genus, Cynopterus. Two forest-associated species and two open-habitat generalists were sampled along a latitudinal transect in Singapore, peninsular Malaysia, and southern Thailand. Contrary to expectations based on habitat associations, the geographic scale of population structure was not concordant across ecologically similar species. We found evidence for long and relatively stable demographic history in one forest and one open-habitat species, and inferred non-coincident demographic expansions in the second forest and open-habitat species. Thus, while these results indicate that Pleistocene climate change did not have a single effect on population structure across species, a correlation between habitat association and response to environmental change was supported in only two of four species. We conclude that interactions between multiple factors, including historical and contemporary environmental change, species-specific ecology and interspecific interactions, have shaped the recent evolutionary histories of Cynopterus fruit bats in Southeast Asia.

  11. A Comparative Study of Vertebrate Corneal Structure: The Evolution of a Refractive Lens

    PubMed Central

    Winkler, Moritz; Shoa, Golroxan; Tran, Stephanie T.; Xie, Yilu; Thomasy, Sarah; Raghunathan, Vijay K.; Murphy, Christopher; Brown, Donald J.; Jester, James V.

    2015-01-01

    Purpose. Although corneal curvature plays an important role in determining the refractive power of the vertebrate eye, the mechanisms controlling corneal shape remain largely unknown. To address this question, we performed a comparative study of vertebrate corneal structure to identify potential evolutionarily based changes that correlate with the development of a corneal refractive lens. Methods. Nonlinear optical (NLO) imaging of second-harmonic–generated (SHG) signals was used to image collagen and three-dimensionally reconstruct the lamellar organization in corneas from different vertebrate clades. Results. Second-harmonic–generated images taken normal to the corneal surface showed that corneal collagen in all nonmammalian vertebrates was organized into sheets (fish and amphibians) or ribbons (reptiles and birds) extending from limbus to limbus that were oriented nearly orthogonal (ranging from 77.7°–88.2°) to their neighbors. The slight angular offset (2°–13°) created a rotational pattern that continued throughout the full thickness in fish and amphibians and to the very posterior layers in reptiles and birds. Interactions between lamellae were limited to “sutural” fibers in cartilaginous fish, and occasional lamellar branching in fish and amphibians. There was a marked increase in lamellar branching in higher vertebrates, such that birds ≫ reptiles > amphibians > fish. By contrast, mammalian corneas showed a nearly random collagen fiber organization with no orthogonal, chiral pattern. Conclusions. Our data indicate that nonmammalian vertebrate corneas share a common orthogonal collagen structural organization that shows increased lamellar branching in higher vertebrate species. Importantly, mammalian corneas showed a different structural organization, suggesting a divergent evolutionary background. PMID:26066606

  12. Bitter sweeteners: tetrazole derivatives of arylsulfonylalcanoids--synthesis, structure and comparative study.

    PubMed

    Kalinowska-Tłuścik, Justyna; Jarzembek, Krystyna; Sliwiński, Jan; Oleksyn, Barbara J; Kozik, Violetta; Polański, Jarosław

    2008-12-01

    Within a research project aimed at the design of new sweeteners, the tetrazole moiety was introduced to arylsulfonylalkanoic acids (ASA) as a bioisostere of the carboxyl group. The crystal structures of four newly synthesized tetrazole derivatives and one intermediate product of the reaction were determined in order to explain the bitter taste of these compounds. Three chiral compounds crystallize as racemic mixtures in centrosymmetric space groups of the monoclinic system, whereas the non-chiral compound, with a higher dipole moment, crystallizes in the polar space group Cc. Intermolecular N-H...N hydrogen bonds between tetrazole moieties were observed in all four structures and are compared with the analogous interactions observed in tetrazole derivatives deposited in the Cambridge Structural Database (CSD). Specifically, the typical N1-H...N4 as well as N1-H...N3 interactions, which are less abundant in the CSD, are described. The formation of the latter interaction type can be hypothetically explained by an asymmetry of pi-electron distribution in the tetrazole rings caused by the crystalline environment. Important features of the crystal architecture are the chains of molecules linked by N-H...N bonds. A possible reason for the lack of a sweet taste of the tetrazoles investigated may be the improper position of the tetrazole H atom, and the mutual orientation of the proton donor and acceptor in their molecules. This orientation does not allow the tetrazoles to interact with the sweet-taste receptor in a way similar to that of ASA. The bitter taste of the investigated compounds needs further study.

  13. In Vivo validation of a bioinformatics based tool to identify reduced replication capacity in HIV-1.

    PubMed

    Kitchen, Christina M R; Krogstad, Paul; Kitchen, Scott G

    2010-01-01

    Although antiretroviral drug resistance is common in treated HIV infected individuals, it is not a consistent indicator of HIV morbidity and mortality. To the contrary, HIV resistance-associated mutations may lead to changes in viral fitness that are beneficial to infected individuals. Using a bioinformatics-based model to assess the effects of numerous drug resistance mutations, we determined that the D30N mutation in HIV-1 protease had the largest decrease in replication capacity among known protease resistance mutations. To test this in silico result in an in vivo environment, we constructed several drug-resistant mutant HIV-1 strains and compared their relative fitness utilizing the SCID-hu mouse model. We found HIV-1 containing the D30N mutation had a significant defect in vivo, showing impaired replication kinetics and a decreased ability to deplete CD4+ thymocytes, compared to the wild-type or virus without the D30N mutation. In comparison, virus containing the M184V mutation in reverse transcriptase, which shows decreased replication capacity in vitro, did not have an effect on viral fitness in vivo. Thus, in this study we have verified an in silico bioinformatics result with a biological assessment to identify a unique mutation in HIV-1 that has a significant fitness defect in vivo.

  14. PATRIC: the Comprehensive Bacterial Bioinformatics Resource with a Focus on Human Pathogenic Species

    PubMed Central

    Gillespie, Joseph J.; Wattam, Alice R.; Cammer, Stephen A.; Gabbard, Joseph L.; Shukla, Maulik P.; Dalay, Oral; Driscoll, Timothy; Hix, Deborah; Mane, Shrinivasrao P.; Mao, Chunhong; Nordberg, Eric K.; Scott, Mark; Schulman, Julie R.; Snyder, Eric E.; Sullivan, Daniel E.; Wang, Chunxia; Warren, Andrew; Williams, Kelly P.; Xue, Tian; Seung Yoo, Hyun; Zhang, Chengdong; Zhang, Yan; Will, Rebecca; Kenyon, Ronald W.; Sobral, Bruno W.

    2011-01-01

    Funded by the National Institute of Allergy and Infectious Diseases, the Pathosystems Resource Integration Center (PATRIC) is a genomics-centric relational database and bioinformatics resource designed to assist scientists in infectious-disease research. Specifically, PATRIC provides scientists with (i) a comprehensive bacterial genomics database, (ii) a plethora of associated data relevant to genomic analysis, and (iii) an extensive suite of computational tools and platforms for bioinformatics analysis. While the primary aim of PATRIC is to advance the knowledge underlying the biology of human pathogens, all publicly available genome-scale data for bacteria are compiled and continually updated, thereby enabling comparative analyses to reveal the basis for differences between infectious free-living and commensal species. Herein we summarize the major features available at PATRIC, dividing the resources into two major categories: (i) organisms, genomes, and comparative genomics and (ii) recurrent integration of community-derived associated data. Additionally, we present two experimental designs typical of bacterial genomics research and report on the execution of both projects using only PATRIC data and tools. These applications encompass a broad range of the data and analysis tools available, illustrating practical uses of PATRIC for the biologist. Finally, a summary of PATRIC's outreach activities, collaborative endeavors, and future research directions is provided. PMID:21896772

  15. The Natural Product Domain Seeker NaPDoS: A Phylogeny Based Bioinformatic Tool to Classify Secondary Metabolite Gene Diversity

    PubMed Central

    Ziemert, Nadine; Podell, Sheila; Penn, Kevin; Badger, Jonathan H.; Allen, Eric; Jensen, Paul R.

    2012-01-01

    New bioinformatic tools are needed to analyze the growing volume of DNA sequence data. This is especially true in the case of secondary metabolite biosynthesis, where the highly repetitive nature of the associated genes creates major challenges for accurate sequence assembly and analysis. Here we introduce the web tool Natural Product Domain Seeker (NaPDoS), which provides an automated method to assess the secondary metabolite biosynthetic gene diversity and novelty of strains or environments. NaPDoS analyses are based on the phylogenetic relationships of sequence tags derived from polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) genes, respectively. The sequence tags correspond to PKS-derived ketosynthase domains and NRPS-derived condensation domains and are compared to an internal database of experimentally characterized biosynthetic genes. NaPDoS provides a rapid mechanism to extract and classify ketosynthase and condensation domains from PCR products, genomes, and metagenomic datasets. Close database matches provide a mechanism to infer the generalized structures of secondary metabolites while new phylogenetic lineages provide targets for the discovery of new enzyme architectures or mechanisms of secondary metabolite assembly. Here we outline the main features of NaPDoS and test it on four draft genome sequences and two metagenomic datasets. The results provide a rapid method to assess secondary metabolite biosynthetic gene diversity and richness in organisms or environments and a mechanism to identify genes that may be associated with uncharacterized biochemistry. PMID:22479523

  16. In the loop: promoter–enhancer interactions and bioinformatics

    PubMed Central

    Mora, Antonio; Sandve, Geir Kjetil; Gabrielsen, Odd Stokke

    2016-01-01

    Enhancer–promoter regulation is a fundamental mechanism underlying differential transcriptional regulation. Spatial chromatin organization brings remote enhancers in contact with target promoters in cis to regulate gene expression. There is considerable evidence for promoter–enhancer interactions (PEIs). In recent years, genome-wide analyses have identified signatures and mapped novel enhancers; however, being able to precisely identify their target gene(s) requires massive biological and bioinformatics efforts. In this review, we give a short overview of the chromatin landscape and transcriptional regulation. We discuss some key concepts and problems related to chromatin interaction detection technologies, and emerging knowledge from genome-wide chromatin interaction data sets. Then, we critically review different types of bioinformatics analysis methods and tools related to representation and visualization of PEI data, raw data processing and PEI prediction. Lastly, we provide specific examples of how PEIs have been used to elucidate a functional role of non-coding single-nucleotide polymorphisms. The topic is at the forefront of epigenetic research, and by highlighting some future bioinformatics challenges in the field, this review provides a comprehensive background for future PEI studies. PMID:26586731

  17. Data capture in bioinformatics: requirements and experiences with Pedro

    PubMed Central

    Jameson, Daniel; Garwood, Kevin; Garwood, Chris; Booth, Tim; Alper, Pinar; Oliver, Stephen G; Paton, Norman W

    2008-01-01

    Background: The systematic capture of appropriately annotated experimental data is a prerequisite for most bioinformatics analyses. Data capture is required not only for submission of data to public repositories, but also to underpin integrated analysis, archiving, and sharing – both within laboratories and in collaborative projects. The widespread requirement to capture data means that data capture and annotation are taking place at many sites, but the small scale of the literature on tools, techniques and experiences suggests that there is work to be done to identify good practice and reduce duplication of effort. Results: This paper reports on experience gained in the deployment of the Pedro data capture tool in a range of representative bioinformatics applications. The paper makes explicit the requirements that have recurred when capturing data in different contexts, indicates how these requirements are addressed in Pedro, and describes case studies that illustrate where the requirements have arisen in practice. Conclusion: Data capture is a fundamental activity for bioinformatics; all biological data resources build on some form of data capture activity, and many require a blend of import, analysis and annotation. Recurring requirements in data capture suggest that model-driven architectures can be used to construct data capture infrastructures that can be rapidly configured to meet the needs of individual use cases. We have described how one such model-driven infrastructure, namely Pedro, has been deployed in representative case studies, and discussed the extent to which the model-driven approach has been effective in practice. PMID:18402673

  18. Best practices in bioinformatics training for life scientists

    PubMed Central

    Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D.; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L.; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C.; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K.

    2013-01-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301

  19. A primer to frequent itemset mining for bioinformatics

    PubMed Central

    Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart

    2015-01-01

    Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences. PMID:24162173
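
    The shopping-basket example in this abstract can be sketched in a few lines. The block below is a minimal, level-wise (Apriori-style) illustration and is not taken from the primer itself; the function names `frequent_itemsets` and `rule_confidence`, the toy baskets, and the support threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items and keep the frequent ones.
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        items = sorted(set().union(*current))
        # Keep a size-k candidate only if every (k-1)-subset is frequent.
        candidates = [
            frozenset(c)
            for c in combinations(items, k)
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        ]
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result


def rule_confidence(itemsets, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(both) / support(antecedent)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return itemsets[both] / itemsets[a]


# Hypothetical toy data: four supermarket baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
itemsets = frequent_itemsets(baskets, min_support=2)
```

    In a biological setting the "baskets" could be, for example, sets of annotations attached to each gene; that mapping is likewise only an illustration.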

  20. A primer to frequent itemset mining for bioinformatics.

    PubMed

    Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart; Laukens, Kris

    2015-03-01

    Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences.

  1. Best practices in bioinformatics training for life scientists.

    PubMed

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-09-01

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  2. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    NASA Astrophysics Data System (ADS)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed a bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil and is a gelatinase-producing bacterium that is species-specific to porcine and bovine gelatin. This bacterium therefore offers the possibility of producing enzymes that are specific to each of these two meat species. The main focus of this research was to identify the gelatinase-encoding gene of L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) long and encodes a 520-amino-acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp long and encodes a 591-amino-acid sequence; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp long and encodes a 566-amino-acid sequence. Three pairs of oligonucleotide primers, named F1, R1, F2, R2, F3 and R3, were designed to target short cDNA sequences by PCR. The amplicons reliably yielded products of 1563 bp for candidate gene P1 and 1701 bp for candidate gene P3. Thus, bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
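
    The gene and protein lengths quoted in this abstract are mutually consistent if each open reading frame is assumed to end in one stop codon, which is not translated. The short check below makes that arithmetic explicit; the function name `protein_length_from_orf` and the stop-codon assumption are illustrative and are not stated in the record.

```python
def protein_length_from_orf(orf_bp, includes_stop_codon=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption (not stated in the abstract): the quoted length includes
    one terminal stop codon, which does not code for an amino acid.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if includes_stop_codon else codons


# Lengths of candidate genes P1, P2 and P3 as given in the abstract.
candidate_lengths = {"P1": 1563, "P2": 1776, "P3": 1701}
protein_lengths = {
    name: protein_length_from_orf(bp) for name, bp in candidate_lengths.items()
}
```

    Under that assumption the three genes give 520, 591 and 566 residues, matching the figures reported above.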

  3. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    PubMed Central

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  4. Bioinformatics methods for the analysis of hepatitis viruses.

    PubMed

    Moriconi, Francesco; Beard, Michael R; Yuen, Lilly Kw

    2013-01-01

    HBV and HCV are the only hepatotropic viruses capable of establishing chronic infections. More than 500 million people worldwide are estimated to have chronic infections with HBV and/or HCV, and they have an increased risk of developing liver complications, such as cirrhosis or hepatocellular carcinoma. During the past decade, several antiviral agents including immune-modulatory drugs and nucleoside/nucleotide analogues have been approved for the treatment of HBV and HCV infections. In recent years, the focus has been on the development of new and better therapeutic agents for management of chronic HCV infections. Bioinformatics has only been applied recently to the field of viral hepatitis research. In addition to the wide range of general tools freely available for identification of open reading frames, gene prediction, homology searching, sequence alignment, and motif and epitope recognition, several public database systems designed specifically for HBV and HCV research have now been developed. The focus of these databases ranges from serving as viral sequence repositories to providing bioinformatics tools for viral genome analysis and for HBV or HCV drug resistance prediction. This review provides an overview of these public databases, which have integrated bioinformatics tools for HBV and HCV research. Properly managed and developed, these databases have the potential to have a broad effect on hepatitis research and treatment strategies. However, the effect will depend on the comprehensive collection of not only molecular sequence data, but also anonymous patient clinical and treatment data.

  5. A new structure for comparing surface passivation materials of GaAs solar cells

    NASA Technical Reports Server (NTRS)

    Desalvo, Gregory C.; Barnett, Allen M.

    1989-01-01

    The surface recombination velocity (S_rec) for bare GaAs is typically as high as 10^6 to 10^7 cm/s, which dramatically lowers the efficiency of GaAs solar cells. Early attempts to circumvent this problem by making an ultra-thin junction (x_j < 0.1 micron) proved unsuccessful when compared to lowering S_rec by surface passivation. Present-day GaAs solar cells use a GaAlAs window layer to passivate the top surface. The advantages of GaAlAs in surface passivation are its high bandgap energy and lattice matching to GaAs. Although GaAlAs is successful in reducing the surface recombination velocity, it has other inherent problems of chemical instability (Al readily oxidizes) and ohmic contact formation. The search for new, more stable window layer materials requires a means to compare their surface passivation ability. Therefore, a device structure is needed to easily test the performance of different passivating candidates. Such a test device is described.

  6. Comparative proteomic analysis of Litopenaeus vannamei gills after vaccination with two WSSV structural proteins.

    PubMed

    Chen, Li-Hao; Lin, Shi-Wei; Liu, Kuan-Fu; Chang, Chin-I; Hseu, Jinn-Rong; Tsai, Jyh-Ming

    2016-02-01

    White spot syndrome virus (WSSV) is one of the most devastating viral pathogens of cultured shrimp worldwide. Recently published papers show the ability of WSSV structural protein VP28 to vaccinate shrimp and raise protection against the virus. This study attempted to identify the proteins participating in the aforementioned shrimp quasi-immune response by proteomic analysis. The other envelope protein, VP36B, was used as the non-protective subunit vaccine control. Shrimp were intramuscularly injected with rVPs or PBS on day 1 and day 4 and then on day 7 their gill tissues were sampled. The two-dimensional electrophoresis (2-DE) patterns of gill proteins between vaccinated and PBS groups were compared and 20 differentially expressed proteins identified by mass spectrometry, some of which were validated in gill and hemocyte tissues using real-time quantitative RT-PCR. Many of the identified proteins and their expression levels are also linked with the shrimp response during WSSV infection. The list of up-regulated protein spots found exclusively in rVP28-vaccinated shrimp includes calreticulin and heat shock protein 70 with chaperone properties, ubiquitin, and others. The two serine proteases, chymotrypsin and trypsin, were significantly increased in shrimp of both vaccinated groups compared to PBS controls. The information presented here should be useful for gaining insight into invertebrate immunity.

  7. Comparative host-parasite population genetic structures: obligate fly ectoparasites on Galapagos seabirds.

    PubMed

    Levin, Iris I; Parker, Patricia G

    2013-08-01

    Parasites often have shorter generation times and, in some cases, faster mutation rates than their hosts, which can lead to greater population differentiation in the parasite relative to the host. Here we present a population genetic study of two ectoparasitic flies, Olfersia spinifera and Olfersia aenescens compared with their respective bird hosts, great frigatebirds (Fregata minor) and Nazca boobies (Sula granti). Olfersia spinifera is the vector of a haemosporidian parasite, Haemoproteus iwa, which infects frigatebirds throughout their range. Interestingly, there is no genetic differentiation in the haemosporidian parasite across this range despite strong genetic differentiation between Galapagos frigatebirds and their non-Galapagos conspecifics. It is possible that the broad distribution of this one H. iwa lineage could be facilitated by movement of infected O. spinifera. Therefore, we predicted more gene flow in both fly species compared with the bird hosts. Mitochondrial DNA sequence data from three genes per species indicated that despite marked differences in the genetic structure of the bird hosts, gene flow was very high in both fly species. A likely explanation involves non-breeding movements of hosts, including movement of juveniles, and movement by adult birds whose breeding attempt has failed, although we cannot rule out the possibility that closely related host species may be involved.

  8. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation

    PubMed Central

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-01-01

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733
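
    The two properties this abstract says a realigner must respect, reading frame and splice sites, can be illustrated with a simple check on a pairwise exon alignment. The sketch below is not CESAR's algorithm and does not use its code or interface; it only flags gap runs whose length is not a multiple of three and tests for the canonical AG acceptor and GT donor dinucleotides. The function names and the toy sequences are hypothetical.

```python
import re


def frameshifting_gaps(ref_aln, query_aln):
    """List gap runs (sequence name, start column, length) that break the frame.

    A run of '-' whose length is not a multiple of three shifts the
    reading frame of the aligned exon.
    """
    shifts = []
    for name, seq in (("reference", ref_aln), ("query", query_aln)):
        for match in re.finditer(r"-+", seq):
            if len(match.group()) % 3 != 0:
                shifts.append((name, match.start(), len(match.group())))
    return shifts


def has_canonical_splice_sites(acceptor, donor):
    """True if the intron ends flanking an exon are AG (acceptor) and GT (donor)."""
    return acceptor.upper() == "AG" and donor.upper() == "GT"


# Hypothetical toy alignments of one exon (reference vs. query species).
one_base_deletion = frameshifting_gaps("ATGGCCAAATGA", "ATGGC-AAATGA")
codon_deletion = frameshifting_gaps("ATGGCCAAATGA", "ATG---AAATGA")
```

    A one-base gap is reported as a frameshift, whereas the deletion of a whole codon is not; a realigner of the kind described would try to find an alternative alignment that removes a spurious one-base gap before calling the exon lost.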

  9. Comparing Anesthesiology Residency Training Structure and Requirements in Seven Different Countries on Three Continents

    PubMed Central

    Tanaka, Pedro; Madsen, Matias V; Macario, Alex

    2017-01-01

    Little has been published comparing the graduate medical education training structure and requirements across multiple countries. The goal of this study was to summarize and compare the characteristics of anesthesiology training programs in the USA, UK, Canada, Japan, Brazil, Denmark, and Switzerland as a way to better understand efforts to train anesthesiologists in different countries. Two physicians trained in each of the seven countries (convenience sample) were interviewed using a semi-structured approach. The interview was facilitated by use of a predetermined questionnaire that included, for example, the duration of post-medical school training and national requirements for certain rotations, number of cases, faculty supervision, national in-training written exams, and duty hour limits. These data were augmented by review of each country’s publicly available residency training documents as available on the internet. Post-medical school anesthesia residency duration varied: three years (Brazil), four years (USA), five years (Canada and Switzerland), six years (Japan and Denmark) to nine years (UK), as did the number of explicitly required clinical rotations of a defined duration: zero (Denmark), one (Switzerland and UK), four (Brazil), six (Canada), and 12 (USA). Minimum case requirements exist in the USA, Japan, and Brazil, but not in the other countries. National written exams taken during training exist for all countries studied except Japan and Denmark. The countries studied increasingly aim to have competency-based education with milestone assessments. Training duty hour limits also varied including for example 37 hours/week averaged over one month with limitations on night duties (Denmark), a weekly average of 48 hours taken over a 17-week period (UK), 50 hours/week maximum (Switzerland), 60 hours/week maximum (Brazil), and 80 hours/week averaged over four weeks (USA). Some countries have highly structured training programs with multiple national

  10. [Comparative Study on the Molecular Structures and Spectral Properties of Ponceau 4R and Amaranth].

    PubMed

    Zhang, Yong; Chen, Guo-qing; Zhu, Chun; Hu, Yang-jun

    2015-11-01

    The Edinburgh FLS920P steady-state/transient fluorescence spectrometer was used to measure the absorption and emission spectra of ponceau 4R and amaranth, which are isomers of each other, and their spectral parameters were compared. Density functional theory (DFT) and time-dependent density functional theory (TD-DFT) were then used to optimize the geometries of ponceau 4R and amaranth in the ground and excited states, respectively, in order to compare their configurations in the different states. On the basis of these results, the absorption and emission spectra of the two isomers were calculated with TD-DFT, applying the polarized continuum model (PCM) with the 6-311++G(d,p) basis set. The fluorescence mechanism and the relationships between the properties of the fluorescence spectra and the molecular geometry were analyzed. The results show that the structures of the two molecules are non-planar, that the two naphthalene rings in each molecule are not co-planar, and that there is a hydrogen bond in amaranth. In the ground state, the planarity of the naphthalene ring involved in this hydrogen bond in amaranth is better than that of the corresponding part of ponceau 4R. The two isomers are nearly co-planar in the excited state. The optimized molecular structures of ponceau 4R and amaranth are basically reasonable, because the spectra obtained from the quantum chemistry calculations agree with the experiments. The planarity of the naphthalene ring on the right side of ponceau 4R is worse than that in amaranth, so the ponceau 4R molecule undergoes more vibration and rotation in going from the excited to the ground state and loses more energy, which reduces the energy available for emitting fluorescent photons. Ponceau 4R therefore has a longer fluorescence emission wavelength than amaranth. In this paper, the molecular structure information of ponceau 4R and amaranth was obtained, and the differences

  11. Comparative Genome Analyses Reveal Distinct Structure in the Saltwater Crocodile MHC

    PubMed Central

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M.; Shan, Xueyan; Peterson, Daniel G.; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M.; Isberg, Sally R.; Higgins, Damien P.; Chong, Amanda Y.; John, John St; Glenn, Travis C.; Ray, David A.; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2–6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  12. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    PubMed

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M; Shan, Xueyan; Peterson, Daniel G; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M; Isberg, Sally R; Higgins, Damien P; Chong, Amanda Y; John, John St; Glenn, Travis C; Ray, David A; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionary distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled MHC in eutherians compared, but absent in avian MHC, suggesting that the saltwater crocodile MHC appears to have gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs.

  13. Freshwater Metaviromics and Bacteriophages: A Current Assessment of the State of the Art in Relation to Bioinformatic Challenges

    PubMed Central

    Bruder, Katherine; Malki, Kema; Cooper, Alexandria; Sible, Emily; Shapiro, Jason W.; Watkins, Siobhan C.; Putonti, Catherine

    2016-01-01

    Advances in bioinformatics and sequencing technologies have allowed for the analysis of complex microbial communities at an unprecedented rate. While much focus is often placed on the cellular members of these communities, viruses play a pivotal role, particularly bacteria-infecting viruses (bacteriophages); phages mediate global biogeochemical processes and drive microbial evolution through bacterial grazing and horizontal gene transfer. Despite their importance and ubiquity in nature, very little is known about the diversity and structure of viral communities. Though the need for culture-based methods for viral identification has been somewhat circumvented through metagenomic techniques, the analysis of metaviromic data is marred with many unique issues. In this review, we examine the current bioinformatic approaches for metavirome analyses and the inherent challenges facing the field as illustrated by the ongoing efforts in the exploration of freshwater phage populations. PMID:27375355

  14. A framework for comparing structural and functional measures of glaucomatous damage

    PubMed Central

    Hood, Donald C.; Kardon, Randy H.

    2007-01-01

    While it is often said that structural damage due to glaucoma precedes functional damage, it is not always clear what this statement means. This review has two purposes: first, to show that a simple linear relationship describes the data relating a particular functional test (standard automated perimetry (SAP)) to a particular structural test (optical coherence tomography (OCT)); and, second, to propose a general framework for relating structural and functional damage, and for evaluating if one precedes the other. The specific functional and structural tests employed are described in Section 2. To compare SAP sensitivity loss to loss of the retinal nerve fiber layer (RNFL) requires a map that relates local field regions to local regions of the optic disc as described in Section 3. When RNFL thickness in the superior and inferior arcuate sectors of the disc are plotted against SAP sensitivity loss (dB units) in the corresponding arcuate regions of the visual field, RNFL thickness becomes asymptotic for sensitivity losses greater than about 10 dB. These data are well described by a simple linear model presented in Section 4. The model assumes that the RNFL thickness measured with OCT has two components. One component is the axons of the retinal ganglion cells and the other, the residual, is everything else (e.g. glial cells, blood vessels). The axon portion is assumed to decrease in a linear fashion with losses in SAP sensitivity (in linear units); the residual portion is assumed to remain constant. Based upon severe SAP losses in anterior ischemic optic neuropathy (AION), the residual RNFL thickness in the arcuate regions is, on average, about one-third of the premorbid (normal) thickness of that region. The model also predicts that, to a first approximation, SAP sensitivity in control subjects does not depend upon RNFL thickness. The data (Section 6) are, in general, consistent with this prediction showing a very weak correlation between RNFL thickness and SAP
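
    The two-component linear model summarized in this abstract can be written out as a short function. This is a sketch under stated assumptions, not the authors' code: the function name `rnfl_thickness`, the normal thickness of 100 (arbitrary units), and the clamping of sensitivities above normal are illustrative choices. The one-third residual fraction and the conversion from dB loss to linear sensitivity follow the description above.

```python
def rnfl_thickness(loss_db, normal_thickness=100.0, residual_fraction=1.0 / 3.0):
    """Predicted RNFL thickness for a given SAP sensitivity loss in dB.

    Thickness = residual (constant) + axon portion, where the axon portion
    scales linearly with sensitivity expressed in linear units.
    normal_thickness is a hypothetical illustrative value.
    """
    residual = residual_fraction * normal_thickness
    axons_normal = normal_thickness - residual
    # Convert a loss in dB to relative sensitivity in linear units;
    # sensitivities at or above normal are treated as 1.0 (assumption).
    relative_sensitivity = min(1.0, 10.0 ** (-loss_db / 10.0))
    return residual + axons_normal * relative_sensitivity


# Predicted thickness at 0, 3, 10, 20 and 30 dB of sensitivity loss.
predicted = [rnfl_thickness(loss) for loss in (0, 3, 10, 20, 30)]
```

    With these values a 10 dB loss already brings the predicted thickness to 40% of normal, close to the one-third residual floor, which is in line with the statement that thickness becomes asymptotic for losses greater than about 10 dB.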

  15. Bioinformatic analysis of non-VP1 capsid protein of coxsackievirus A6.

    PubMed

    Liu, Hong-Bo; Yang, Guang-Fei; Liang, Si-Jia; Lin, Jun

    2016-08-01

    This study bioinformatically analyzed the non-VP1 capsid proteins (VP2-VP4) of Coxsackievirus A6 (CVA6), with an attempt to predict their basic physicochemical properties, structural/functional features and linear B cell epitopes. The online tools SubLoc, TargetP and the others from ExPASy Bioinformatics Resource Portal, and SWISS-MODEL (an online protein structure modeling server), were utilized to analyze the amino acid (AA) sequences of VP2-VP4 proteins of CVA6. Our results showed that the VP proteins of CVA6 were all of hydrophilic nature, contained phosphorylation and glycosylation sites and harbored no signal peptide sequences and acetylation sites. Except VP3, the other proteins did not have transmembrane helix structure and nuclear localization signal sequences. Random coils were the major conformation of the secondary structure of the capsid proteins. Analysis of the linear B cell epitopes by employing Bepipred showed that the average antigenic indices (AI) of individual VP proteins were all greater than 0 and the average AI of VP4 was substantially higher than that of VP2 and VP3. The VP proteins all contained a number of potential B cell epitopes and some epitopes were located at the internal side of the viral capsid or were buried. We successfully predicted the fundamental physicochemical properties, structural/functional features and the linear B cell epitopes and found that different VP proteins share some common features and each has its unique attributes. These findings will help us understand the pathogenicity of CVA6 and develop related vaccines and immunodiagnostic reagents.

  16. Discovery of C-Glycosylpyranonaphthoquinones in Streptomyces sp. MBT76 by a Combined NMR-Based Metabolomics and Bioinformatics Workflow.

    PubMed

    Wu, Changsheng; Du, Chao; Ichinose, Koji; Choi, Young Hae; van Wezel, Gilles P

    2017-02-24

    Mining of microbial genomes has revealed that actinomycetes harbor far more biosynthetic potential for bioactive natural products than anticipated. Activation of (cryptic) biosynthetic gene clusters and identification of the corresponding metabolites has become a focal point for drug discovery. Here, we applied NMR-based metabolomics combined with bioinformatics to identify novel C-glycosylpyranonaphthoquinones in Streptomyces sp. MBT76 and to elucidate the biosynthetic pathway. Following activation of the cryptic qin gene cluster for a type II polyketide synthase (PKS) by constitutive expression of its pathway-specific activator, bioinformatics coupled to NMR profiling facilitated the chromatographic isolation and structural elucidation of qinimycins A-C (1-3). The intriguing structural features of the qinimycins, including 8-C-glycosylation, 5,14-epoxidation, and 13-hydroxylation, distinguished these molecules from the model pyranonaphthoquinones actinorhodin, medermycin, and granaticin. Another novelty lies in the unusual fusion of a deoxyaminosugar to the pyranonaphthoquinone backbone during biosynthesis of the antibiotics BE-54238 A and B (4, 5). Qinimycins showed weak antimicrobial activity against Gram-positive bacteria. Our work shows the utility of combining bioinformatics, targeted activation of cryptic gene clusters, and NMR-based metabolic profiling as an effective pipeline for the discovery of microbial natural products with distinctive skeletons.

  17. Correspondence regarding Zhong et al., BMC Bioinformatics 2013 Mar 7;14:89.

    PubMed

    Kuhn, Alexandre

    2014-11-28

    Computational expression deconvolution aims to estimate the contribution of individual cell populations to expression profiles measured in samples of heterogeneous composition. Zhong et al. recently proposed Digital Sorting Algorithm (BMC Bioinformatics 2013 Mar 7;14:89) and showed that they could accurately estimate population-specific expression levels and expression differences between two populations. They compared DSA with Population-Specific Expression Analysis (PSEA), a previous deconvolution method that we developed to detect expression changes occurring within the same population between two conditions (e.g. disease versus non-disease). However, Zhong et al. compared PSEA-derived specific expression levels across different cell populations. Specific expression levels obtained with PSEA cannot be directly compared across different populations as they are on a relative scale. They are accurate as we demonstrate by deconvolving the same dataset used by Zhong et al. and, importantly, allow for comparison of population-specific expression across conditions.

  18. Using Bioinformatics Approach to Explore the Pharmacological Mechanisms of Multiple Ingredients in Shuang-Huang-Lian

    PubMed Central

    Zhang, Bai-xia; Li, Jian; Gu, Hao; Li, Qiang; Zhang, Qi; Zhang, Tian-jiao; Wang, Yun; Cai, Cheng-ke

    2015-01-01

    Owing to its proven clinical efficacy, Shuang-Huang-Lian (SHL) has been developed into a variety of dosage forms. However, in-depth research on the targets and pharmacological mechanisms of SHL preparations has been scarce. In the present study, bioinformatics approaches were adopted to integrate relevant data and biological information. As a result, a PPI network was built and its common topological parameters were characterized. The results suggested that the PPI network of SHL exhibited a scale-free property and modular architecture. The drug target network of SHL was structured with 21 functional modules. According to certain modules and the distribution of pharmacological effects, an antitumor effect and potential drug targets were predicted. A biological network containing 26 subnetworks was constructed to elucidate the antipneumonia mechanism of SHL. We also extracted the subnetwork to explicitly display the pathway through which one effective component acts on the pneumonia-related targets. In conclusion, a bioinformatics approach was established for exploring the drug targets, pharmacological activity distribution, and effective components of SHL, and its antipneumonia mechanism. Above all, we identified the effective components and disclosed the mechanism of SHL from a systems perspective. PMID:26495421

  19. Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

    PubMed Central

    Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas

    2014-01-01

    Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics are to store and manage biological data, and to develop and analyze computational tools that enhance their understanding. The volume of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large numbers of newly discovered proteins. The existing classification results are unsatisfactory owing to the very large number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
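
    For readers unfamiliar with the performance measures listed above, the following sketch computes them from a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical and are not taken from the paper; note that sensitivity and recall are two names for the same quantity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary classification metrics from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Report 0.0 rather than failing when a class is absent.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)          # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    # F-measure: harmonic mean of precision and recall.
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "recall": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only.
for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{name}: {value:.3f}")
```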

  20. Speedup bioinformatics applications on multicore-based processor using vectorizing and multithreading strategies.

    PubMed

    Chaichoompu, Kridsadakorn; Kittitornkun, Surin; Tongsima, Sissades

    2007-12-30

    Many computationally intensive bioinformatics programs written in C/C++, such as those for multiple sequence alignment and population structure analysis, are not multicore-aware. A multicore processor is an emerging CPU technology that combines two or more independent processors into a single package. The Single Instruction Multiple Data-stream (SIMD) paradigm is heavily utilized in this class of processors. Nevertheless, most popular compilers, including Microsoft Visual C/C++ 6.0 and the x86 GNU C compiler (gcc), do not automatically create SIMD code that can fully utilize the advances of these processors. To harness the power of the new multicore architecture, certain compiler techniques must be considered. This paper presents a generic compiling strategy to assist the compiler in improving the performance of bioinformatics applications written in C/C++. The proposed framework contains 2 main steps: multithreading and vectorizing strategies. After following the strategies, the application can achieve higher speedup by taking advantage of multicore architecture technology. Due to the extremely fast interconnection networking among multiple cores, it is suggested that the proposed optimization could be more appropriate than making use of parallelization on a small cluster computer, which has larger network latency and lower bandwidth.

  1. The RHNumtS compilation: Features and bioinformatics approaches to locate and quantify Human NumtS

    PubMed Central

    Lascaro, Daniela; Castellana, Stefano; Gasparre, Giuseppe; Romeo, Giovanni; Saccone, Cecilia; Attimonelli, Marcella

    2008-01-01

    Background To a greater or lesser extent, eukaryotic nuclear genomes contain fragments of their mitochondrial genome counterpart, deriving from the random insertion of damaged mtDNA fragments. NumtS (Nuclear mt Sequences) are not equally abundant in all species, and are redundant and polymorphic in terms of copy number. In population and clinical genetics, it is important to have a complete overview of NumtS quantity and location. Searching PubMed for NumtS or Mitochondrial pseudo-genes yields hundreds of papers reporting Human NumtS compilations produced by in silico or wet-lab approaches. A comparison of published compilations clearly shows significant discrepancies among data, due both to unwise application of Bioinformatics methods and to a not yet correctly assembled nuclear genome. To optimize quantification and location of NumtS, we produced a consensus compilation of Human NumtS by applying various bioinformatics approaches. Results Location and quantification of NumtS may be achieved by applying database similarity searching methods: we have applied various methods such as Blastn, MegaBlast and BLAT, changing both parameters and database; the results were compared, further analysed and checked against the already published compilations, thus producing the Reference Human Numt Sequences (RHNumtS) compilation. The resulting NumtS total 190. Conclusion The RHNumtS compilation represents a highly reliable reference basis, which may allow designing a lab protocol to test the actual existence of each NumtS. Here we report preliminary results based on PCR amplification and sequencing on 41 NumtS selected from RHNumtS among those with lower score. In parallel, we are currently designing the RHNumtS database structure for implementation in the HmtDB resource. In the future, the same database will host NumtS compilations from other organisms, but these will be generated only when the nuclear genome of a specific organism has reached a high-quality level of assembly.

  2. Versatility of the Burkholderia cepacia Complex for the Biosynthesis of Exopolysaccharides: A Comparative Structural Investigation

    PubMed Central

    Silipo, Alba; Lanzetta, Rosa; Liut, Gianfranco; Rizzo, Roberto; Cescutti, Paola

    2014-01-01

    The Burkholderia cepacia Complex assembles at least eighteen closely related species that are ubiquitous in nature. Some isolates show beneficial potential for biocontrol, bioremediation and plant growth promotion. On the contrary, other strains are pathogens for plants and immunocompromised individuals, like cystic fibrosis patients. In these subjects, they can cause respiratory tract infections sometimes characterised by fatal outcome. Most of the Burkholderia cepacia Complex species are mucoid when grown on a mannitol rich medium and they also form biofilms, two related characteristics, since polysaccharides are important component of biofilm matrices. Moreover, polysaccharides contribute to bacterial survival in a hostile environment by inhibiting both neutrophils chemotaxis and antimicrobial peptides activity, and by scavenging reactive oxygen species. The ability of these microorganisms to produce exopolysaccharides with different structures is testified by numerous articles in the literature. However, little is known about the type of polysaccharides produced in biofilms and their relationship with those obtained in non-biofilm conditions. The aim of this study was to define the type of exopolysaccharides produced by nine species of the Burkholderia cepacia Complex. Two isolates were then selected to compare the polysaccharides produced on agar plates with those formed in biofilms developed on cellulose membranes. The investigation was conducted using NMR spectroscopy, high performance size exclusion chromatography, and gas chromatography coupled to mass spectrometry. The results showed that the Complex is capable of producing a variety of exopolysaccharides, most often in mixture, and that the most common exopolysaccharide is always cepacian. In addition, two novel polysaccharide structures were determined: one composed of mannose and rhamnose and another containing galactose and glucuronic acid. Comparison of exopolysaccharides obtained from cultures on

  3. Comparative electronic structure of a lanthanide and actinide diatomic oxide: Nd versus U

    NASA Astrophysics Data System (ADS)

    Krauss, M.; Stevens, W. J.

    2003-01-01

    Using a modified version of the Alchemy electronic structure code and relativistic pseudopotentials, the electronic structure of the ground and low lying excited states of UO, NdO, and NdO⁺ have been calculated at the Hartree-Fock (HF) and multiconfiguration self-consistent field (MCSCF) levels of theory. Including results from an earlier study of UO⁺ this provides the information for a comparative analysis of a lanthanide and an actinide diatomic oxide. UO and NdO are both described formally as M²⁺O²⁻ and the cations as M³⁺O²⁻, but the HF and MCSCF calculations show that these systems are considerably less ionic due to large charge back-transfer in the π orbitals. The electronic states putatively arise from the ligand field (oxygen anion) perturbed f⁴, sf³, df³, sdf², or s²f² states of M²⁺ and f³, sf² or df² states of M³⁺. Molecular orbital results show a substantial stabilization of the sf³ or s²f² configurations relative to the f⁴ or df³ configurations that are the even or odd parity ground states in the M²⁺ free ion. The compact f and d orbitals are more destabilized by the anion field than the diffuse s orbital. The ground states of the neutral species are dominated by orbitals arising from the M²⁺ sf³ term, and all the potential energy curves arising from this configuration are similar, which allows an estimate of the vibrational frequencies for UO and NdO of 862 cm⁻¹ and 836 cm⁻¹, respectively. For NdO⁺ and UO⁺ the excitation energies for the Ω states were calculated with a valence configuration interaction method using ab initio effective spin-orbit operators to couple the molecular orbital configurations. The results for NdO⁺ are very comparable with the results for UO⁺, and show the vibrational and electronic states to be interleaved.

  4. Tools and data services registry: a community effort to document bioinformatics resources

    PubMed Central

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C.E.; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  5. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses

    PubMed Central

    Tian, Tian; Yang, Mifang; Li, Zhongming; Ping, Fengfeng; Fan, Weixin

    2016-01-01

    Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days). Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease. PMID:26752403

  6. Genetic diversity at the Dhn3 locus in Turkish Hordeum spontaneum populations with comparative structural analyses

    PubMed Central

    Uçarlı, Cüneyt; McGuffin, Liam J.; Çaputlu, Süleyman; Aravena, Andres; Gürel, Filiz

    2016-01-01

    We analysed Hordeum spontaneum accessions from 21 different locations to understand the genetic diversity of HsDhn3 alleles and the effects of single base mutations on the intrinsically disordered structure of the resulting polypeptide (HsDHN3). HsDHN3 was found to be YSK2-type with a low-frequency 6-aa deletion at the beginning of Exon 1. There is relatively high diversity in the intron region of HsDhn3 compared to the two exon regions. We found that subtle differences in the K segments led to changes in the chemical properties of the amino acids. Predictions for protein interaction profiles suggest the presence of a protein-binding site in HsDHN3 that coincides with the K1 segment. Comparison of DHN3 with those of closely related cereals showed that all of them contain a nuclear localization signal sequence flanking the K1 segment and a novel conserved region located between the S and K1 segments [E(D/T)DGMGGR]. We found that H. vulgare, H. spontaneum, and Triticum urartu DHN3s have a greater number of phosphorylation sites for protein kinase C than other cereal species, which may be related to stress adaptation. Our results show that the nature and extent of mutations in the conserved segments of K1 and K2 are likely to be key factors in protection of cells. PMID:26869072

  7. Comparative structural, emulsifying, and biological properties of 2 major canola proteins, cruciferin and napin.

    PubMed

    Wu, J; Muir, A D

    2008-04-01

    Canola is an economically important farm-gate crop in Canada. To further explore the potential of canola protein as value-added food and nutraceutical ingredients, a better understanding of fundamental properties of 2 major canola proteins is necessary. Two major protein components, cruciferin and napin, were isolated from defatted canola meal by Sephacryl S-300 gel filtration chromatography. SDS-PAGE showed that cruciferin consists of more than 10 polypeptides, and noncovalent links are more important than disulphide bonds in stabilizing the structural conformation. Napin consists of 2 polypeptides and is stabilized primarily by disulphide bonds. Purified cruciferin showed 1 major endothermic peak at 91 °C compared with 110 °C for napin. Emulsion prepared with cruciferin showed significantly higher specific surface area and lower particle size than that prepared with napin. The study indicated that the presence of napin could detrimentally affect the emulsion stability of canola protein isolates. Hydrolysates from cruciferin and napin showed potent angiotensin I-converting enzyme inhibitory activity (IC50: 0.035 and 0.029 mg/mL, respectively), but weaker than that of canola protein isolate hydrolysate (IC50: 0.015 mg/mL).

  8. Structure-function relationship of Chikungunya nsP2 protease: A comparative study with papain.

    PubMed

    Ramakrishnan, Chandrasekaran; Kutumbarao, Nidamarthi H V; Suhitha, Sivasubramanian; Velmurugan, Devadasan

    2016-11-07

    Chikungunya virus is a growing human pathogen transmitted by mosquito bite. It causes fever, chills, nausea, vomiting, joint pain, headache, and swelling in the joints. Its replication and propagation depend on the protease activity of the Chikungunya virus-nsP2 protein, which cleaves the nsP1234 polyprotein replication complex into individual functional units. The N-terminal segment of papain is structurally identical with the Chikungunya virus-nsP2 protease. Hence, molecular dynamics simulations were performed to compare the molecular mechanisms of these proteases. The Chikungunya virus-nsP2 protease shows more conformational changes and adopts an alternate conformation. However, the N-terminal segments of these two proteases have an identical active site scaffold with the conserved catalytic dyad. Hence, some of the non-peptide inhibitors of papain were used for induced fit docking at the active site of nsP2 to assess the binding mode. In addition, the peptides that connect different domains/proteins in the Chikungunya virus polyprotein were also subjected to docking. The overall results suggest that the active site scaffold is the same in both proteases and that a possibility exists to experimentally assess the efficacy of some of the papain inhibitors in inhibiting the Chikungunya virus-nsP2.

  9. A Basic Protein Comparative Three-Dimensional Modeling Methodological Workflow Theory and Practice.

    PubMed

    Bitar, Mainá; Franco, Glória Regina

    2014-01-01

    When working with proteins and studying their properties, it is crucial to have access to the three-dimensional structure of the molecule. If experimentally solved structures are not available, comparative modeling techniques can be used to generate useful protein models to support structure-based research projects. In recent years, with Bioinformatics becoming the basis for the study of protein structures, there is a growing need for the exposure of details about the algorithms behind the software and servers, as well as a need for protocols to guide in silico predictive experiments. In this article, we explore different steps of the comparative modeling technique, such as template identification, sequence alignment, generation of candidate structures and quality assessment, along with their peculiarities and theoretical description. We then present a practical step-by-step workflow to support the Biologist in the in silico generation of protein structures. Finally, we explore further steps in comparative modeling, presenting perspectives on the study of protein structures through Bioinformatics. We trust that this is a thorough guide for beginners who wish to work on the comparative modeling of proteins.

  10. Comparative Study of Structural Damage Under Irradiation in SiC Nano-structured and Conventional Ceramics

    SciTech Connect

    Leconte, Yann; Herlin-Boime, Nathalie; Reynaud, Cecile; Thome, Lionel

    2008-07-01

    In the context of research on new materials for next generation nuclear reactors, it becomes more and more interesting to know what can be the advantages of nano-structured materials for such applications. In this study, we performed irradiation experiments on micro-structured and nano-structured β-SiC samples, with 95 MeV Xe and 4 MeV Au ions. The structure of the samples was characterized before and after irradiation by grazing incidence X-ray diffraction and Raman spectroscopy. The results showed the occurrence of a synergy between electronic and nuclear energy loss in both samples with 95 MeV Xe ions, while the nano-structured pellet was found to have a better resistance to the irradiation with 4 MeV Au ions. (authors)

  11. [Sequence analysis for genes encoding nucleoprotein and envelope protein of a new human coronavirus NL63 identified from a pediatric patient in Beijing by bioinformatics].

    PubMed

    Xing, Jiang-feng; Zhu, Ru-nan; Qian, Yuan; Zhao, Lin-qing; Deng, Jie; Wang, Fang; Sun, Yu

    2007-07-01

    The aim of this study was to characterize the N and E protein encoding genes of a new human coronavirus (HCoV-NL63) which was identified from one of the clinical specimens (BJ8081) collected from a 12-year-old patient with acute respiratory infection in Beijing. The complete N and E gene sequences of HCoV-NL63 were amplified from the clinical sample by RT-PCR, then cloned into the pCF-T and pUCm-T vectors, respectively, and sequenced. The complete sequences of the N and E genes were submitted to GenBank by Sequin and compared with the N and E genes of prototype HCoV-NL63 and the other coronaviruses published in GenBank. The secondary structure and the characteristics of the sample BJ8081 N and E proteins were predicted by bioinformatics. The N and E genes amplified from sample BJ8081 were 1134 bp and 234 bp in length, and the predicted proteins comprised 377 amino acids and 77 amino acids, respectively. The data suggested that the region of amino acids 78-85 within the N protein was probably the conserved region for all coronaviruses identified so far, including HCoV-NL63. The region of amino acids 15-37 of the E protein was probably the transmembrane domain. In conclusion, the recombinant plasmids pCF-T-8081 N and pUCm-T-8081 E were successfully constructed and sequenced, and the data predicted by bioinformatics are helpful for the further analysis of HCoV-NL63.
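
    The gene and protein lengths reported above agree with each other if each open reading frame ends in a single stop codon that is counted in the base-pair length. That is an assumption made for this check, not something the abstract states, and the function name below is hypothetical.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an open reading frame of orf_bp base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon terminates translation and encodes no amino acid.
    return codons - 1 if includes_stop else codons


print(protein_length(1134))  # N gene: 378 codons, 377 amino acids
print(protein_length(234))   # E gene: 78 codons, 77 amino acids
```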

  12. Comparing Coarray Fortran (CAF) with MPI for several structured mesh PDE applications

    NASA Astrophysics Data System (ADS)

    Garain, Sudip; Balsara, Dinshaw S.; Reid, John

    2015-09-01

    Language-based approaches to parallelism have been incorporated into the Fortran standard. These Fortran extensions go under the name of Coarray Fortran (CAF) and full-featured compilers that support CAF have become available from Cray and Intel; the GNU implementation is expected in 2015. CAF combines elegance of expression with simplicity of implementation to yield an efficient parallel programming language. Elegance of expression results in very compact parallel code. The existence of a standard helps with portability and maintainability. CAF was designed to excel at one-sided communication and similar functions that support one-sided communication are also available in the recent MPI-3 standard. One-sided communication is expected to be very valuable for structured mesh applications involving partial differential equations, amongst other possible applications. This paper focuses on a comparison of CAF and MPI for a few very useful application areas that are routinely used for solving partial differential equations on structured meshes. The three specific areas are Fast Fourier Techniques, Computational Fluid Dynamics, and Multigrid Methods. For each of those application areas, we have developed optimized CAF code and optimized MPI code that is based on the one-sided messaging capabilities of MPI-3. Weak scalability studies that compare CAF and MPI-3 are presented on up to 65,536 processors. Both paradigms scale well, showing that they are well-suited for Petascale-class applications. Some of the applications shown (like Fast Fourier Techniques and Computational Fluid Dynamics) require large, coarse-grained messaging. Such applications emphasize high bandwidth. Our other application (Multigrid Methods) uses pointwise smoothers which require a large amount of fine-grained messaging. In such applications, a premium is placed on low latency. Our studies show that both CAF and MPI-3 offer the twin advantages of high bandwidth and low latency for messages of all

  13. Composable languages for bioinformatics: the NYoSh experiment.

    PubMed

    Simi, Manuele; Campagne, Fabien

    2014-01-01

    Language WorkBenches (LWBs) are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell). NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment for NYoSh is

  14. Composable languages for bioinformatics: the NYoSh experiment

    PubMed Central

    Simi, Manuele

    2014-01-01

    Language WorkBenches (LWBs) are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell). NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system. The NYoSh Workbench prototype, which implements a fully featured integrated development environment for NYoSh is

  15. Comparative study of porous limestones used in heritage structures in Cyprus and in Hungary

    NASA Astrophysics Data System (ADS)

    Theodoridou, Magdalini; Ioannou, Ioannis; Rozgonyi-Boissinot, Nikoletta; Török, Ákos

    2015-04-01

    Porous limestone is widely used as construction material in the monuments of Cyprus and Hungary. The present study compares the physical properties of a bioclastic limestone from Cyprus and an oolitic limestone from Hungary. Petra Gerolakkou is a Pliocene limestone from Cyprus that originates from the district of Nicosia, the island's capital. It has been extensively used throughout the years in construction and restoration projects, particularly in the Nicosia area. Distinctive examples of its use can be found in the majority of the most important historic monuments in Nicosia, such as the Venetian walls and fortifications, churches (e.g. the Agia Sofia Cathedral), the archbishop and presidential palaces and a high number of other traditional buildings. The studied Miocene limestone from Hungary was exploited from Sóskút quarry (15-20 km W-SW of Budapest). The quarry provided stone for emblematic monuments of the capital of Hungary such as the Parliament building, Mathias Church, the Opera House and Citadella. In this study, mechanical parameters for both aforementioned stones, such as uniaxial compressive and tensile strengths, were tested under laboratory conditions. Their density, porosity and water absorption were also compared. The studied limestone from Cyprus exhibits porosity values within the range of 48-51%, apparent density between 1340 and 1400 kg/m³ and strength values under uniaxial compressive load between 1.2 and 2.8 MPa. This lithotype is also considered susceptible to salt decay, since an approximate mass loss of 12.5% is noted after 15 salt crystallization artificial weathering cycles. The porosity of the Hungarian limestone is in the order of 16-35%, the bulk density is 1600-1950 kg/m³, while the compressive strength is 2.5-15 MPa. Durability tests indicate that even after 10 freeze-thaw cycles the loss in strength is dramatic. Test results indicate that use of porous limestone in both countries is common and fabric strongly controls the

  16. The mechanical properties of various chemical vapor deposition diamond structures compared to the ideal single crystal

    NASA Astrophysics Data System (ADS)

    Hess, Peter

    2012-03-01

    The structural and electronic properties of the diamond lattice, leading to its outstanding mechanical properties, are discussed. These include the highest elastic moduli and fracture strength of any known material. Its extreme hardness is strongly connected with the extreme shear modulus, which even exceeds the large bulk modulus, revealing that diamond is more resistant to shear deformation than to volume changes. These unique features protect the ideal diamond lattice also against mechanical failure and fracture. Besides fast heat conduction, the fast vibrational movement of carbon atoms results in an extreme speed of sound and propagation of crack tips with comparable velocity. The ideal mechanical properties are compared with those of real diamond films, plates, and crystals, such as ultrananocrystalline (UNC), nanocrystalline, microcrystalline, and homo- and heteroepitaxial single-crystal chemical vapor deposition (CVD) diamond, produced by metastable synthesis using CVD. Ultrasonic methods have played and continue to play a dominant role in the determination of the linear elastic properties, such as elastic moduli of crystals or the Young's modulus of thin films with substantially varying impurity levels and morphologies. A surprising result of these extensive measurements is that even UNC diamond may approach the extreme Young's modulus of single-crystal diamond under optimized deposition conditions. The physical reasons for why the stiffness often deviates by no more than a factor of two from the ideal value are discussed, keeping in mind the large variety of diamond materials grown by various deposition conditions. Diamond is also known for its extreme hardness and fracture strength, despite its brittle nature. However, even for the best natural and synthetic diamond crystals, the measured critical fracture stress is one to two orders of magnitude smaller than the ideal value obtained by ab initio calculations for the ideal cubic lattice. Currently

  17. Comparing the effects of uniaxial and biaxial strains on the structural stability and electronic structure in wurtzite ZnS

    NASA Astrophysics Data System (ADS)

    Lv, Dong; Duan, Yifeng; Zhao, Botao; Qin, Lixia; Shi, Liwei; Tang, Gang; Shi, Hongliang

    2013-07-01

    Structural stability and electronic structure of wurtzite ZnS under uniaxial and biaxial strains are systematically studied using the HSE hybrid functional. The two types of strain have markedly different influences on the structural and electronic properties: (I) The newly predicted graphite-like phase is observed at large compressive uniaxial strains, not at large tensile biaxial strains, which is attributed to the different elastic responses to uniaxial and biaxial strains. (II) Direct band structures are obtained in wurtzite ZnS under uniaxial and biaxial strains, whereas indirect band gaps are only observed in graphite-like ZnS under large uniaxial strain. Our results are different from the widely accepted conclusion but are in good agreement with the available experimental data.

  18. Medical libraries, bioinformatics, and networked information: a coming convergence?

    PubMed Central

    Lynch, C

    1999-01-01

    Libraries will be changed by technological and social developments that are fueled by information technology, bioinformatics, and networked information. Libraries in highly focused settings such as the health sciences are at a pivotal point in their development as the synthesis of historically diverse and independent information sources transforms health care institutions. Boundaries are breaking down between published literature and research data, between research databases and clinical patient data, and between consumer health information and professional literature. This paper focuses on the dynamics that are occurring with networked information sources and the roles that libraries will need to play in the world of medical informatics in the early twenty-first century. PMID:10550026

  19. SOAP-based services provided by the European Bioinformatics Institute

    PubMed Central

    Pillai, S.; Silventoinen, V.; Kallio, K.; Senger, M.; Sobhany, S.; Tate, J.; Velankar, S.; Golovin, A.; Henrick, K.; Rice, P.; Stoehr, P.; Lopez, R.

    2005-01-01

    SOAP (Simple Object Access Protocol) based Web Services technology has gained much attention as an open standard enabling interoperability among applications across heterogeneous architectures and different networks. The European Bioinformatics Institute (EBI) is using this technology to provide robust data retrieval and data analysis mechanisms to the scientific community and to enhance utilization of the biological resources it already provides [N. Harte, V. Silventoinen, E. Quevillon, S. Robinson, K. Kallio, X. Fustero, P. Patel, P. Jokinen and R. Lopez (2004) Nucleic Acids Res., 32, 3–9]. These services are available free to all users from . PMID:15980463

  20. Quantifying optimal accuracy of local primary sequence bioinformatics methods

    NASA Astrophysics Data System (ADS)

    Aalberts, Daniel

    2005-03-01

    Traditional bioinformatics methods scan primary sequences for local patterns. It is important to assess how accurate local primary sequence methods can be. We study the problem of donor pre-mRNA splice site recognition, where the sequence overlaps between real and decoy data sets can be quantified, exposing the intrinsic limitations of the performance of local primary sequence methods. We assess the accuracy of local primary sequence methods generally by studying how they scale with dataset size and demonstrate that our new Primary Sequence Ranking methods have superior performance. Our Primary Sequence Ranking analysis tools are available at http://rna.williams.edu/

  1. The bioinformatics of microarrays to study cancer: Advantages and disadvantages

    NASA Astrophysics Data System (ADS)

    Rodríguez-Segura, M. A.; Godina-Nava, J. J.; Villa-Treviño, S.

    2012-10-01

    Microarrays are devices designed to analyze the simultaneous expression of thousands of genes. However, the process adds noise to the information at each stage of the study. Analyzing these thousands of data points requires bioinformatics tools. The traditional analysis begins by normalizing the data, but the results obtained depend strongly on how the study is conducted. The need to develop new strategies for analyzing microarrays is shown. Liver tissue taken from an animal model in which cancer is chemically induced is used as an example.

  2. Biowep: a workflow enactment portal for bioinformatics applications

    PubMed Central

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-01-01

    Background The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of unskilled researchers. A portal enabling these researchers to benefit from new technologies is still missing. Results We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. The main processing steps of workflows are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis

  3. Comparative gene expression analysis of avian embryonic facial structures reveals new candidates for human craniofacial disorders.

    PubMed

    Brugmann, S A; Powder, K E; Young, N M; Goodnough, L H; Hahn, S M; James, A W; Helms, J A; Lovett, M

    2010-03-01

    Mammals and birds have common embryological facial structures, and appear to employ the same molecular genetic developmental toolkit. We utilized natural variation found in bird beaks to investigate what genes drive vertebrate facial morphogenesis. We employed cross-species microarrays to describe the molecular genetic signatures, developmental signaling pathways and the spectrum of transcription factor (TF) gene expression changes that differ between cranial neural crest cells in the developing beaks of ducks, quails and chickens. Surprisingly, we observed that the neural crest cells established a species-specific TF gene expression profile that predates morphological differences between the species. A total of 232 genes were differentially expressed between the three species. Twenty-two of these genes, including Fgfr2, Jagged2, Msx2, Satb2 and Tgfb3, have been previously implicated in a variety of mammalian craniofacial defects. Seventy-two of the differentially expressed genes overlap with un-cloned loci for human craniofacial disorders, suggesting that our data will provide a valuable candidate gene resource for human craniofacial genetics. The most dramatic changes between species were in the Wnt signaling pathway, including a 20-fold up-regulation of Dkk2, Fzd1 and Wnt1 in the duck compared with the other two species. We functionally validated these changes by demonstrating that spatial domains of Wnt activity differ in avian beaks, and that Wnt signals regulate Bmp pathway activity and promote regional growth in facial prominences. This study is the first of its kind, extending on previous work in Darwin's finches and provides the first large-scale insights into cross-species facial morphogenesis.

  4. Comparative study of fermentation and methanogen community structure in the digestive tract of goats and rabbits.

    PubMed

    Abecia, L; Fondevila, M; Rodríguez-Romero, N; Martínez, G; Yáñez-Ruiz, D R

    2013-05-01

    Methane is the most important anthropogenic contribution to climate change after carbon dioxide and represents a loss of feed energy for the animal, mainly for herbivorous species. However, our knowledge about the ecology of Archaea, the microbial group responsible for methane synthesis in the gut, is very poor. Moreover, it is well known that hindgut fermentation differs from rumen fermentation. The composition of archaeal communities in fermentation compartments of goats and rabbits was investigated using DGGE to generate fingerprints of the archaeal 16S rRNA gene. Ruminal contents and faeces from five Murciano-Granadina goats and caecal contents of five commercial White New Zealand rabbits were compared. The diversity profile of methanogenic archaea was obtained by PCR-DGGE. The quantity of methanogenic archaea and their abundance relative to bacteria were determined by real-time PCR. Methanogenic archaeal species were relatively constant across species. The dendrogram from DGGE of the methanogen community showed one cluster for goat samples with two sub-clusters by type of sample (ruminal and faeces). In a second cluster, samples from rabbit were grouped. No differences were found in either richness or Shannon index as diversity indexes. Although the primer sets used were developed to investigate the rumen methanogenic archaeal community, primer specificity did not affect the assessment of rabbit methanogen community structure. Rumen content showed the highest number of methanogenic archaea (log₁₀ 9.36), followed by faeces (log₁₀ 8.52), with rabbit caecum showing the lowest values (log₁₀ 5.52). The DGGE profile showed that pre-gastric and hindgut fermenters hold very different methanogen communities. Rabbits hold a microbial community of similar complexity to that in ruminants but less abundant, which agrees with the type of fermentation profile.
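
    The two diversity indexes named in this abstract, richness and the Shannon index, can be illustrated with a minimal sketch. It assumes each DGGE lane is summarized as a list of band intensities; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def richness_and_shannon(band_intensities):
    """Return (richness, Shannon index H') for one lane.

    Richness is the number of bands with positive intensity;
    H' = -sum(p_i * ln p_i), with p_i the relative intensity of band i.
    """
    positive = [x for x in band_intensities if x > 0]
    total = sum(positive)
    if total == 0:
        return 0, 0.0
    shannon = -sum((x / total) * math.log(x / total) for x in positive)
    return len(positive), shannon


# Hypothetical band intensities for two lanes (not data from the study).
even_lane = [10.0, 10.0, 10.0, 10.0]
skewed_lane = [37.0, 1.0, 1.0, 1.0]

for name, lane in [("even", even_lane), ("skewed", skewed_lane)]:
    richness, shannon = richness_and_shannon(lane)
    print(f"{name}: richness={richness}, H'={shannon:.3f}")
```

    Both lanes have the same richness, but the evenly distributed lane has the higher Shannon index, which is why the two indexes are usually reported together.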

  5. Tidal breathing patterns derived from structured light plethysmography in COPD patients compared with healthy subjects

    PubMed Central

    Motamedi-Fakhr, Shayan; Wilson, Rachel C; Iles, Richard

    2017-01-01

    Purpose Differences in tidal breathing patterns have been reported between patients with chronic obstructive pulmonary disease (COPD) and healthy individuals using traditional measurement techniques. This feasibility study examined whether structured light plethysmography (SLP) – a noncontact, light-based technique – could also detect differences in tidal breathing patterns between patients with COPD and healthy subjects. Patients and methods A 5 min period of tidal (quiet) breathing was recorded in each patient with COPD (n=31) and each healthy subject (n=31), matched for age, body mass index, and sex. For every participant, the median and interquartile range (IQR; denoting within-subject variability) of 12 tidal breathing parameters were calculated. Individual data were then combined by cohort and summarized by its median and IQR. Results After correction for multiple comparisons, inspiratory time (median tI) and its variability (IQR of tI) were lower in patients with COPD (p<0.001 and p<0.01, respectively) as were ratios derived from tI (tI/tE and tI/tTot, both p<0.01) and their variability (p<0.01 and p<0.05, respectively). IE50SLP (the ratio of inspiratory to expiratory flow at 50% tidal volume calculated from the SLP signal) was higher (p<0.001) in COPD while SLP-derived time to reach peak tidal expiratory flow over expiratory time (median tPTEFSLP/tE) was shorter (p<0.01) and considerably less variable (p<0.001). Thoraco–abdominal asynchrony was increased (p<0.05) in COPD. Conclusion These early observations suggest that, like traditional techniques, SLP is able to detect different breathing patterns in COPD patients compared with subjects with no respiratory disease. This provides support for further investigation into the potential uses of SLP in assessing clinical conditions and interventions. PMID:28096696
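
    The two-level summary described in the methods (a median and IQR per participant, then a median and IQR per cohort) can be sketched as follows. This is only an illustration under stated assumptions: the inclusive quartile method, the function names and the sample inspiratory times are all hypothetical, since the abstract does not say how quartiles were computed.

```python
import statistics


def median_and_iqr(values):
    """Return (median, interquartile range) of a sequence of numbers."""
    # The "inclusive" quartile method is an assumption of this sketch.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q3 - q1


def summarize_cohort(per_subject_values):
    """Summarize each subject, then summarize the subject medians."""
    subject_stats = [median_and_iqr(v) for v in per_subject_values]
    subject_medians = [median for median, _ in subject_stats]
    return subject_stats, median_and_iqr(subject_medians)


# Hypothetical inspiratory times (seconds) for three subjects.
cohort = [
    [1.0, 1.2, 1.4, 1.6, 1.8],
    [0.9, 1.0, 1.1, 1.2, 1.3],
    [1.5, 1.6, 1.7, 1.8, 1.9],
]
subject_stats, (cohort_median, cohort_iqr) = summarize_cohort(cohort)
print(subject_stats)
print(f"cohort median={cohort_median:.2f}, cohort IQR={cohort_iqr:.2f}")
```

    The per-subject IQR plays the role of the within-subject variability mentioned in the abstract; it could be summarized across the cohort in the same way as the medians.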

  6. Dynamic partial reconfiguration implementation of the SVM/KNN multi-classifier on FPGA for bioinformatics application.

    PubMed

    Hussain, Hanaa M; Benkrid, Khaled; Seker, Huseyin

    2015-01-01

    Bioinformatics data tend to be highly dimensional in nature and thus impose significant computational demands. To resolve limitations of conventional computing methods, several alternative high performance computing solutions have been proposed by scientists, such as Graphical Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The latter have been shown to be efficient and high in performance. In recent years, FPGAs have been benefiting from the dynamic partial reconfiguration (DPR) feature, which adds flexibility to alter specific regions within the chip. This work proposes combining the use of FPGAs and DPR to build a dynamic multi-classifier architecture that can be used in processing bioinformatics data. In bioinformatics, applying different classification algorithms to the same dataset is desirable in order to obtain comparable, more reliable and consensus decisions, but it can take a long time when performed on a conventional PC. The DPR implementations of two common classifiers, namely support vector machines (SVMs) and K-nearest neighbor (KNN), are combined to form a multi-classifier FPGA architecture which can utilize a specific region of the FPGA to work as either an SVM or a KNN classifier. This multi-classifier DPR implementation achieved at least ~8x reduction in reconfiguration time over the single non-DPR classifier implementation, and occupied less space and hardware resources than having both classifiers. The proposed architecture can be extended to work as an ensemble classifier.
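
    As a software-only point of reference for one of the two classifiers named above, a K-nearest neighbor classifier can be sketched in a few lines. This is a generic textbook version under assumed inputs (Euclidean distance, majority vote, hypothetical training points); it says nothing about how the FPGA design in the paper is organized.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set with two classes.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.2, 0.1)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

    Every prediction scans the whole training set, so the cost grows with both the number of samples and the number of features, which is the kind of workload the abstract proposes to move into hardware.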

  7. Technosciences in Academia: Rethinking a Conceptual Framework for Bioinformatics Undergraduate Curricula

    NASA Astrophysics Data System (ADS)

    Symeonidis, Iphigenia Sofia

    This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty in regards to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinformatics educational needs and goals as expressed by different authorities, five undergraduate bioinformatics degrees case-studies, educational implications of bioinformatics as a technoscience and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.

  8. The World-Wide Web: an interface between research and teaching in bioinformatics.

    PubMed

    Aiton, J F

    1994-10-01

    The rapid expansion occurring in World-Wide Web activity is beginning to make the concepts of 'global hypermedia' and 'universal document readership' realistic objectives of the new revolution in information technology. One consequence of this increase in usage is that educators and students are becoming more aware of the diversity of the knowledge base which can be accessed via the Internet. Although computerised databases and information services have long played a key role in bioinformatics, these same resources can also be used to provide core materials for teaching and learning. The large datasets and archives that have been compiled for biomedical research can be enhanced with the addition of a variety of multimedia elements (images, digital videos, animation etc.). The use of this digitally stored information in structured and self-directed learning environments is likely to increase as activity across the World-Wide Web increases.

  9. Meeting Review: Bioinformatics and Medicine – From Molecules to Humans, Virtual and Real

    PubMed Central

    2002-01-01

    The Industrialization Workshop Series aims to promote and discuss integration, automation, simulation, quality, availability and standards in the high-throughput life sciences. The main issues addressed are the transformation of bioinformatics and bioinformatics-based drug design into a robust discipline in industry, the government, research institutes and academia. The latest workshop emphasized the influence of the post-genomic era on medicine and healthcare with reference to advanced biological systems modeling and simulation, protein structure research, protein-protein interactions, metabolism and physiology. Speakers included Michael Ashburner, Kenneth Buetow, Francois Cambien, Cyrus Chothia, Jean Garnier, Francois Iris, Matthias Mann, Maya Natarajan, Peter Murray-Rust, Richard Mushlin, Barry Robson, David Rubin, Kosta Steliou, John Todd, Janet Thornton, Pim van der Eijk, Michael Vieth and Richard Ward. PMID:18628854

  10. ADN-Viewer: a 3D approach for bioinformatic analyses of large DNA sequences.

    PubMed

    Hérisson, Joan; Ferey, Nicolas; Gros, Pierre-Emmanuel; Gherbi, Rachid

    2007-01-20

    Most biologists work on textual DNA sequences that are limited to the linear representation of DNA. In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of naked DNA allows biologists to observe and analyze genomes in an interactive way at different levels. We developed a powerful software platform that provides a new point of view for sequence analysis: ADNViewer. Nevertheless, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. In order to manage these huge amounts of data in real-time, we designed various scene management algorithms and immersive human-computer interaction for user-friendly data exploration. In addition, one bioinformatics study scenario is proposed.

  11. Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions

    PubMed Central

    2014-01-01

    Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads. PMID:24428920
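
    The problem of detecting low frequency variants in the context of sequencing error can be illustrated with a deliberately simple model. The sketch below assumes that errors at a site are independent and follow a binomial distribution with a known per-base error rate; the function names, the error rate and the significance threshold are hypothetical, and the methods surveyed in the review are more elaborate than this.

```python
import math


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_term)
    return total


def exceeds_error(alt_reads, depth, error_rate, alpha=1e-3):
    """True if the variant read count is unlikely to arise from error alone."""
    return binomial_tail(alt_reads, depth, error_rate) < alpha


# Hypothetical site: 1000x depth and an assumed 1% per-base error rate.
for alt_reads in (12, 40):
    print(alt_reads, exceeds_error(alt_reads, depth=1000, error_rate=0.01))
```

    With about 10 erroneous reads expected at this depth, 12 variant reads are compatible with error alone while 40 are not. A genome-wide scan would also need a correction for the number of sites tested, which this sketch omits.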

  12. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    PubMed

    Ju, Feng; Zhang, Tong

    2015-11-03

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation.

  13. MOWServ: a web client for integration of bioinformatic resources.

    PubMed

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J; Claros, M Gonzalo; Trelles, Oswaldo

    2010-07-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user's tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/.

  14. MOWServ: a web client for integration of bioinformatic resources

    PubMed Central

    Ramírez, Sergio; Muñoz-Mérida, Antonio; Karlsson, Johan; García, Maximiliano; Pérez-Pulido, Antonio J.; Claros, M. Gonzalo; Trelles, Oswaldo

    2010-01-01

    The productivity of any scientist is affected by cumbersome, tedious and time-consuming tasks that try to make the heterogeneous web services compatible so that they can be useful in their research. MOWServ, the bioinformatic platform offered by the Spanish National Institute of Bioinformatics, was released to provide integrated access to databases and analytical tools. Since its release, the number of available services has grown dramatically, and it has become one of the main contributors of registered services in the EMBRACE Biocatalogue. The ontology that enables most of the web-service compatibility has been curated, improved and extended. The service discovery has been greatly enhanced by Magallanes software and biodataSF. User data are securely stored on the main server by an authentication protocol that enables the monitoring of current or already-finished user’s tasks, as well as the pipelining of successive data processing services. The BioMoby standard has been greatly extended with the new features included in the MOWServ, such as management of additional information (metadata such as extended descriptions, keywords and datafile examples), a qualified registry, error handling, asynchronous services and service replication. All of them have increased the MOWServ service quality, usability and robustness. MOWServ is available at http://www.inab.org/MOWServ/ and has a mirror at http://www.bitlab-es.com/MOWServ/. PMID:20525794

  15. Web services at the European Bioinformatics Institute-2009

    PubMed Central

    Mcwilliam, Hamish; Valentin, Franck; Goujon, Mickael; Li, Weizhong; Narayanasamy, Menaka; Martin, Jenny; Miyar, Teresa; Lopez, Rodrigo

    2009-01-01

    The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees the user from web page constraints and is ideal for the analysis of large batches of data, performing text-mining tasks and the casual or systematic evaluation of mathematical models in regulatory networks. Furthermore, these services are widespread and easy to use; they require no prior knowledge of the technology and no more than basic experience in programming. In the following we report on new and updated services and briefly describe planned developments to be made available during the course of 2009–2010. PMID:19435877

  16. Confirming the Factor Structure of the Cognitive Test Anxiety Scale: Comparing the Utility of Three Solutions

    ERIC Educational Resources Information Center

    Cassady, Jerrell C.; Finch, W. Holmes

    2014-01-01

    This study validated the factor structure of a popular assessment of learner's cognitive test anxiety. Following recent findings in a study with Argentinean students' use of the Spanish version of the Cognitive Test Anxiety Scale (CTAS), this study tested the factor structure using data from 742 students who completed the original English version…

  17. A comparative study of structures and structural transitions of secondary transporters with the LeuT fold.

    PubMed

    Jeschke, Gunnar

    2013-03-01

    Secondary active transporters from several protein families share a core of two five-helix inverted repeats that has become known as the LeuT fold. The known high-resolution protein structures with this fold were analyzed by structural superposition of the core transmembrane domains (TMDs). Three angle parameters derived from the mean TMD axes correlate with accessibility of the central binding site from the outside or inside. Structural transitions between distinct conformations were analyzed for four proteins in terms of changes in relative TMD arrangement and in internal conformation of TMDs. Collectively moving groups of TMDs were found to be correlated in the covariance matrix of elastic network models. The main features of the structural transitions can be reproduced with the 5 % slowest normal modes of anisotropic elastic network models. These results support the rocking bundle model for the major conformational change between the outward- and inward-facing states of the protein and point to an important role for the independently moving last TMDs of each repeat in occluding access to the central binding site. Occlusion is also supported by flexing of some individual TMDs in the collectively moving bundle and hash motifs.

  18. Comparative assessment of bone mass and structure using texture-based and histomorphometric analyses

    PubMed Central

    Xiang, Yongqing; Yingling, Vanessa R.; Malique, Rumena; Li, Chao Yang; Schaffler, Mitchell B.; Raphan, Theodore

    2013-01-01

    The purpose of this study was to develop a methodology for quantitatively assessing bone quantity and anisotropy based on texture analysis using Gabor wavelets. The wavelet approach has the capability to simultaneously examine the images at low and high resolutions to gain information on both global and detailed local features of the bone image. The program that implemented the texture analysis gave measures of density (MDensity) and anisotropy (MAnisotropy). It also allowed us to examine the texture energy at four orientations (0°, 45°, 90°, 135°) to gain insight about the details of the anisotropy. Analysis of templates of four simulated patterns, which had the same number of dots but differing orientations, demonstrated how the texture-based analysis differentiated between these templates. The measures of MAnisotropy discriminated between the four simulated patterns. The MDensity measures were similar across all patterns. These outcomes matched the design intent of the simulated patterns. We also compared the trabecular bone images obtained from a previous study, in which the right forelimbs of normal female retired breeder beagle dogs (5–7 years old) were cast for 12 months to induce bone loss, using both histomorphometry and texture analysis. Both histomorphometry and the texture analysis detected significant differences in the trabecular bone of the distal metatarsal between the control and disuse groups. Percent trabecular bone (Tb.Ar/T.Ar) and the textural density parameter (MDensity) were highly correlated (r =0.962). MAnisotropy was decreased (3.9%) after the 12-month disuse protocol, but was not significantly different from normal. However, the texture energy values at all orientations (0°, 45°, 90° and 135°) were significantly decreased in the disuse group. Therefore, texture analysis was able to assess anisotropy, which could not be extracted from histomorphometric parameters. We conclude that texture analysis is an effective tool for

  19. Comparative structural analysis of Bru1 region homeologs in Saccharum spontaneum and S. officinarum

    DOE PAGES

    Zhang, Jisen; Sharma, Anupma; Yu, Qingyi; ...

    2016-06-10

    Sugarcane is a major sugar and biofuel crop, but genomic research and molecular breeding have lagged behind other major crops due to the complexity of auto-allopolyploid genomes. Sugarcane cultivars are frequently aneuploid with chromosome numbers ranging from 100 to 130, consisting of 70-80 % S. officinarum, 10-20 % S. spontaneum, and 10 % recombinants between these two species. Analysis of a genomic region in the progenitor autoploid genomes of sugarcane hybrid cultivars will reveal the nature and divergence of homologous chromosomes. To investigate the origin and evolution of haplotypes in the Bru1 genomic regions in sugarcane cultivars, we identified two BAC clones from S. spontaneum and four from S. officinarum and compared them to seven haplotype sequences from sugarcane hybrid R570. The results clarified the origin of seven homologous haplotypes in R570: four haplotypes originated from S. officinarum, two from S. spontaneum and one was recombinant. With retrotransposon insertions and sequence variations among the homologous haplotypes, sequence divergence ranged from 18.2 % to 60.5 % with an average of 33.7 %. Gene content and gene structure were relatively well conserved among the homologous haplotypes. Exon splitting occurred in haplotypes of the hybrid genome but not in its progenitor genomes. Tajima's D analysis revealed that S. spontaneum haplotypes in the Bru1 genomic regions were under strong directional selection. Numerous inversions, deletions, insertions and translocations were found between haplotypes within each genome. In conclusion, this is the first comparison among haplotypes of a modern sugarcane hybrid and its two progenitors. Tajima's D results emphasized the crucial role of this fungal disease resistance gene for enhancing the fitness of this species and indicated that the brown rust resistance gene in R570 is from S. spontaneum. Species-specific InDel, sequence similarity and phylogenetic analysis of homologous genes can
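
    The abstract does not describe how Tajima's D was computed. As a hedged illustration only, the standard statistic compares the mean number of pairwise differences with the number of segregating sites scaled by a harmonic sum. The sketch below assumes an ungapped alignment of equal-length sequences; the function name and the toy sequences are hypothetical and are not data from the study.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (no gaps assumed)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants of the standard variance estimate.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)

if __name__ == "__main__":
    # Toy alignment in which every variant is carried by a single sequence.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
```

    Negative values indicate an excess of rare variants relative to neutral expectation, and positive values an excess of intermediate-frequency variants; interpreting either as selection requires ruling out demographic explanations.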

  20. A Comparative Taxonomy of Parallel Algorithms for RNA Secondary Structure Prediction

    PubMed Central

    Al-Khatib, Ra’ed M.; Abdullah, Rosni; Rashid, Nur’Aini Abdul

    2010-01-01

    RNA molecules have been found to play crucial roles in numerous biological and medical procedures and processes. RNA structure determination has become a major problem in biology. Recently, computer scientists have provided biologists with RNA secondary structures that ease the understanding of RNA functions and roles. Detecting RNA secondary structure is an NP-hard problem, especially in pseudoknotted RNA structures. The detection process is also time-consuming; as a result, an alternative approach such as using parallel architectures is a desirable option. The main goal of this paper is to investigate intensively the parallel methods used in the literature to address the demanding issues related to RNA secondary structure prediction methods. Then, we introduce a new taxonomy for the parallel RNA folding methods. Based on this proposed taxonomy, a systematic and scientific comparison is performed among these existing methods. PMID:20458364
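
    To illustrate the kind of computation that parallel folding methods divide up, here is a minimal sketch of Nussinov-style base-pair maximization for pseudoknot-free structures. It is a classic textbook dynamic program chosen for illustration, not a method taken from this paper's taxonomy; the function name and the minimum loop length of 3 are assumptions.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (pseudoknots are not modelled).

    min_loop is the assumed minimum number of unpaired bases enclosed by a pair.
    """
    can_pair = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    # Fill by increasing span. Every cell with the same span depends only on
    # cells with smaller spans, so one anti-diagonal could be computed in parallel.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                      # case: base j stays unpaired
            for k in range(i, j - min_loop):         # case: base j pairs with base k
                if (seq[k], seq[j]) in can_pair:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

    The serial version takes cubic time in the sequence length, which is one reason parallel architectures are attractive even for the pseudoknot-free case.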