OpenHelix: bioinformatics education outside of a different box.
Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C
2010-11-01
The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review.
OpenHelix: bioinformatics education outside of a different box
Mangan, Mary E.; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C.
2010-01-01
The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181
Honts, Jerry E.
2003-01-01
Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489
Development of a cloud-based Bioinformatics Training Platform.
Revote, Jerico; Watson-Haigh, Nathan S; Quenette, Steve; Bethwaite, Blair; McGrath, Annette; Shang, Catherine A
2017-05-01
The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. © The Author 2016. Published by Oxford University Press.
Development of a cloud-based Bioinformatics Training Platform
Revote, Jerico; Watson-Haigh, Nathan S.; Quenette, Steve; Bethwaite, Blair; McGrath, Annette
2017-01-01
The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. PMID:27084333
Integer Linear Programming in Computational Biology
NASA Astrophysics Data System (ADS)
Althaus, Ernst; Klau, Gunnar W.; Kohlbacher, Oliver; Lenhof, Hans-Peter; Reinert, Knut
Computational molecular biology (bioinformatics) is a young research field that is rich in NP-hard optimization problems. The problem instances encountered are often huge and comprise thousands of variables. Since their introduction into the field of bioinformatics in 1997, integer linear programming (ILP) techniques have been successfully applied to many optimization problems. These approaches have added much momentum to development and progress in related areas. In particular, ILP-based approaches have become a standard optimization technique in bioinformatics. In this review, we present applications of ILP-based techniques developed by members and former members of Kurt Mehlhorn’s group. These techniques were introduced to bioinformatics in a series of papers and popularized by demonstration of their effectiveness and potential.
ERIC Educational Resources Information Center
Kovarik, Dina N.; Patterson, Davis G.; Cohen, Carolyn; Sanders, Elizabeth A.; Peterson, Karen A.; Porter, Sandra G.; Chowning, Jeanne Ting
2013-01-01
We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The…
Agyei, Dominic; Tsopmo, Apollinaire; Udenigwe, Chibuike C
2018-06-01
There are emerging advancements in the strategies used for the discovery and development of food-derived bioactive peptides because of their multiple food and health applications. Bioinformatics and peptidomics are two computational and analytical techniques that have the potential to speed up the development of bioactive peptides from bench to market. Structure-activity relationships observed in peptides form the basis for bioinformatics and in silico prediction of bioactive sequences encrypted in food proteins. Peptidomics, on the other hand, relies on "hyphenated" (liquid chromatography-mass spectrometry-based) techniques for the detection, profiling, and quantitation of peptides. Together, bioinformatics and peptidomics approaches provide a low-cost and effective means of predicting, profiling, and screening bioactive protein hydrolysates and peptides from food. This article discusses the basis, strengths, and limitations of bioinformatics and peptidomics approaches currently used for the discovery and analysis of food-derived bioactive peptides.
Evolving from bioinformatics in-the-small to bioinformatics in-the-large.
Parker, D Stott; Gorlick, Michael M; Lee, Christopher J
2003-01-01
We argue the significance of a fundamental shift in bioinformatics, from in-the-small to in-the-large. Adopting a large-scale perspective is a way to manage the problems endemic to the world of the small: constellations of incompatible tools for which the effort required to assemble an integrated system exceeds the perceived benefit of the integration. Where bioinformatics in-the-small is about data and tools, bioinformatics in-the-large is about metadata and dependencies. Dependencies represent the complexities of large-scale integration, including the requirements and assumptions governing the composition of tools. The popular make utility is a very effective system for defining and maintaining simple dependencies, and it offers a number of insights about the essence of bioinformatics in-the-large. Keeping an in-the-large perspective has been very useful to us in large bioinformatics projects. We give two fairly different examples, and extract lessons from them showing how it has helped. These examples both suggest the benefit of explicitly defining and managing knowledge flows and knowledge maps (which represent metadata regarding types, flows, and dependencies), and also suggest approaches for developing bioinformatics database systems. Generally, we argue that large-scale engineering principles can be successfully adapted from disciplines such as software engineering and data management, and that having an in-the-large perspective will be a key advantage in the next phase of bioinformatics development.
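The make-style dependency idea in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the paper: the file names and the helpers `build_order` and `stale_targets` are hypothetical, and the standard-library `graphlib.TopologicalSorter` (Python 3.9+) stands in for make's dependency resolution.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each target maps to the set of targets it depends on.
dependencies = {
    "alignment.bam": {"reads.fastq", "genome.fa"},
    "variants.vcf": {"alignment.bam", "genome.fa"},
    "report.html": {"variants.vcf"},
}


def build_order(deps):
    """Return all targets so that every prerequisite precedes its dependents."""
    return list(TopologicalSorter(deps).static_order())


def stale_targets(deps, changed):
    """Return the targets that must be rebuilt when the files in `changed` are modified."""
    stale = set(changed)
    # Walking in build order guarantees prerequisites are classified first.
    for target in build_order(deps):
        if deps.get(target, set()) & stale:
            stale.add(target)
    return stale - set(changed)
```

Calling `stale_targets(dependencies, {"genome.fa"})` marks every downstream target for rebuilding, whereas changing only `variants.vcf` invalidates just the report. The point of the sketch is that the dependency map (metadata), rather than any individual tool, determines what work needs to be redone.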
H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa
Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu
2016-01-01
The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985
Bonnal, Raoul J P; Aerts, Jan; Githinji, George; Goto, Naohisa; MacLean, Dan; Miller, Chase A; Mishima, Hiroyuki; Pagani, Massimiliano; Ramirez-Gonzalez, Ricardo; Smant, Geert; Strozzi, Francesco; Syme, Rob; Vos, Rutger; Wennblom, Trevor J; Woodcroft, Ben J; Katayama, Toshiaki; Prins, Pjotr
2012-04-01
Biogem provides a software development environment for the Ruby programming language, which encourages community-based software development for bioinformatics while lowering the barrier to entry and encouraging best practices. Biogem, with its targeted modular and decentralized approach, software generator, tools and tight web integration, is an improved general model for scaling up collaborative open source software development in bioinformatics. Biogem and modules are free and are OSS. Biogem runs on all systems that support recent versions of Ruby, including Linux, Mac OS X and Windows. Further information at http://www.biogems.info. A tutorial is available at http://www.biogems.info/howto.html. Contact: bonnal@ingm.org.
Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari
2014-01-01
Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484
Assessing an effective undergraduate module teaching applied bioinformatics to biology students
2018-01-01
Applied bioinformatics skills are becoming ever more indispensable for biologists, yet incorporation of these skills into the undergraduate biology curriculum is lagging behind, in part due to a lack of instructors willing and able to teach basic bioinformatics in classes that don’t specifically focus on quantitative skill development, such as statistics or computer sciences. To help undergraduate course instructors who themselves did not learn bioinformatics as part of their own education and are hesitant to plunge into teaching big data analysis, a module was developed that is written in plain-enough language, using publicly available computing tools and data, to allow novice instructors to teach next-generation sequence analysis to upper-level undergraduate students. To determine if the module allowed students to develop a better understanding of and appreciation for applied bioinformatics, various tools were developed and employed to assess the impact of the module. This article describes both the module and its assessment. Students found the activity valuable for their education and, in focus group discussions, emphasized that they saw a need for more and earlier instruction of big data analysis as part of the undergraduate biology curriculum. PMID:29324777
Scalability and Validation of Big Data Bioinformatics Software.
Yang, Andrian; Troup, Michael; Ho, Joshua W K
2017-01-01
This review examines two important aspects that are central to modern big data bioinformatics analysis - software scalability and validity. We argue that not only are the issues of scalability and validation common to all big data bioinformatics analyses, they can be tackled by conceptually related methodological approaches, namely divide-and-conquer (scalability) and multiple executions (validation). Scalability is defined as the ability for a program to scale based on workload. It has always been an important consideration when developing bioinformatics algorithms and programs. Nonetheless the surge of volume and variety of biological and biomedical data has posed new challenges. We discuss how modern cloud computing and big data programming frameworks such as MapReduce and Spark are being used to effectively implement divide-and-conquer in a distributed computing environment. Validation of software is another important issue in big data bioinformatics that is often ignored. Software validation is the process of determining whether the program under test fulfils the task for which it was designed. Determining the correctness of the computational output of big data bioinformatics software is especially difficult due to the large input space and complex algorithms involved. We discuss how state-of-the-art software testing techniques that are based on the idea of multiple executions, such as metamorphic testing, can be used to implement an effective bioinformatics quality assurance strategy. We hope this review will raise awareness of these critical issues in bioinformatics.
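The multiple-executions idea behind metamorphic testing, mentioned in the abstract above, can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not code from the review: `gc_content` is a hypothetical program under test, and the three metamorphic relations (reverse complement, shuffling, and self-concatenation leave GC content unchanged) were chosen for this sketch.

```python
import random


def gc_content(seq: str) -> float:
    """Fraction of G/C bases in a DNA sequence (hypothetical program under test)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def metamorphic_check(seq: str, tol: float = 1e-12) -> bool:
    """Run the program on a source input and on follow-up inputs, then compare outputs.

    No expected value is needed: only the relation between executions is checked.
    """
    baseline = gc_content(seq)
    shuffled = "".join(random.sample(seq, len(seq)))
    follow_ups = [reverse_complement(seq), shuffled, seq + seq]
    return all(abs(gc_content(f) - baseline) <= tol for f in follow_ups)
```

A call such as `metamorphic_check("ACGTGGCCAT")` passes if all follow-up executions agree with the first one. A violated relation signals a defect even when the correct output for the original input is unknown, which is the situation the review describes for large input spaces.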
Yan, Qing
2010-01-01
Bioinformatics is the rational study at an abstract level that can influence the way we understand biomedical facts and the way we apply the biomedical knowledge. Bioinformatics is facing challenges in helping with finding the relationships between genetic structures and functions, analyzing genotype-phenotype associations, and understanding gene-environment interactions at the systems level. One of the most important issues in bioinformatics is data integration. The data integration methods introduced here can be used to organize and integrate both public and in-house data. With the volume of data and the high complexity, computational decision support is essential for integrative transporter studies in pharmacogenomics, nutrigenomics, epigenetics, and systems biology. For the development of such a decision support system, object-oriented (OO) models can be constructed using the Unified Modeling Language (UML). A methodology is developed to build biomedical models at different system levels and construct corresponding UML diagrams, including use case diagrams, class diagrams, and sequence diagrams. By OO modeling using UML, the problems of transporter pharmacogenomics and systems biology can be approached from different angles with a more complete view, which may greatly enhance the efforts in effective drug discovery and development. Bioinformatics resources of membrane transporters and general bioinformatics databases and tools that are frequently used in transporter studies are also collected here. An informatics decision support system based on the models presented here is available at http://www.pharmtao.com/transporter. The methodology developed here can also be used for other biomedical fields.
Rein, Diane C.
2006-01-01
Setting: Purdue University is a major agricultural, engineering, biomedical, and applied life science research institution with an increasing focus on bioinformatics research that spans multiple disciplines and campus academic units. The Purdue University Libraries (PUL) hired a molecular biosciences specialist to discover, engage, and support bioinformatics needs across the campus. Program Components: After an extended period of information needs assessment and environmental scanning, the specialist developed a week of focused bioinformatics instruction (Bioinformatics Week) to launch system-wide, library-based bioinformatics services. Evaluation Mechanisms: The specialist employed a two-tiered approach to assess user information requirements and expectations. The first phase involved careful observation and collection of information needs in-context throughout the campus, attending laboratory meetings, interviewing department chairs and individual researchers, and engaging in strategic planning efforts. Based on the information gathered during the integration phase, several survey instruments were developed to facilitate more critical user assessment and the recovery of quantifiable data prior to planning. Next Steps/Future Directions: Given information gathered while working with clients and through formal needs assessments, as well as the success of instructional approaches used in Bioinformatics Week, the specialist is developing bioinformatics support services for the Purdue community. The specialist is also engaged in training PUL faculty librarians in bioinformatics to provide a sustaining culture of library-based bioinformatics support and understanding of Purdue's bioinformatics-related decision and policy making. PMID:16888666
Rein, Diane C
2006-07-01
Purdue University is a major agricultural, engineering, biomedical, and applied life science research institution with an increasing focus on bioinformatics research that spans multiple disciplines and campus academic units. The Purdue University Libraries (PUL) hired a molecular biosciences specialist to discover, engage, and support bioinformatics needs across the campus. After an extended period of information needs assessment and environmental scanning, the specialist developed a week of focused bioinformatics instruction (Bioinformatics Week) to launch system-wide, library-based bioinformatics services. The specialist employed a two-tiered approach to assess user information requirements and expectations. The first phase involved careful observation and collection of information needs in-context throughout the campus, attending laboratory meetings, interviewing department chairs and individual researchers, and engaging in strategic planning efforts. Based on the information gathered during the integration phase, several survey instruments were developed to facilitate more critical user assessment and the recovery of quantifiable data prior to planning. Given information gathered while working with clients and through formal needs assessments, as well as the success of instructional approaches used in Bioinformatics Week, the specialist is developing bioinformatics support services for the Purdue community. The specialist is also engaged in training PUL faculty librarians in bioinformatics to provide a sustaining culture of library-based bioinformatics support and understanding of Purdue's bioinformatics-related decision and policy making.
Kovarik, Dina N; Patterson, Davis G; Cohen, Carolyn; Sanders, Elizabeth A; Peterson, Karen A; Porter, Sandra G; Chowning, Jeanne Ting
2013-01-01
We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre-post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers.
Kovarik, Dina N.; Patterson, Davis G.; Cohen, Carolyn; Sanders, Elizabeth A.; Peterson, Karen A.; Porter, Sandra G.; Chowning, Jeanne Ting
2013-01-01
We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre–post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers. PMID:24006393
Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource
ERIC Educational Resources Information Center
Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc
2005-01-01
EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…
The 2017 Bioinformatics Open Source Conference (BOSC)
Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Munoz-Torres, Monica; Tzovaras, Bastian Greshake; Wiencko, Heather
2017-01-01
The Bioinformatics Open Source Conference (BOSC) is a meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. The 18th annual BOSC (http://www.open-bio.org/wiki/BOSC_2017) took place in Prague, Czech Republic in July 2017. The conference brought together nearly 250 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, open and reproducible science, and this year’s theme, open data. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community, called the OBF Codefest. PMID:29118973
The 2017 Bioinformatics Open Source Conference (BOSC).
Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Munoz-Torres, Monica; Tzovaras, Bastian Greshake; Wiencko, Heather
2017-01-01
The Bioinformatics Open Source Conference (BOSC) is a meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. The 18th annual BOSC (http://www.open-bio.org/wiki/BOSC_2017) took place in Prague, Czech Republic in July 2017. The conference brought together nearly 250 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, open and reproducible science, and this year's theme, open data. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community, called the OBF Codefest.
ERIC Educational Resources Information Center
Almeida, Craig A.; Tardiff, Daniel F.; De Luca, Jane P.
2004-01-01
We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces the understanding of the structure of a gene and the principles and events involved in its expression. In addition, the activity illustrates the severe effect mutations in a gene sequence can have on the protein product.…
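The kind of effect the exercise above refers to, a mutation in a gene sequence changing the protein product, can be illustrated with a minimal Python sketch. It is not the authors' exercise: the 12-base open reading frame and the `translate` helper are hypothetical, and only the standard genetic code table is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene and three single-site mutations of it.
wild_type = "ATGGAGAAGTAA"   # ATG GAG AAG TAA
missense = "ATGGTGAAGTAA"    # GAG -> GTG changes one amino acid
nonsense = "ATGGAGTAGTAA"    # AAG -> TAG introduces a premature stop
frameshift = "ATGAGAAGTAA"   # deleting one G shifts the reading frame
```

Translating the four sequences shows that a single-base substitution can alter one residue or truncate the protein, while a single-base deletion changes every codon downstream of it.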
GLAD: a system for developing and deploying large-scale bioinformatics grid.
Teo, Yong-Meng; Wang, Xianbing; Ng, Yew-Kwong
2005-03-01
Grid computing is used to solve large-scale bioinformatics problems involving gigabyte-scale databases by distributing the computation across multiple platforms. Until now, developing bioinformatics grid applications has been extremely tedious: developers must design and implement the component algorithms and parallelization techniques for different classes of problems, and access remotely located sequence database files of varying formats across the grid. In this study, we propose a grid programming toolkit, GLAD (Grid Life sciences Applications Developer), which facilitates the development and deployment of bioinformatics applications on a grid. GLAD has been developed using ALiCE (Adaptive scaLable Internet-based Computing Engine), a Java-based grid middleware that exploits task-based parallelism. Two bioinformatics benchmark applications, distributed sequence comparison and distributed progressive multiple sequence alignment, have been developed using GLAD.
The 2016 Bioinformatics Open Source Conference (BOSC).
Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather
2016-01-01
Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.
The 2016 Bioinformatics Open Source Conference (BOSC)
Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather
2016-01-01
Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science. PMID:27781083
Bioinformatics for Exploration
NASA Technical Reports Server (NTRS)
Johnson, Kathy A.
2006-01-01
For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.
Rapid Development of Bioinformatics Education in China
ERIC Educational Resources Information Center
Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang
2003-01-01
As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related…
The 2015 Bioinformatics Open Source Conference (BOSC 2015).
Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica
2016-02-01
The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.
Bioinformatics education in India.
Kulkarni-Kale, Urmila; Sawant, Sangeeta; Chavan, Vishwas
2010-11-01
An account of bioinformatics education in India is presented along with future prospects. The establishment of the BTIS network by the Department of Biotechnology (DBT), Government of India, in the 1980s was a systematic effort to develop bioinformatics infrastructure in India and provide services to the scientific community. Advances in the field of bioinformatics underscored the need for well-trained professionals with skills in information technology and biotechnology. As a result, programmes for capacity building in terms of human resource development were initiated. Educational programmes gradually evolved from the organisation of short-term workshops to the institution of formal diploma/degree programmes. A case study of the Master's degree course offered at the Bioinformatics Centre, University of Pune is discussed. Currently, many universities and institutes offer bioinformatics courses at different levels, with variations in course content and level of detail. The BioInformatics National Certification (BINC) examination, initiated in 2005 by DBT, provides a common yardstick to assess the knowledge and skill sets of students graduating from various institutions. The potential for broadening the scope of bioinformatics to transform it into a data-intensive discovery discipline is discussed. This necessitates amendments to the existing curricula to accommodate upcoming developments.
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
2010-01-01
Background: Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. Description: An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. Conclusions: Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms. PMID:21210976
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics.
Taylor, Ronald C
2010-12-21
Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms.
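The MapReduce programming style named in the abstract above can be illustrated with a small sketch. This is a single-process imitation of the map, shuffle and reduce phases written in plain Python; it does not use Hadoop or any Hadoop API, and the k-mer counting task, the function names and the two toy reads are assumptions chosen for illustration, not material taken from the paper.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map phase: emit a (k-mer, 1) pair for every k-length window of one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as a framework would do
    between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run the three phases in sequence on an in-memory list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())


# Hypothetical toy reads, not real sequencing data.
reads = ["ACGTAC", "GTACGT"]
kmer_counts = count_kmers(reads, 3)
print(kmer_counts)
```

Because each call to map_kmers depends only on its own read, and each call to reduce_counts only on its own key, both phases could in principle be spread across many machines, which is the property the abstract credits for easy parallelization.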
Generative Topic Modeling in Image Data Mining and Bioinformatics Studies
ERIC Educational Resources Information Center
Chen, Xin
2012-01-01
Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…
Bioinformatics education dissemination with an evolutionary problem solving perspective.
Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J
2010-11-01
Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.
Orozco, Allan; Morera, Jessica; Jiménez, Sergio; Boza, Ricardo
2013-09-01
Today, Bioinformatics has become a scientific discipline with great relevance for the Molecular Biosciences and for the Omics sciences in general. Although developed countries have progressed with large strides in Bioinformatics education and research, in other regions, such as Central America, the advances have occurred gradually and with little support from academia, either at the undergraduate or graduate level. To address this problem, the University of Costa Rica's Medical School, a regional leader in Bioinformatics in Central America, has been conducting a series of Bioinformatics workshops, seminars and courses, leading to the creation of the region's first Bioinformatics Master's Degree. The recent creation of the Central American Bioinformatics Network (BioCANET), together with the deployment of a supporting computational infrastructure (HPC Cluster) devoted to providing computing support for Molecular Biology in the region, is laying a foundation for the development of Bioinformatics in the area. Central American bioinformaticians have participated in the creation of, and co-founded, the Iberoamerican Bioinformatics Society (SOIBIO). In this article, we review the most recent activities in education and research in Bioinformatics from several regional institutions. These activities have resulted in further advances for Molecular Medicine, Agriculture and Biodiversity research in Costa Rica and the rest of the Central American countries. Finally, we provide summary information on the first Central America Bioinformatics International Congress, as well as on the creation of the first Bioinformatics company spun off from academia in Central America and the Caribbean (Indromics Bioinformatics).
The 2015 Bioinformatics Open Source Conference (BOSC 2015)
Harris, Nomi L.; Cock, Peter J. A.; Lapp, Hilmar
2016-01-01
The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included “Data Science;” “Standards and Interoperability;” “Open Science and Reproducibility;” “Translational Bioinformatics;” “Visualization;” and “Bioinformatics Open Source Project Updates”. In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled “Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community,” that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653
Bioinformatics goes back to the future.
Miller, Crispin J; Attwood, Teresa K
2003-02-01
The need to turn raw data into knowledge has led the bioinformatics field to focus increasingly on the manipulation of information. By drawing parallels with both cryptography and artificial intelligence, we can develop an understanding of the changes that are occurring in bioinformatics, and how these changes are likely to influence the bioinformatics job market.
ERIC Educational Resources Information Center
Miskowski, Jennifer A.; Howard, David R.; Abler, Michael L.; Grunwald, Sandra K.
2007-01-01
Over the past 10 years, there has been a technical revolution in the life sciences leading to the emergence of a new discipline called bioinformatics. In response, bioinformatics-related topics have been incorporated into various undergraduate courses along with the development of new courses solely focused on bioinformatics. This report describes…
Computational biology and bioinformatics in Nigeria.
Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi
2014-04-01
Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.
Computational Biology and Bioinformatics in Nigeria
Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi
2014-01-01
Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310
Learning nucleic acids solving by bioinformatics problems.
Nunes, Rhewter; Barbosa de Almeida Júnior, Edivaldo; Pessoa Pinto de Menezes, Ivandilson; Malafaia, Guilherme
2015-01-01
The article describes the development of a new approach to teach molecular biology to undergraduate biology students. The 34 students who participated in this research belonged to the first period of the Biological Sciences teaching course of the Instituto Federal Goiano at Urutaí Campus, Brazil. They were registered in Cell Biology in the first semester of 2013. They received four 55 min-long expository/dialogued lectures that covered the content of "structure and functions of nucleic acids". Later the students were invited to attend four meetings (in a computer laboratory) in which some concepts of Bioinformatics were presented and some problems of the Rosalind platform were solved. The observations we report here are very useful as a broad groundwork for developing new research. An interesting possibility is research into the effects of bioinformatics interventions that improve molecular biology learning. © 2015 The International Union of Biochemistry and Molecular Biology.
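The abstract above does not say which Rosalind problems the students solved. As an assumption, the sketch below shows three of the platform's introductory nucleic-acid exercises (counting nucleotides, transcribing DNA to RNA, and taking the reverse complement), to give a sense of the kind of task involved. The function names are illustrative.

```python
def count_nucleotides(dna):
    """Return the counts of A, C, G and T, in that order."""
    return tuple(dna.count(base) for base in "ACGT")


def transcribe(dna):
    """Transcribe a DNA coding strand into RNA by replacing T with U."""
    return dna.replace("T", "U")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna.translate(complement)[::-1]


print(count_nucleotides("AGCTTTTCA"))
print(transcribe("GATGGAACT"))
print(reverse_complement("AAAACCCGGT"))
```

Each function maps directly onto a concept from the lectures on the structure and functions of nucleic acids: base composition, transcription, and the antiparallel pairing of complementary strands.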
Wightman, Bruce; Hark, Amy T
2012-01-01
The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative. Copyright © 2012 Wiley Periodicals, Inc.
Bioinformatics Education—Perspectives and Challenges out of Africa
Adebiyi, Ezekiel F.; Alzohairy, Ahmed M.; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J.; Panji, Sumir; Patterton, Hugh-G.
2015-01-01
The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350
ERIC Educational Resources Information Center
Wightman, Bruce; Hark, Amy T.
2012-01-01
The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…
Application of machine learning methods in bioinformatics
NASA Astrophysics Data System (ADS)
Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen
2018-05-01
With the development of bioinformatics, high-throughput genomic technologies have enabled biology to enter the era of big data [1]. Bioinformatics is an interdisciplinary field that covers the acquisition, management, analysis, interpretation and application of biological information. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets [2]. This paper analyzes and compares various machine learning algorithms and their applications in bioinformatics.
The growing need for microservices in bioinformatics.
Williams, Christopher L; Sica, Jeffrey C; Killen, Robert T; Balis, Ulysses G J
2016-01-01
Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Bioinformatics relies on a nimble IT framework which can adapt to changing requirements. The aim is to present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Use of the microservices framework is an effective methodology for the fabrication and implementation of reliable and innovative software, made possible in a highly collaborative setting.
The growing need for microservices in bioinformatics
Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.
2016-01-01
Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on a nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Conclusions: Use of the microservices framework is an effective methodology for the fabrication and implementation of reliable and innovative software, made possible in a highly collaborative setting. PMID:27994937
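To make the idea of a service that is "limited in functional scope" concrete, here is a minimal sketch of one such service built only on the Python standard library. It is not taken from the paper: the GC-content task, the /gc endpoint, the JSON response shape and the names gc_content, GCHandler and make_server are all hypothetical choices for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def gc_content(seq):
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


class GCHandler(BaseHTTPRequestHandler):
    """Hypothetical single-purpose service: answers GET /gc?seq=... only."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/gc":
            self.send_error(404, "unknown endpoint")
            return
        seq = parse_qs(url.query).get("seq", [""])[0]
        body = json.dumps({"length": len(seq), "gc": gc_content(seq)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def make_server(port=0):
    """Bind the service to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), GCHandler)
```

Calling make_server(8080).serve_forever() would start the service; a pipeline step could then request http://127.0.0.1:8080/gc?seq=GGCCAT without knowing how the calculation is implemented. Keeping the computation in gc_content, separate from the HTTP handler, means it can be tested or replaced without touching other services.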
The Topology Prediction of Membrane Proteins: A Web-Based Tutorial.
Kandemir-Cavas, Cagin; Cavas, Levent; Alyuruk, Hakan
2018-06-01
There is a great need for development of educational materials on the transfer of current bioinformatics knowledge to undergraduate students in bioscience departments. This study aims to prepare an example in silico laboratory tutorial on the topology prediction of membrane proteins by bioinformatics tools. This laboratory tutorial is prepared for biochemistry lessons at bioscience departments (biology, chemistry, biochemistry, molecular biology and genetics, and faculty of medicine). The tutorial is intended for students who have not yet taken a bioinformatics course or who have already taken an introductory bioinformatics course. The tutorial is based on step-by-step explanations with illustrations. It can be applied under supervision of an instructor in the lessons, or it can be used as a self-study guide by students. In the tutorial, membrane-spanning regions and α-helices of membrane proteins were predicted by internet-based bioinformatics tools. According to the results obtained from these tools, the algorithms and parameters used affected the accuracy of prediction. The importance of this laboratory tutorial lies in the fact that it provides an introduction to bioinformatics and also demonstrates an in silico laboratory application to students in the natural sciences. The presented example education material is easily applicable in any department that has an internet connection. This study presents an alternative education material to the students in biochemistry laboratories in addition to classical laboratory experiments.
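The abstract above does not name the web tools used in the tutorial. As a stand-in, the sketch below applies the classic Kyte-Doolittle hydropathy scale with a sliding window, one of the simplest ways to flag candidate membrane-spanning segments. The window of 19 residues and threshold of 1.6 follow the commonly cited Kyte-Doolittle guidance; the toy sequence and function names are hypothetical, and this is not a substitute for the dedicated predictors the tutorial relies on.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_segments(seq, window=19, threshold=1.6):
    """Return merged (start, end) ranges, 0-based and half-open, covered by
    windows whose mean hydropathy reaches the threshold."""
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score < threshold:
            continue
        start, end = i, i + window
        if segments and start <= segments[-1][1]:
            # Overlaps or touches the previous hit: extend it.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Hypothetical toy sequence: a hydrophobic core flanked by charged residues.
toy = "D" * 10 + "L" * 20 + "D" * 10
print(candidate_segments(toy))
print(candidate_segments(toy, threshold=2.5))
```

Running the two calls with different thresholds returns segments of different lengths for the same sequence, a small-scale version of the abstract's observation that the chosen algorithm and parameters affect prediction accuracy.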
Development of Bioinformatics Infrastructure for Genomics Research.
Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem
2017-06-01
Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. 
In addition, new tools have been developed for analysis of the uniquely divergent African data and for downstream interpretation of prioritized variants. To provide support for these and other bioinformatics queries, an online bioinformatics helpdesk backed by broad consortium expertise has been established. Further support is provided by means of various modes of bioinformatics training. For the past 4 years, the development of infrastructure support and human capacity through H3ABioNet has significantly contributed to the establishment of African scientific networks, data analysis facilities, and training programs. Here, we describe the infrastructure and how it has affected genomics and bioinformatics research in Africa. Copyright © 2017 World Heart Federation (Geneva). Published by Elsevier B.V. All rights reserved.
Schneider, Maria Victoria; Griffin, Philippa C; Tyagi, Sonika; Flannery, Madison; Dayalan, Saravanan; Gladman, Simon; Watson-Haigh, Nathan; Bayer, Philipp E; Charleston, Michael; Cooke, Ira; Cook, Rob; Edwards, Richard J; Edwards, David; Gorse, Dominique; McConville, Malcolm; Powell, David; Wilkins, Marc R; Lonie, Andrew
2017-06-30
EMBL Australia Bioinformatics Resource (EMBL-ABR) is a developing national research infrastructure, providing bioinformatics resources and support to life science and biomedical researchers in Australia. EMBL-ABR comprises 10 geographically distributed national nodes with one coordinating hub, with current funding provided through Bioplatforms Australia and the University of Melbourne for its initial 2-year development phase. The EMBL-ABR mission is to: (1) increase Australia's capacity in bioinformatics and data sciences; (2) contribute to the development of training in bioinformatics skills; (3) showcase Australian data sets at an international level and (4) enable engagement in international programs. The activities of EMBL-ABR are focussed in six key areas, aligning with comparable international initiatives such as ELIXIR, CyVerse and NIH Commons. These key areas (Tools, Data, Standards, Platforms, Compute and Training) are described in this article. © The Author 2017. Published by Oxford University Press.
Mello, Luciane V.; Tregilgas, Luke; Cowley, Gwen; Gupta, Anshul; Makki, Fatima; Jhutty, Anjeet; Shanmugasundram, Achchuthan
2017-01-01
Teaching bioinformatics is a longstanding challenge for educators who need to demonstrate to students how skills developed in the classroom may be applied to real world research. This study employed an action research methodology which utilised student–staff partnership and peer-learning. It was centred on the experiences of peer-facilitators, students who had previously taken a postgraduate bioinformatics module, and had applied knowledge and skills gained from it to their own research. It aimed to demonstrate to peer-receivers, current students, how bioinformatics could be used in their own research while developing peer-facilitators’ teaching and mentoring skills. This student-centred approach was well received by the peer-receivers, who claimed to have gained improved understanding of bioinformatics and its relevance to research. Equally, peer-facilitators also developed a better understanding of the subject and appreciated that the activity was a rare and invaluable opportunity to develop their teaching and mentoring skills, enhancing their employability. PMID:29098185
Carving a niche: establishing bioinformatics collaborations
Lyon, Jennifer A.; Tennant, Michele R.; Messner, Kevin R.; Osterbur, David L.
2006-01-01
Objectives: The paper describes collaborations and partnerships developed between library bioinformatics programs and other bioinformatics-related units at four academic institutions. Methods: A call for information on bioinformatics partnerships was made via email to librarians who have participated in the National Center for Biotechnology Information's Advanced Workshop for Bioinformatics Information Specialists. Librarians from Harvard University, the University of Florida, the University of Minnesota, and Vanderbilt University responded and expressed willingness to contribute information on their institutions, programs, services, and collaborating partners. Similarities and differences in programs and collaborations were identified. Results: The four librarians have developed partnerships with other units on their campuses that can be categorized into the following areas: knowledge management, instruction, and electronic resource support. All primarily support freely accessible electronic resources, while other campus units deal with fee-based ones. These demarcations are apparent in resource provision as well as in subsequent support and instruction. Conclusions and Recommendations: Through environmental scanning and networking with colleagues, librarians who provide bioinformatics support can develop fruitful collaborations. Visibility is key to building collaborations, as is broad-based thinking in terms of potential partners. PMID:16888668
Component-Based Approach for Educating Students in Bioinformatics
ERIC Educational Resources Information Center
Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.
2009-01-01
There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…
Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad
Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms.
Moreover, effectively integrating bioinformatics into courses or independent research projects requires infrastructure for organizing and assessing student work. Here, we present a new platform for faculty to keep current with the rapidly changing field of bioinformatics, the Integrated Microbial Genomes Annotation Collaboration Toolkit (IMG-ACT). It was developed by instructors from both research-intensive and predominately undergraduate institutions in collaboration with the Department of Energy-Joint Genome Institute (DOE-JGI) as a means to innovate and update undergraduate education and faculty development. The IMG-ACT program provides a cadre of tools, including access to a clearinghouse of genome sequences, bioinformatics databases, data storage, instructor course management, and student notebooks for organizing the results of their bioinformatic investigations. In the process, IMG-ACT makes it feasible to provide undergraduate research opportunities to a greater number and diversity of students, in contrast to the traditional mentor-to-student apprenticeship model for undergraduate research, which can be too expensive and time-consuming to provide for every undergraduate. The IMG-ACT serves as the hub for the network of faculty and students that use the system for microbial genome analysis. Open access of the IMG-ACT infrastructure to participating schools ensures that all types of higher education institutions can utilize it. With the infrastructure in place, faculty can focus their efforts on the pedagogy of bioinformatics, involvement of students in research, and use of this tool for their own research agenda. What the original faculty members of the IMG-ACT development team present here is an overview of how the IMG-ACT program has affected our development in terms of teaching and research with the hopes that it will inspire more faculty to get involved.
Revote, Jerico; Suchecki, Radosław; Tyagi, Sonika; Corley, Susan M.; Shang, Catherine A.; McGrath, Annette
2017-01-01
There is a clear demand for hands-on bioinformatics training. The development of bioinformatics workshop content is both time-consuming and expensive. Therefore, enabling trainers to develop bioinformatics workshops in a way that facilitates reuse is becoming increasingly important. The most widespread practice for sharing workshop content is through making PDF, PowerPoint and Word documents available online. While this effort is to be commended, such content is usually not so easy to reuse or repurpose and does not capture all the information required for a third party to rerun a workshop. We present an open, collaborative framework for developing and maintaining reusable and shareable hands-on training workshop content. PMID:26984618
BIAS: Bioinformatics Integrated Application Software.
Finak, G; Godin, N; Hallett, M; Pepin, F; Rajabi, Z; Srivastava, V; Tang, Z
2005-04-15
We introduce a development platform especially tailored to Bioinformatics research and software development. BIAS (Bioinformatics Integrated Application Software) provides the tools necessary for carrying out integrative Bioinformatics research requiring multiple datasets and analysis tools. It follows an object-relational strategy for providing persistent objects, allows third-party tools to be easily incorporated within the system and supports standards and data-exchange protocols common to Bioinformatics. BIAS is an open-source project and is freely available to all interested users at http://www.mcb.mcgill.ca/~bias/. The website also hosts a paper with a more detailed description of BIAS and a sample implementation of a Bayesian network approach for the simultaneous prediction of gene regulation events and of mRNA expression from combinations of gene regulation events. Contact: hallett@mcb.mcgill.ca.
Influenza research database: an integrated bioinformatics resource for influenza virus research
USDA-ARS's Scientific Manuscript database
The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics, an...
Cancer Bioinformatics for Updating Anticancer Drug Developments and Personalized Therapeutics.
Lu, Da-Yong; Qu, Rong-Xin; Lu, Ting-Ren; Wu, Hong-Ying
2017-01-01
Over the last two to three decades, the world has witnessed rapid progress in biomarker and bioinformatics technologies. Cancer bioinformatics is one such important omics branch for experimental and clinical studies and applications. Like other biological techniques or systems, bioinformatics techniques will be widely used, but they are not presently omnipotent. Despite great popularity and improvements, cancer bioinformatics has its own limitations and shortcomings at this stage of technical advancement. This article offers a panorama of bioinformatics in cancer research and clinical therapeutic applications, including possible advantages and limitations relating to cancer therapeutics. Many beneficial capabilities and outcomes have been described. As a result, a successful new era for cancer bioinformatics awaits if we adhere to scientific studies of cancer bioinformatics in malignant-origin mining, medical verification and clinical diagnostic applications. Cancer bioinformatics is of great significance in disease diagnosis and therapeutic prediction. Many creative ideas and future perspectives are highlighted. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.
2012-01-01
Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832
NASA Astrophysics Data System (ADS)
Balqis, Widodo, Lukiati, Betty; Amin, Mohamad
2017-05-01
A way to improve the quality of learning in the course of Plant Metabolism in the Department of Biology, State University of Malang, is to develop teaching materials. This research evaluates the need for bioinformatics-based teaching material in the Plant Metabolism course using the Analyze, Design, Develop, Implement, and Evaluate (ADDIE) development model. Data were collected through questionnaires distributed to the students in the Plant Metabolism course of the Department of Biology, University of Malang, and through analysis of the semester lecture plan (RPS). The learning outcomes of this course show that bioinformatics is not yet integrated into it. All respondents stated that plant metabolism books do not include bioinformatics and fail to explain the metabolism of chemical compounds of local plants in Indonesia. Respondents thought that bioinformatics can provide examples, explain secondary metabolite analysis techniques, and support discussion of potential medicinal compounds from local plants. As many as 65% of the respondents said that the existing metabolism book could not be used to understand secondary metabolism in lectures on plant metabolism. Therefore, the development of bioinformatics-based teaching materials for plant metabolism is important to improve understanding of the lecture material in plant metabolism.
Bioinformatics Projects Supporting Life-Sciences Learning in High Schools
Marques, Isabel; Almeida, Paulo; Alves, Renato; Dias, Maria João; Godinho, Ana; Pereira-Leal, José B.
2014-01-01
The interdisciplinary nature of bioinformatics makes it an ideal framework to develop activities enabling enquiry-based learning. We describe here the development and implementation of a pilot project to use bioinformatics-based research activities in high schools, called “Bioinformatics@school.” It includes web-based research projects that students can pursue alone or under teacher supervision and a teacher training program. The project is organized so as to enable discussion of key results between students and teachers. After successful trials in two high schools, as measured by questionnaires, interviews, and assessment of knowledge acquisition, the project is expanding by the action of the teachers involved, who are helping us develop more content and are recruiting more teachers and schools. PMID:24465192
The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.
Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young
2016-03-01
Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.
Bioinformatics in protein kinases regulatory network and drug discovery.
Chen, Qingfeng; Luo, Haiqiong; Zhang, Chengqi; Chen, Yi-Ping Phoebe
2015-04-01
Protein kinases have been implicated in a number of diseases, where kinases participate in many aspects that control cell growth, movement and death. Deregulated kinase activities and the knowledge of these disorders are of great clinical interest for drug discovery. The most critical issue is the development of safe and efficient disease diagnosis and treatment at lower cost and in less time. It is critical to develop innovative approaches that aim at the root cause of a disease, not just its symptoms. Bioinformatics, including genetic, genomic, mathematical and computational technologies, has become the most promising option for effective drug discovery, and has shown its potential in the early stages of drug-target identification and target validation. It is essential that these aspects are understood and integrated into new methods used in drug discovery for diseases arising from deregulated kinase activity. This article reviews bioinformatics techniques for protein kinase data management and analysis, kinase pathways and drug targets and describes their potential application in the pharmaceutical industry. Copyright © 2015 Elsevier Inc. All rights reserved.
Biotool2Web: creating simple Web interfaces for bioinformatics applications.
Shahid, Mohammad; Alam, Intikhab; Fuellen, Georg
2006-01-01
Currently there are many bioinformatics applications being developed, but there is no easy way to publish them on the World Wide Web. We have developed a Perl script, called Biotool2Web, which makes the task of creating web interfaces for simple ('home-made') bioinformatics applications quick and easy. Biotool2Web uses an XML document containing the parameters to run the tool on the Web, and generates the corresponding HTML and common gateway interface (CGI) files ready to be published on a web server. This tool is available for download at http://www.uni-muenster.de/Bioinformatics/services/biotool2web/. Contact: Georg Fuellen (fuellen@alum.mit.edu).
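The XML-to-HTML step described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than Perl, and it is not Biotool2Web's code: the XML element and attribute names (`tool`, `param`, `cgi`, `label`), the example tool, and the function `build_form` are all assumptions made up for illustration, since the abstract does not give the actual parameter schema.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical parameter document; the real Biotool2Web schema is not
# described in the abstract, so every name below is an assumption.
EXAMPLE_XML = """
<tool name="revcomp" cgi="/cgi-bin/revcomp.cgi">
  <param name="sequence" type="textarea" label="DNA sequence"/>
  <param name="uppercase" type="checkbox" label="Upper-case output"/>
</tool>
"""


def build_form(xml_text):
    """Turn a tool description into an HTML form that posts to a CGI script."""
    tool = ET.fromstring(xml_text.strip())
    rows = []
    for param in tool.findall("param"):
        name = escape(param.get("name"), quote=True)
        label = escape(param.get("label", param.get("name")))
        kind = param.get("type", "text")
        if kind == "textarea":
            field = f'<textarea name="{name}"></textarea>'
        else:
            field = f'<input type="{escape(kind, quote=True)}" name="{name}">'
        rows.append(f"<label>{label} {field}</label><br>")
    action = escape(tool.get("cgi"), quote=True)
    title = escape(tool.get("name"))
    body = "\n".join(rows)
    return (
        f"<h1>{title}</h1>\n"
        f'<form method="post" action="{action}">\n'
        f"{body}\n"
        '<input type="submit" value="Run">\n'
        "</form>"
    )


if __name__ == "__main__":
    print(build_form(EXAMPLE_XML))
```

The sketch covers only form generation; producing the matching CGI script that reads the submitted fields and runs the tool is left out.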
India's Computational Biology Growth and Challenges.
Chakraborty, Chiranjib; Bandyopadhyay, Sanghamitra; Agoramoorthy, Govindasamy
2016-09-01
India's computational science is growing swiftly due to the rapid expansion of internet and information technology services. The bioinformatics sector of India has been transforming rapidly by creating a competitive position in the global bioinformatics market. Bioinformatics is widely used across India to address a wide range of biological issues. Recently, computational researchers and biologists have been collaborating in projects such as database development, sequence analysis, genomic prospects and algorithm generation. In this paper, we present the Indian computational biology scenario, highlighting bioinformatics-related educational activities, manpower development, internet boom, service industry, research activities, conferences and trainings undertaken by the corporate and government sectors. Nonetheless, this new field of science faces many challenges.
Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology
ERIC Educational Resources Information Center
Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.
2008-01-01
Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…
Zhang, Bai-xia; Li, Jian; Gu, Hao; Li, Qiang; Zhang, Qi; Zhang, Tian-jiao; Wang, Yun; Cai, Cheng-ke
2015-01-01
Owing to its proven clinical efficacy, Shuang-Huang-Lian (SHL) has been developed into a variety of dosage forms. However, in-depth research on the targets and pharmacological mechanisms of SHL preparations has been scarce. In the present study, bioinformatics approaches were adopted to integrate relevant data and biological information. As a result, a PPI network was built and the common topological parameters were characterized. The results suggested that the PPI network of SHL exhibited a scale-free property and modular architecture. The drug target network of SHL was structured with 21 functional modules. According to certain modules and the distribution of pharmacological effects, an antitumor effect and potential drug targets were predicted. A biological network which contained 26 subnetworks was constructed to elucidate the antipneumonia mechanism of SHL. We also extracted the subnetwork to explicitly display the pathway where one effective component acts on the pneumonia related targets. In conclusion, a bioinformatics approach was established for exploring the drug targets, pharmacological activity distribution, effective components of SHL, and its antipneumonia mechanism. Above all, we identified the effective components and disclosed the mechanism of SHL from a systems perspective. PMID:26495421
Zhou, Zhiwei; Xiong, Xin; Zhu, Zheng-Jiang
2017-07-15
In metabolomics, rigorous structural identification of metabolites presents a challenge for bioinformatics. The use of collision cross-section (CCS) values of metabolites derived from ion mobility-mass spectrometry effectively increases the confidence of metabolite identification, but this technique suffers from the limited number of available CCS values. Currently, there is no software available for rapidly generating the metabolites' CCS values. Here, we developed the first web server, namely, MetCCS Predictor, for predicting CCS values. It can predict the CCS values of metabolites using molecular descriptors within a few seconds. Common users with limited background in bioinformatics can benefit from this software and effectively improve metabolite identification in metabolomics. The web server is freely available at: http://www.metabolomics-shanghai.org/MetCCS/. Contact: jiangzhu@sioc.ac.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
ERIC Educational Resources Information Center
Rowe, Laura
2017-01-01
An introductory bioinformatics laboratory experiment focused on protein analysis has been developed that is suitable for undergraduate students in introductory biochemistry courses. The laboratory experiment is designed to be potentially used as a "stand-alone" activity in which students are introduced to basic bioinformatics tools and…
A Summer Program Designed to Educate College Students for Careers in Bioinformatics
ERIC Educational Resources Information Center
Krilowicz, Beverly; Johnston, Wendie; Sharp, Sandra B.; Warter-Perez, Nancy; Momand, Jamil
2007-01-01
A summer program was created for undergraduates and graduate students that teaches bioinformatics concepts, offers skills in professional development, and provides research opportunities in academic and industrial institutions. We estimate that 34 of 38 graduates (89%) are in a career trajectory that will use bioinformatics. Evidence from…
Is there room for ethics within bioinformatics education?
Taneri, Bahar
2011-07-01
When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.
Wren, Jonathan D
2016-09-01
To analyze the relative proportion of bioinformatics papers and their non-bioinformatics counterparts in the top 20 most cited papers annually for the past two decades. When defining bioinformatics papers as encompassing both those that provide software for data analysis or methods underlying data analysis software, we find that over the past two decades, more than a third (34%) of the most cited papers in science were bioinformatics papers, which is approximately a 31-fold enrichment relative to the total number of bioinformatics papers published. More than half of the most cited papers during this span were bioinformatics papers. Yet, the average 5-year JIF of top 20 bioinformatics papers was 7.7, whereas the average JIF for top 20 non-bioinformatics papers was 25.8, significantly higher (P < 4.5 × 10(-29)). The 20-year trend in the average JIF between the two groups suggests the gap does not appear to be significantly narrowing. For a sampling of the journals producing top papers, bioinformatics journals tended to have higher Gini coefficients, suggesting that development of novel bioinformatics resources may be somewhat 'hit or miss'. That is, relative to other fields, bioinformatics produces some programs that are extremely widely adopted and cited, yet there are fewer of intermediate success. Contact: jdwren@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
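For readers unfamiliar with the Gini coefficient mentioned in this abstract, the following is a minimal sketch of the standard calculation. It assumes the coefficient is taken over per-paper citation counts within a journal; the abstract does not state the exact procedure used, and the function name `gini` and the sample numbers are illustrative only.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even).

    Uses the sorted-rank form
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted in ascending order and ranks i starting at 1.
    """
    xs = sorted(values)
    if any(x < 0 for x in xs):
        raise ValueError("values must be non-negative")
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Made-up citation counts: one 'hit' paper versus an even spread.
    print(gini([0, 0, 0, 10]))
    print(gini([1, 1, 1, 1]))
```

A journal in which a few papers collect most of the citations gives a value near the upper bound of (n - 1) / n, while an even spread gives a value near 0, which is the 'hit or miss' pattern the abstract describes.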
Broad issues to consider for library involvement in bioinformatics
Geer, Renata C.
2006-01-01
Background: The information landscape in biological and medical research has grown far beyond literature to include a wide variety of databases generated by research fields such as molecular biology and genomics. The traditional role of libraries to collect, organize, and provide access to information can expand naturally to encompass these new data domains. Methods: This paper discusses the current and potential role of libraries in bioinformatics using empirical evidence and experience from eleven years of work in user services at the National Center for Biotechnology Information. Findings: Medical and science libraries over the last decade have begun to establish educational and support programs to address the challenges users face in the effective and efficient use of a plethora of molecular biology databases and retrieval and analysis tools. As more libraries begin to establish a role in this area, the issues they face include assessment of user needs and skills, identification of existing services, development of plans for new services, recruitment and training of specialized staff, and establishment of collaborations with bioinformatics centers at their institutions. Conclusions: Increasing library involvement in bioinformatics can help address information needs of a broad range of students, researchers, and clinicians and ultimately help realize the power of bioinformatics resources in making new biological discoveries. PMID:16888662
Protein Bioinformatics Databases and Resources
Chen, Chuming; Huang, Hongzhan; Wu, Cathy H.
2017-01-01
Many publicly available data repositories and resources have been developed to support protein related information management, data-driven hypothesis generation and biological knowledge discovery. To help researchers quickly find the appropriate protein related informatics resources, we present a comprehensive review (with categorization and description) of major protein bioinformatics databases in this chapter. We also discuss the challenges and opportunities for developing next-generation protein bioinformatics databases and resources to support data integration and data analytics in the Big Data era. PMID:28150231
Bioconductor: open software development for computational biology and bioinformatics
Gentleman, Robert C; Carey, Vincent J; Bates, Douglas M; Bolstad, Ben; Dettling, Marcel; Dudoit, Sandrine; Ellis, Byron; Gautier, Laurent; Ge, Yongchao; Gentry, Jeff; Hornik, Kurt; Hothorn, Torsten; Huber, Wolfgang; Iacus, Stefano; Irizarry, Rafael; Leisch, Friedrich; Li, Cheng; Maechler, Martin; Rossini, Anthony J; Sawitzki, Gunther; Smith, Colin; Smyth, Gordon; Tierney, Luke; Yang, Jean YH; Zhang, Jianhua
2004-01-01
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples. PMID:15461798
Oluwagbemi, Olugbenga O; Adewumi, Adewole; Esuruoso, Abimbola
2012-01-01
Computational biology and bioinformatics are gradually gaining ground in Africa and other developing nations of the world. However, in these countries, some of the challenges of computational biology and bioinformatics education are inadequate infrastructure and a lack of readily available complementary and motivational tools to support learning as well as research. This has lowered the morale of many promising undergraduates, postgraduates and researchers and deterred them from undertaking future study in these fields. In this paper, we developed and described MACBenAbim (Multi-platform Mobile Application for Computational Biology and Bioinformatics), a flexible user-friendly tool to search for, define and describe the meanings of key terms in computational biology and bioinformatics, thus expanding the frontiers of knowledge of the users. This tool also has the capability of achieving visualization of results in a mobile multi-platform context. MACBenAbim is available from the authors for non-commercial purposes.
Brown, James A L
2016-05-06
A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion, qualitative student-based module evaluation and the novelty, scientific validity and quality of written student reports. Bioinformatics is often the starting point for laboratory-based research projects, therefore high importance was placed on allowing students to individually develop and apply processes and methods of scientific research. Students led a bioinformatic inquiry-based project (within a framework of inquiry), discovering, justifying and exploring individually discovered research targets. Detailed assessable reports were produced, displaying data generated and the resources used. Mimicking research settings, undergraduates were divided into small collaborative groups, with distinctive central themes. The module was evaluated by assessing the quality and originality of the students' targets through reports, reflecting students' use and understanding of concepts and tools required to generate their data. Furthermore, evaluation of the bioinformatic module was assessed semi-quantitatively using pre- and post-module quizzes (a non-assessable activity, not contributing to their grade), which incorporated process- and content-specific questions (indicative of their use of the online tools). Qualitative assessment of the teaching intervention was performed using post-module surveys, exploring student satisfaction and other module specific elements. Overall, a positive experience was found, as was a post module increase in correct process-specific answers. In conclusion, an inquiry-based peer-assisted learning module increased students' engagement, practical bioinformatic skills and process-specific knowledge. 
© 2016 The International Union of Biochemistry and Molecular Biology, 44:304-313, 2016.
Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D Blaine; Langeland, James A
2009-01-01
Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option. Furthermore, we believe that a true interdisciplinary science experience would be best served by introduction of bioinformatics modules within existing courses in biology and chemistry and other complementary departments. To that end, with support from the Howard Hughes Medical Institute, we have developed over a dozen independent bioinformatics modules for our students that are incorporated into courses ranging from general chemistry and biology, advanced specialty courses, and classes in complementary disciplines such as computer science, mathematics, and physics. These activities have largely promoted active learning in our classrooms and have enhanced student understanding of course materials. Herein, we describe our program, the activities we have developed, and assessment of our endeavors in this area. Copyright © 2009 International Union of Biochemistry and Molecular Biology, Inc.
Translational bioinformatics: linking the molecular world to the clinical world.
Altman, R B
2012-06-01
Translational bioinformatics represents the union of translational medicine and bioinformatics. Translational medicine moves basic biological discoveries from the research bench into the patient-care setting and uses clinical observations to inform basic biology. It focuses on patient care, including the creation of new diagnostics, prognostics, prevention strategies, and therapies based on biological discoveries. Bioinformatics involves algorithms to represent, store, and analyze basic biological data, including DNA sequence, RNA expression, and protein and small-molecule abundance within cells. Translational bioinformatics spans these two fields; it involves the development of algorithms to analyze basic molecular and cellular data with an explicit goal of affecting clinical care.
Liaw, Wen-Jinn; Tsao, Cheng-Ming; Huang, Go-Shine; Wu, Chin-Chen; Ho, Shung-Tai; Wang, Jhi-Joung; Tao, Yuan-Xiang; Shui, Hao-Ai
2014-01-01
Introduction: Morphine is the most effective pain-relieving drug, but it can cause unwanted side effects. Direct neuraxial administration of morphine to spinal cord not only can provide effective, reliable pain relief but also can prevent the development of supraspinal side effects. However, repeated neuraxial administration of morphine may still lead to morphine tolerance. Methods: To better understand the mechanism that causes morphine tolerance, we induced tolerance in rats at the spinal cord level by giving them twice-daily injections of morphine (20 µg/10 µL) for 4 days. We confirmed tolerance by measuring paw withdrawal latencies and maximal possible analgesic effect of morphine on day 5. We then carried out phosphoproteomic analysis to investigate the global phosphorylation of spinal proteins associated with morphine tolerance. Finally, pull-down assays were used to identify phosphorylated types and sites of 14-3-3 proteins, and bioinformatics was applied to predict biological networks impacted by the morphine-regulated proteins. Results: Our proteomics data showed that repeated morphine treatment altered phosphorylation of 10 proteins in the spinal cord. Pull-down assays identified 2 serine/threonine phosphorylated sites in 14-3-3 proteins. Bioinformatics further revealed that morphine impacted on cytoskeletal reorganization, neuroplasticity, protein folding and modulation, signal transduction and biomolecular metabolism. Conclusions: Repeated morphine administration may affect multiple biological networks by altering protein phosphorylation. These data may provide insight into the mechanism that underlies the development of morphine tolerance. PMID:24392096
BioStar: an online question & answer resource for the bioinformatics community
USDA-ARS's Scientific Manuscript database
Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...
Bioinformatics Goes to School—New Avenues for Teaching Contemporary Biology
Wood, Louisa; Gebhardt, Philipp
2013-01-01
Since 2010, the European Molecular Biology Laboratory's (EMBL) Heidelberg laboratory and the European Bioinformatics Institute (EMBL-EBI) have jointly run bioinformatics training courses developed specifically for secondary school science teachers within Europe and EMBL member states. These courses focus on introducing bioinformatics, databases, and data-intensive biology, allowing participants to explore resources and providing classroom-ready materials to support them in sharing this new knowledge with their students. In this article, we chart our progress made in creating and running three bioinformatics training courses, including how the course resources are received by participants and how these, and bioinformatics in general, are subsequently used in the classroom. We assess the strengths and challenges of our approach, and share what we have learned through our interactions with European science teachers. PMID:23785266
Panji, Sumir; Fernandes, Pedro L.; Judge, David P.; Ghouila, Amel; Salifu, Samson P.; Ahmed, Rehab; Kayondo, Jonathan; Ssemwanga, Deogratius
2017-01-01
Africa is not unique in its need for basic bioinformatics training for individuals from a diverse range of academic backgrounds. However, particular logistical challenges in Africa, most notably access to bioinformatics expertise and internet stability, must be addressed in order to meet this need on the continent. H3ABioNet (www.h3abionet.org), the Pan African Bioinformatics Network for H3Africa, has therefore developed an innovative, free-of-charge “Introduction to Bioinformatics” course, taking these challenges into account as part of its educational efforts to provide on-site training and develop local expertise inside its network. A multiple-delivery–mode learning model was selected for this 3-month course in order to increase access to (mostly) African, expert bioinformatics trainers. The content of the course was developed to include a range of fundamental bioinformatics topics at the introductory level. For the first iteration of the course (2016), classrooms with a total of 364 enrolled participants were hosted at 20 institutions across 10 African countries. To ensure that classroom success did not depend on stable internet, trainers pre-recorded their lectures, and classrooms downloaded and watched these locally during biweekly contact sessions. The trainers were available via video conferencing to take questions during contact sessions, as well as via online “question and discussion” forums outside of contact session time. This learning model, developed for a resource-limited setting, could easily be adapted to other settings. PMID:28981516
Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young
2017-08-15
Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in the three GM rice genomes, Illumina sequencing reads are mapped and classified to the rice genome and plasmid sequence. Reads that map to both are then classified to characterize the junction site between plant and transgene sequence by sequence alignment. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.
Hobbie, Kevin A; Peterson, Elena S; Barton, Michael L; Waters, Katrina M; Anderson, Kim A
2012-08-01
Large collaborative centers are a common model for accomplishing integrated environmental health research. These centers often include various types of scientific domains (e.g., chemistry, biology, bioinformatics) that are integrated to solve some of the nation's key economic or public health concerns. The Superfund Research Center (SRP) at Oregon State University (OSU) is one such center established in 2008 to study the emerging health risks of polycyclic aromatic hydrocarbons while using new technologies both in the field and laboratory. With outside collaboration at remote institutions, success for the center as a whole depends on the ability to effectively integrate data across all research projects and support cores. Therefore, the OSU SRP center developed a system that integrates environmental monitoring data with analytical chemistry data and downstream bioinformatics and statistics to enable complete "source-to-outcome" data modeling and information management. This article describes the development of this integrated information management system that includes commercial software for operational laboratory management and sample management in addition to open-source custom-built software for bioinformatics and experimental data management.
Hobbie, Kevin A.; Peterson, Elena S.; Barton, Michael L.; Waters, Katrina M.; Anderson, Kim A.
2012-01-01
Large collaborative centers are a common model for accomplishing integrated environmental health research. These centers often include various types of scientific domains (e.g. chemistry, biology, bioinformatics) that are integrated to solve some of the nation’s key economic or public health concerns. The Superfund Research Center (SRP) at Oregon State University (OSU) is one such center established in 2008 to study the emerging health risks of polycyclic aromatic hydrocarbons while utilizing new technologies both in the field and laboratory. With outside collaboration at remote institutions, success for the center as a whole depends on the ability to effectively integrate data across all research projects and support cores. Therefore, the OSU SRP center developed a system that integrates environmental monitoring data with analytical chemistry data and downstream bioinformatics and statistics to enable complete ‘source to outcome’ data modeling and information management. This article describes the development of this integrated information management system that includes commercial software for operational laboratory management and sample management in addition to open source custom built software for bioinformatics and experimental data management. PMID:22651935
Development of an undergraduate bioinformatics degree program at a liberal arts college.
Bagga, Paramjeet S
2012-09-01
The highly interdisciplinary field of bioinformatics has emerged as a powerful modern science. There has been a great demand for undergraduate- and graduate-level trained bioinformaticists in industry as well as in academia. In order to address the need for trained bioinformaticists, its curriculum must be offered at the undergraduate level, especially at four-year colleges, where a majority of students in the United States get their education. There are many challenges in developing an undergraduate-level bioinformatics program that needs to be carefully designed as a well-integrated and cohesive interdisciplinary curriculum that prepares the students for a wide variety of career options. This article describes the challenges of establishing a highly interdisciplinary undergraduate major, the development of an undergraduate bioinformatics degree program at Ramapo College of New Jersey, and lessons learned in the last 10 years during its management.
ENFIN--A European network for integrative systems biology.
Kahlem, Pascal; Clegg, Andrew; Reisinger, Florian; Xenarios, Ioannis; Hermjakob, Henning; Orengo, Christine; Birney, Ewan
2009-11-01
Integration of biological data of various types and the development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases, and platforms to enable both the generation of new bioinformatics tools and the experimental validation of computational predictions. With the aim of bridging the gap existing between standard wet laboratories and bioinformatics, the ENFIN Network runs integrative research projects to bring the latest computational techniques to bear directly on questions dedicated to systems biology in the wet laboratory environment. The Network maintains internally close collaboration between experimental and computational research, enabling a permanent cycling of experimental validation and improvement of computational prediction methods. The computational work includes the development of a database infrastructure (EnCORE), bioinformatics analysis methods and a novel platform for protein function analysis FuncNet.
The Development of Computational Biology in South Africa: Successes Achieved and Lessons Learnt
Mulder, Nicola J.; Christoffels, Alan; de Oliveira, Tulio; Gamieldien, Junaid; Hazelhurst, Scott; Joubert, Fourie; Kumuthini, Judit; Pillay, Ché S.; Snoep, Jacky L.; Tastan Bishop, Özlem; Tiffin, Nicki
2016-01-01
Bioinformatics is now a critical skill in many research and commercial environments as biological data are increasing in both size and complexity. South African researchers recognized this need in the mid-1990s and responded by working with the government as well as international bodies to develop initiatives to build bioinformatics capacity in the country. Significant injections of support from these bodies provided a springboard for the establishment of computational biology units at multiple universities throughout the country, which took on teaching, basic research and support roles. Several challenges were encountered, for example with unreliability of funding, lack of skills, and lack of infrastructure. However, the bioinformatics community worked together to overcome these, and South Africa is now arguably the leading country in bioinformatics on the African continent. Here we discuss how the discipline developed in the country, highlighting the challenges, successes, and lessons learnt. PMID:26845152
The Development of Computational Biology in South Africa: Successes Achieved and Lessons Learnt.
Mulder, Nicola J; Christoffels, Alan; de Oliveira, Tulio; Gamieldien, Junaid; Hazelhurst, Scott; Joubert, Fourie; Kumuthini, Judit; Pillay, Ché S; Snoep, Jacky L; Tastan Bishop, Özlem; Tiffin, Nicki
2016-02-01
Bioinformatics is now a critical skill in many research and commercial environments as biological data are increasing in both size and complexity. South African researchers recognized this need in the mid-1990s and responded by working with the government as well as international bodies to develop initiatives to build bioinformatics capacity in the country. Significant injections of support from these bodies provided a springboard for the establishment of computational biology units at multiple universities throughout the country, which took on teaching, basic research and support roles. Several challenges were encountered, for example with unreliability of funding, lack of skills, and lack of infrastructure. However, the bioinformatics community worked together to overcome these, and South Africa is now arguably the leading country in bioinformatics on the African continent. Here we discuss how the discipline developed in the country, highlighting the challenges, successes, and lessons learnt.
When cloud computing meets bioinformatics: a review.
Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong
2013-10-01
In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
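The MapReduce model named in this abstract can be illustrated on a small bioinformatics task. The sketch below counts k-mers across sequencing reads in three steps (map, shuffle, reduce) using plain Python in a single process; it is not taken from the review and does not use any real MapReduce framework. All function names and the choice of k-mer counting as the example are assumptions.

```python
from collections import defaultdict


def map_phase(read, k):
    """Map step: emit one (k-mer, 1) pair per position in a single read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework would
    do between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence over all reads."""
    pairs = [pair for read in reads for pair in map_phase(read, k)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each read is mapped independently and each k-mer is reduced independently, both steps could in principle be spread across many machines, which is the property that makes the model attractive for large sequencing datasets.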
A Scientific Software Product Line for the Bioinformatics domain.
Costa, Gabriella Castro B; Braga, Regina; David, José Maria N; Campos, Fernanda
2015-08-01
Most specialized users (scientists) that use bioinformatics applications do not have suitable training on software development. Software Product Line (SPL) employs the concept of reuse considering that it is defined as a set of systems that are developed from a common set of base artifacts. In some contexts, such as in bioinformatics applications, it is advantageous to develop a collection of related software products, using SPL approach. If software products are similar enough, there is the possibility of predicting their commonalities, differences and then reuse these common features to support the development of new applications in the bioinformatics area. This paper presents the PL-Science approach which considers the context of SPL and ontology in order to assist scientists to define a scientific experiment, and to specify a workflow that encompasses bioinformatics applications of a given experiment. This paper also focuses on the use of ontologies to enable the use of Software Product Line in biological domains. In the context of this paper, Scientific Software Product Line (SSPL) differs from the Software Product Line due to the fact that SSPL uses an abstract scientific workflow model. This workflow is defined according to a scientific domain and using this abstract workflow model the products (scientific applications/algorithms) are instantiated. Through the use of ontology as a knowledge representation model, we can provide domain restrictions as well as add semantic aspects in order to facilitate the selection and organization of bioinformatics workflows in a Scientific Software Product Line. The use of ontologies enables not only the expression of formal restrictions but also the inferences on these restrictions, considering that a scientific domain needs a formal specification. This paper presents the development of the PL-Science approach, encompassing a methodology and an infrastructure, and also presents an approach evaluation. 
This evaluation presents case studies in bioinformatics, which were conducted in two renowned research institutions in Brazil. Copyright © 2015 Elsevier Inc. All rights reserved.
Revealing biological information using data structuring and automated learning.
Mohorianu, Irina; Moulton, Vincent
2010-11-01
The intermediary steps between a biological hypothesis, concretized in the input data, and meaningful results, validated using biological experiments, commonly employ bioinformatics tools. Starting with storage of the data and ending with a statistical analysis of the significance of the results, every step in a bioinformatics analysis has been intensively studied and the resulting methods and models patented. This review summarizes the bioinformatics patents that have been developed mainly for the study of genes, and points out the universal applicability of bioinformatics methods to other related studies such as RNA interference. More specifically, we overview the steps undertaken in the majority of bioinformatics analyses, highlighting, for each, various approaches that have been developed to reveal details from different perspectives. First we consider data warehousing, the first task that has to be performed efficiently, optimizing the structure of the database, in order to facilitate both the subsequent steps and the retrieval of information. Next, we review data mining, which occupies the central part of most bioinformatics analyses, presenting patents concerning differential expression, unsupervised and supervised learning. Last, we discuss how networks of interactions of genes or other players in the cell may be created, which help draw biological conclusions and have been described in several patents.
Extracting patterns of database and software usage from the bioinformatics literature
Duck, Geraint; Nenadic, Goran; Brass, Andy; Robertson, David L.; Stevens, Robert
2014-01-01
Motivation: As a natural consequence of being a computer-based discipline, bioinformatics has a strong focus on database and software development, but the volume and variety of resources are growing at unprecedented rates. An audit of database and software usage patterns could help provide an overview of developments in bioinformatics and community common practice, and comparing the links between resources through time could demonstrate both the persistence of existing software and the emergence of new tools. Results: We study the connections between bioinformatics resources and construct networks of database and software usage patterns, based on resource co-occurrence, that correspond to snapshots of common practice in the bioinformatics community. We apply our approach to pairings of phylogenetics software reported in the literature and argue that these could provide a stepping stone into the identification of scientific best practice. Availability and implementation: The extracted resource data, the scripts used for network generation and the resulting networks are available at http://bionerds.sourceforge.net/networks/ Contact: robert.stevens@manchester.ac.uk PMID:25161253
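A network based on resource co-occurrence, as described in this abstract, can be built by counting how often two resources are mentioned in the same paper. The following is a minimal sketch under that assumption, not the authors' scripts (those are at the URL given in the abstract); the function name `cooccurrence_edges` and the input shape (one list of resource names per paper) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(papers):
    """Return a Counter mapping (resource_a, resource_b) to the number
    of papers that mention both resources.

    `papers` is an iterable of lists of resource names, one list per
    paper. Names within each pair are in alphabetical order so that
    (a, b) and (b, a) are counted as the same edge.
    """
    edges = Counter()
    for resources in papers:
        # De-duplicate within a paper, then enumerate every unordered pair.
        for a, b in combinations(sorted(set(resources)), 2):
            edges[(a, b)] += 1
    return edges
```

Building one such edge table per publication year and comparing them would give the snapshots through time that the abstract refers to.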
Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics
Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.
2012-01-01
With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
An overview of topic modeling and its current applications in bioinformatics.
Liu, Lin; Tang, Lin; Dong, Wen; Yao, Shaowen; Zhou, Wei
2016-01-01
With the rapid accumulation of biological datasets, machine learning methods designed to automate data analysis are urgently needed. In recent years, so-called topic models that originated from the field of natural language processing have been receiving much attention in bioinformatics because of their interpretability. Our aim was to review the application and development of topic models for bioinformatics. This paper starts with the description of a topic model, with a focus on the understanding of topic modeling. A general outline is provided on how to build an application in a topic model and how to develop a topic model. Meanwhile, the literature on application of topic models to biological data was searched and analyzed in depth. According to the types of models and the analogy between the concept of document-topic-word and a biological object (as well as the tasks of a topic model), we categorized the related studies and provided an outlook on the use of topic models for the development of bioinformatics applications. Topic modeling is a useful method (in contrast to the traditional means of data reduction in bioinformatics) and enhances researchers' ability to interpret biological information. Nevertheless, due to the lack of topic models optimized for specific biological data, the studies on topic modeling in biological data still have a long and challenging road ahead. We believe that topic models are a promising method for various applications in bioinformatics research.
Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Coatrieux, Jean-Louis; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Porro, Ivan; Beltrame, Francesco; Tollis, Ioannis; Van der Lei, Johan
2007-03-08
The SYMBIOmatics Specific Support Action (SSA) is "an information gathering and dissemination activity" that seeks "to identify synergies between the bioinformatics and the medical informatics" domain to improve collaborative progress between both domains (ref. to http://www.symbiomatics.org). As part of the project experts in both research fields will be identified and approached through a survey. To provide input to the survey, the scientific literature was analysed to extract topics relevant to both medical informatics and bioinformatics. This paper presents results of a systematic analysis of the scientific literature from medical informatics research and bioinformatics research. In the analysis pairs of words (bigrams) from the leading bioinformatics and medical informatics journals have been used as indication of existing and emerging technologies and topics over the period 2000-2005 ("recent") and 1990-1990 ("past"). We identified emerging topics that were equally important to bioinformatics and medical informatics in recent years such as microarray experiments, ontologies, open source, text mining and support vector machines. Emerging topics that evolved only in bioinformatics were system biology, protein interaction networks and statistical methods for microarray analyses, whereas emerging topics in medical informatics were grid technology and tissue microarrays. We conclude that although both fields have their own specific domains of interest, they share common technological developments that tend to be initiated by new developments in biotechnology and computer science.
Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Coatrieux, Jean-Louis; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Porro, Ivan; Beltrame, Francesco; Tollis, Ioannis; Van der Lei, Johan
2007-01-01
Background: The SYMBIOmatics Specific Support Action (SSA) is "an information gathering and dissemination activity" that seeks "to identify synergies between the bioinformatics and the medical informatics" domain to improve collaborative progress between both domains (ref. to http://www.symbiomatics.org). As part of the project experts in both research fields will be identified and approached through a survey. To provide input to the survey, the scientific literature was analysed to extract topics relevant to both medical informatics and bioinformatics. Results: This paper presents results of a systematic analysis of the scientific literature from medical informatics research and bioinformatics research. In the analysis pairs of words (bigrams) from the leading bioinformatics and medical informatics journals have been used as indication of existing and emerging technologies and topics over the period 2000–2005 ("recent") and 1990–1990 ("past"). We identified emerging topics that were equally important to bioinformatics and medical informatics in recent years such as microarray experiments, ontologies, open source, text mining and support vector machines. Emerging topics that evolved only in bioinformatics were system biology, protein interaction networks and statistical methods for microarray analyses, whereas emerging topics in medical informatics were grid technology and tissue microarrays. Conclusion: We conclude that although both fields have their own specific domains of interest, they share common technological developments that tend to be initiated by new developments in biotechnology and computer science. PMID:17430562
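The bigram analysis described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tokenisation rule, the `min_recent` threshold and the sample sentences are all assumptions made for illustration, and a real analysis would work on journal abstracts and apply proper statistics.

```python
import re
from collections import Counter


def bigram_counts(texts):
    """Count adjacent word pairs (bigrams) across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Assumed tokenisation: lowercase runs of letters and digits.
        words = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(zip(words, words[1:]))
    return counts


def emerging_bigrams(past_texts, recent_texts, min_recent=2):
    """Bigrams seen at least `min_recent` times recently and never in the past set."""
    past = bigram_counts(past_texts)
    recent = bigram_counts(recent_texts)
    return sorted(bg for bg, n in recent.items() if n >= min_recent and bg not in past)


# Invented placeholder sentences, not data from the study.
past = ["sequence alignment of protein sequences"]
recent = [
    "text mining of microarray experiments",
    "text mining with support vector machines",
]
print(emerging_bigrams(past, recent))  # -> [('text', 'mining')]
```

Comparing two time windows in this way only flags candidate topics; deciding which ones matter still needs expert review, which is the role the survey plays in the project.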
Pollett, S; Leguia, M; Nelson, M I; Maljkovic Berry, I; Rutherford, G; Bausch, D G; Kasper, M; Jarman, R; Melendrez, M
2016-01-01
There is an increasing role for bioinformatic and phylogenetic analysis in tropical medicine research. However, scientists working in low- and middle-income regions may lack access to training opportunities in these methods. To help address this gap, a 5-day intensive bioinformatics workshop was offered in Lima, Peru. The syllabus is presented here for others who want to develop similar programs. To assess knowledge gained, a 20-point knowledge questionnaire was administered to participants (21 participants) before and after the workshop, covering topics on sequence quality control, alignment/formatting, database retrieval, models of evolution, sequence statistics, tree building, and results interpretation. Evolution/tree-building methods represented the lowest scoring domain at baseline and after the workshop. There was a considerable median gain in total knowledge scores (increase of 30%, p<0.001) with gains as high as 55%. A 5-day workshop model was effective in improving the pathogen-applied bioinformatics knowledge of scientists working in a middle-income country setting. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Expanding roles in a library-based bioinformatics service program: a case study
Li, Meng; Chen, Yi-Bu; Clintworth, William A
2013-01-01
Question: How can a library-based bioinformatics support program be implemented and expanded to continuously support the growing and changing needs of the research community? Setting: A program at a health sciences library serving a large academic medical center with a strong research focus is described. Methods: The bioinformatics service program was established at the Norris Medical Library in 2005. As part of program development, the library assessed users' bioinformatics needs, acquired additional funds, established and expanded service offerings, and explored additional roles in promoting on-campus collaboration. Results: Personnel and software have increased along with the number of registered software users and use of the provided services. Conclusion: With strategic efforts and persistent advocacy within the broader university environment, library-based bioinformatics service programs can become a key part of an institution's comprehensive solution to researchers' ever-increasing bioinformatics needs. PMID:24163602
Bioinformatics of cardiovascular miRNA biology.
Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas
2015-12-01
MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'. Copyright © 2014 Elsevier Ltd. All rights reserved.
Ramharack, Pritika; Soliman, Mahmoud E S
2018-06-01
Originally developed for the analysis of biological sequences, bioinformatics has advanced into one of the most widely recognized domains in the scientific community. Despite this technological evolution, there is still an urgent need for nontoxic and efficient drugs. The onus now falls on the 'omics domain to meet this need by implementing bioinformatics techniques that will allow for the introduction of pioneering approaches in the rational drug design process. Here, we categorize an updated list of informatics tools and explore the capabilities of integrative bioinformatics in disease control. We believe that our review will serve as a comprehensive guide toward bioinformatics-oriented disease and drug discovery research. Copyright © 2018 Elsevier Ltd. All rights reserved.
The structural bioinformatics library: modeling in biomolecular science and beyond.
Cazals, Frédéric; Dreyfus, Tom
2017-04-01
Software in structural bioinformatics has mainly been application driven. To favor practitioners seeking off-the-shelf applications, but also developers seeking advanced building blocks to develop novel applications, we undertook the design of the Structural Bioinformatics Library (SBL, http://sbl.inria.fr), a generic C++/Python cross-platform software library targeting complex problems in structural bioinformatics. Its tenet is based on a modular design offering a rich and versatile framework allowing the development of novel applications requiring well specified complex operations, without compromising robustness and performance. The SBL involves four software components (1-4 thereafter). For end-users, the SBL provides ready to use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, and to model macro-molecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C++ toolbox with modular design, involving core (2) algorithms, (3) biophysical models and (4) modules, the latter being especially suited to develop novel applications. The SBL comes with thorough documentation consisting of user and reference manuals, and a bugzilla platform to handle community feedback. The SBL is available from http://sbl.inria.fr. Contact: Frederic.Cazals@inria.fr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Better bioinformatics through usability analysis.
Bolchini, Davide; Finkelstein, Anthony; Perrone, Vito; Nagl, Sylvia
2009-02-01
Improving the usability of bioinformatics resources enables researchers to find, interact with, share, compare and manipulate important information more effectively and efficiently. It thus enables researchers to gain improved insights into biological processes with the potential, ultimately, of yielding new scientific results. Usability 'barriers' can pose significant obstacles to a satisfactory user experience and force researchers to spend unnecessary time and effort to complete their tasks. The number of online biological databases available is growing and there is an expanding community of diverse users. In this context there is an increasing need to ensure the highest standards of usability. Using 'state-of-the-art' usability evaluation methods, we have identified and characterized a sample of usability issues potentially relevant to web bioinformatics resources, in general. These specifically concern the design of the navigation and search mechanisms available to the user. The usability issues we have discovered in our substantial case studies are undermining the ability of users to find the information they need in their daily research activities. In addition to characterizing these issues, specific recommendations for improvements are proposed leveraging proven practices from web and usability engineering. The methods and approach we exemplify can be readily adopted by the developers of bioinformatics resources.
Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh
2015-12-01
New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.
ERIC Educational Resources Information Center
Brown, James A. L.
2016-01-01
A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion,…
Bioinformatics in the orphan crops.
Armstead, Ian; Huang, Lin; Ravagnani, Adriana; Robson, Paul; Ougham, Helen
2009-11-01
Orphan crops are those which are grown as food, animal feed or other crops of some importance in agriculture, but which have not yet received the investment of research effort or funding required to develop significant public bioinformatics resources. Where an orphan crop is related to a well-characterised model plant species, comparative genomics and bioinformatics can often, though not always, be exploited to assist research and crop improvement. This review addresses some challenges and opportunities presented by bioinformatics in the orphan crops, using three examples: forage grasses from the genera Lolium and Festuca, forage legumes and the second generation energy crop Miscanthus.
Bioinformatics core competencies for undergraduate life sciences education.
Wilson Sayres, Melissa A; Hauser, Charles; Sierk, Michael; Robic, Srebrenka; Rosenwald, Anne G; Smith, Todd M; Triplett, Eric W; Williams, Jason J; Dinsdale, Elizabeth; Morgan, William R; Burnette, James M; Donovan, Samuel S; Drew, Jennifer C; Elgin, Sarah C R; Fowlks, Edison R; Galindo-Gonzalez, Sebastian; Goodman, Anya L; Grandgenett, Nealy F; Goller, Carlos C; Jungck, John R; Newman, Jeffrey D; Pearson, William; Ryder, Elizabeth F; Tosado-Acevedo, Rafael; Tapprich, William; Tobin, Tammy C; Toro-Martínez, Arlín; Welch, Lonnie R; Wright, Robin; Barone, Lindsay; Ebenbach, David; McWilliams, Mindy; Olney, Kimberly C; Pauley, Mark A
2018-01-01
Although bioinformatics is becoming increasingly central to research in the life sciences, bioinformatics skills and knowledge are not well integrated into undergraduate biology education. This curricular gap prevents biology students from harnessing the full potential of their education, limiting their career opportunities and slowing research innovation. To advance the integration of bioinformatics into life sciences education, a framework of core bioinformatics competencies is needed. To that end, we here report the results of a survey of biology faculty in the United States about teaching bioinformatics to undergraduate life scientists. Responses were received from 1,260 faculty representing institutions in all fifty states with a combined capacity to educate hundreds of thousands of students every year. Results indicate strong, widespread agreement that bioinformatics knowledge and skills are critical for undergraduate life scientists as well as considerable agreement about which skills are necessary. Perceptions of the importance of some skills varied with the respondent's degree of training, time since degree earned, and/or the Carnegie Classification of the respondent's institution. To assess which skills are currently being taught, we analyzed syllabi of courses with bioinformatics content submitted by survey respondents. Finally, we used the survey results, the analysis of the syllabi, and our collective research and teaching expertise to develop a set of bioinformatics core competencies for undergraduate biology students. These core competencies are intended to serve as a guide for institutions as they work to integrate bioinformatics into their life sciences curricula.
Bioinformatics core competencies for undergraduate life sciences education
Wilson Sayres, Melissa A.; Hauser, Charles; Sierk, Michael; Robic, Srebrenka; Rosenwald, Anne G.; Smith, Todd M.; Triplett, Eric W.; Williams, Jason J.; Dinsdale, Elizabeth; Morgan, William R.; Burnette, James M.; Donovan, Samuel S.; Drew, Jennifer C.; Elgin, Sarah C. R.; Fowlks, Edison R.; Galindo-Gonzalez, Sebastian; Goodman, Anya L.; Grandgenett, Nealy F.; Goller, Carlos C.; Jungck, John R.; Newman, Jeffrey D.; Pearson, William; Ryder, Elizabeth F.; Tosado-Acevedo, Rafael; Tapprich, William; Tobin, Tammy C.; Toro-Martínez, Arlín; Welch, Lonnie R.; Wright, Robin; Ebenbach, David; McWilliams, Mindy; Olney, Kimberly C.
2018-01-01
Although bioinformatics is becoming increasingly central to research in the life sciences, bioinformatics skills and knowledge are not well integrated into undergraduate biology education. This curricular gap prevents biology students from harnessing the full potential of their education, limiting their career opportunities and slowing research innovation. To advance the integration of bioinformatics into life sciences education, a framework of core bioinformatics competencies is needed. To that end, we here report the results of a survey of biology faculty in the United States about teaching bioinformatics to undergraduate life scientists. Responses were received from 1,260 faculty representing institutions in all fifty states with a combined capacity to educate hundreds of thousands of students every year. Results indicate strong, widespread agreement that bioinformatics knowledge and skills are critical for undergraduate life scientists as well as considerable agreement about which skills are necessary. Perceptions of the importance of some skills varied with the respondent’s degree of training, time since degree earned, and/or the Carnegie Classification of the respondent’s institution. To assess which skills are currently being taught, we analyzed syllabi of courses with bioinformatics content submitted by survey respondents. Finally, we used the survey results, the analysis of the syllabi, and our collective research and teaching expertise to develop a set of bioinformatics core competencies for undergraduate biology students. These core competencies are intended to serve as a guide for institutions as they work to integrate bioinformatics into their life sciences curricula. PMID:29870542
Bioinformatics in the Netherlands: the value of a nationwide community.
van Gelder, Celia W G; Hooft, Rob W W; van Rijswijk, Merlijn N; van den Berg, Linda; Kok, Ruben G; Reinders, Marcel; Mons, Barend; Heringa, Jaap
2017-09-15
This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures supporting a relatively large Dutch bioinformatics community will be reviewed. We will show that the most valuable resource that we have built over these years is the close-knit national expert community that is well engaged in basic and translational life science research programmes. The Dutch bioinformatics community is accustomed to facing the ever-changing landscape of data challenges and working towards solutions together. In addition, this community is the stable factor on the road towards sustainability, especially in times where existing funding models are challenged and change rapidly. © The Author 2017. Published by Oxford University Press.
E-Learning as a new tool in bioinformatics teaching
Saravanan, Vijayakumar; Shanmughavel, Piramanayagam
2007-01-01
In recent years, virtual learning has been growing rapidly. Universities, colleges, and secondary schools are now delivering training and education over the internet. In addition, the resources available over the WWW are vast, and students find the various techniques employed in the field of bioinformatics increasingly complex to understand when putting them into practice. Here, we discuss the importance of virtual learning in developing and delivering an educational system in bioinformatics based on an e-learning environment. PMID:18292800
Relax with CouchDB--into the non-relational DBMS era of bioinformatics.
Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R
2012-07-01
With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.
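Programmatic access to a CouchDB-backed resource goes through CouchDB's HTTP interface, where a document is addressed as `/{db}/{docid}` and a view as `/{db}/_design/{ddoc}/_view/{view}` with a JSON-encoded `key` parameter. The sketch below builds such URLs and fetches JSON. The host, database, design document and view names are hypothetical placeholders, not the actual geneSmash, drugBase or HapMap-CN endpoints, which are not given here.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def document_url(base_url, database, doc_id):
    """URL of a single CouchDB document: /{db}/{docid}."""
    return f"{base_url.rstrip('/')}/{quote(database, safe='')}/{quote(doc_id, safe='')}"


def view_url(base_url, database, design_doc, view, key):
    """URL of a CouchDB view query; CouchDB expects `key` to be JSON-encoded."""
    query = urlencode({"key": json.dumps(key)})
    return (
        f"{base_url.rstrip('/')}/{quote(database, safe='')}"
        f"/_design/{quote(design_doc, safe='')}/_view/{quote(view, safe='')}?{query}"
    )


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body (needs a reachable server)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical local server and names, used only to show the URL shapes.
print(document_url("http://localhost:5984/", "genes", "TP53"))
print(view_url("http://localhost:5984", "genes", "annot", "by_symbol", "TP53"))
```

Because every record is a JSON document reachable by URL, a client needs no database driver, which is part of what makes this style of web service easy to script against.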
Using Next-Generation Sequencing to Explore Genetics and Race in the High School Classroom
ERIC Educational Resources Information Center
Yang, Xinmiao; Hartman, Mark R.; Harrington, Kristin T.; Etson, Candice M.; Fierman, Matthew B.; Slonim, Donna K.; Walt, David R.
2017-01-01
With the development of new sequencing and bioinformatics technologies, concepts relating to personal genomics play an increasingly important role in our society. To promote interest and understanding of sequencing and bioinformatics in the high school classroom, we developed and implemented a laboratory-based teaching module called "The…
InCoB2012 Conference: from biological data to knowledge to technological breakthroughs
2012-01-01
Ten years ago, when the Asia-Pacific Bioinformatics Network held the first International Conference on Bioinformatics (InCoB) in Bangkok, its theme was North-South Networking. At that time InCoB aimed to provide biologists and bioinformatics researchers in the Asia-Pacific region a forum to meet, interact with, and disseminate knowledge about the burgeoning field of bioinformatics. Meanwhile InCoB has evolved into a major regional bioinformatics conference that attracts not only talented and established scientists from the region but increasingly also from East Asia, North America and Europe. Since 2006, InCoB has yielded 114 articles in BMC Bioinformatics supplement issues that have been cited nearly 1,000 times to date. In part, these developments reflect the success of bioinformatics education and continuous efforts to integrate and utilize bioinformatics in biotechnology and biosciences in the Asia-Pacific region. A cross-section of research leading from biological data to knowledge and to technological applications, the InCoB2012 theme, is introduced in this editorial. Other highlights included sessions organized by the Pan-Asian Pacific Genome Initiative and a Machine Learning in Immunology competition. InCoB2013 is scheduled for September 18-21, 2013 at Suzhou, China. PMID:23281929
The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics.
de Matos, Paula; Cham, Jennifer A; Cao, Hong; Alcántara, Rafael; Rowland, Francis; Lopez, Rodrigo; Steinbeck, Christoph
2013-03-20
User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users' requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, 'canvas sort' card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience.
Agile parallel bioinformatics workflow management using Pwrake.
Mishima, Hiroyuki; Sasaki, Kensaku; Tanaka, Masahiro; Tatebe, Osamu; Yoshiura, Koh-Ichiro
2011-09-08
In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows.
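The idea of a dependency-driven workflow whose independent steps run in parallel can be sketched in a few lines of Python. This is not Pwrake or Rake syntax (both are Ruby), and the task names are hypothetical stand-ins rather than real GATK or Dindel commands. It assumes Python 3.9 or later for `graphlib`.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies, max_workers=4):
    """Run zero-argument callables in dependency order, in parallel where possible.

    tasks:        dict mapping task name -> callable
    dependencies: dict mapping task name -> set of prerequisite task names
                  (every prerequisite must itself be a key of `tasks`)
    Returns the task names in the order they completed.
    """
    sorter = TopologicalSorter({name: dependencies.get(name, set()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        running = {}
        while sorter.is_active():
            # Submit every task whose prerequisites are all done.
            for name in sorter.get_ready():
                running[pool.submit(tasks[name])] = name
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any exception from the task
                name = running.pop(future)
                finished.append(name)
                sorter.done(name)
    return finished


# Hypothetical four-step workflow: two variant callers can run side by side.
tasks = {name: (lambda: None) for name in ("align", "call_a", "call_b", "merge")}
dependencies = {"call_a": {"align"}, "call_b": {"align"}, "merge": {"call_a", "call_b"}}
order = run_workflow(tasks, dependencies)
```

Keeping the dependency table separate from the task bodies loosely mirrors the separation of workflow definition from parameter adjustment that the abstract reports as helpful.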
Agile parallel bioinformatics workflow management using Pwrake
2011-01-01
Background: In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings: We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions: Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability and maintainability of rakefiles may facilitate sharing workflows among the scientific community. Workflows for GATK and Dindel are available at http://github.com/misshie/Workflows. PMID:21899774
Bellman’s GAP—a language and compiler for dynamic programming in sequence analysis
Sauthoff, Georg; Möhl, Mathias; Janssen, Stefan; Giegerich, Robert
2013-01-01
Motivation: Dynamic programming is ubiquitous in bioinformatics. Developing and implementing non-trivial dynamic programming algorithms is often error prone and tedious. Bellman’s GAP is a new programming system, designed to ease the development of bioinformatics tools based on the dynamic programming technique. Results: In Bellman’s GAP, dynamic programming algorithms are described in a declarative style by tree grammars, evaluation algebras and products formed thereof. This bypasses the design of explicit dynamic programming recurrences and yields programs that are free of subscript errors, modular and easy to modify. The declarative modules are compiled into C++ code that is competitive with carefully hand-crafted implementations. This article introduces the Bellman’s GAP system and its language, GAP-L. It then demonstrates the ease of development and the degree of re-use by creating variants of two common bioinformatics algorithms. Finally, it evaluates Bellman’s GAP as an implementation platform of ‘real-world’ bioinformatics tools. Availability: Bellman’s GAP is available under the GPL license from http://bibiserv.cebitec.uni-bielefeld.de/bellmansgap. This Web site includes a repository of re-usable modules for RNA folding based on thermodynamics. Contact: robert@techfak.uni-bielefeld.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23355290
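To show what an explicit dynamic programming recurrence looks like when written by hand, here is a minimal Python sketch of Nussinov-style base-pair maximisation for RNA. It is not GAP-L and not one of the thermodynamic folding modules mentioned above; the pairing rules and the `min_loop` default are assumptions chosen for illustration. The index arithmetic on `i`, `j` and `k` is the kind of detail where subscript errors tend to creep in.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq`.

    A pair (k, j) is allowed only if at least `min_loop` unpaired
    bases lie between positions k and j.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for seq[i..j]; entries with i >= j stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAACCC"))  # -> 3
```

Changing the scoring scheme here means editing the loop body itself, whereas the declarative approach described in the abstract keeps the search space (grammar) and the scoring (algebra) in separate modules.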
The 20th anniversary of EMBnet: 20 years of bioinformatics for the Life Sciences community
D'Elia, Domenica; Gisel, Andreas; Eriksson, Nils-Einar; Kossida, Sophia; Mattila, Kimmo; Klucar, Lubos; Bongcam-Rudloff, Erik
2009-01-01
The EMBnet Conference 2008, focusing on 'Leading Applications and Technologies in Bioinformatics', was organized by the European Molecular Biology network (EMBnet) to celebrate its 20th anniversary. Since its foundation in 1988, EMBnet has been working to promote collaborative development of bioinformatics services and tools to serve the European community of molecular biology laboratories. This conference was the first meeting organized by the network that was open to the international scientific community outside EMBnet. The conference covered a broad range of research topics in bioinformatics with a main focus on new achievements and trends in emerging technologies supporting genomics, transcriptomics and proteomics analyses such as high-throughput sequencing and data managing, text and data-mining, ontologies and Grid technologies. Papers selected for publication, in this supplement to BMC Bioinformatics, cover a broad range of the topics treated, providing also an overview of the main bioinformatics research fields that the EMBnet community is involved in. PMID:19534734
Bioinformatics in translational drug discovery.
Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G
2017-08-31
Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).
Saeed, Isaam; Wong, Stephen Q.; Mar, Victoria; Goode, David L.; Caramia, Franco; Doig, Ken; Ryland, Georgina L.; Thompson, Ella R.; Hunter, Sally M.; Halgamuge, Saman K.; Ellul, Jason; Dobrovic, Alexander; Campbell, Ian G.; Papenfuss, Anthony T.; McArthur, Grant A.; Tothill, Richard W.
2014-01-01
Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/. PMID:24752294
Building a bioinformatics community of practice through library education programs.
Moore, Margaret E; Vaughan, K T L; Hayes, Barrie E
2004-01-01
This paper addresses the following questions: What makes the community of practice concept an intriguing framework for developing library services for bioinformatics? What is the campus context and setting? What has been the Health Sciences Library's role in bioinformatics at the University of North Carolina (UNC) Chapel Hill? What are the Health Sciences Library's goals? What services are currently offered? How will these services be evaluated and developed? How can libraries demonstrate their value? Providing library services for an emerging community such as bioinformatics and computational biology presents special challenges for libraries including understanding needs, defining and communicating the library's role, building relationships within the community, preparing staff, and securing funding. Like many academic health sciences libraries, the University of North Carolina (UNC) at Chapel Hill Health Sciences Library is addressing these challenges in the context of its overall mission and goals.
Contribution of bioinformatics prediction in microRNA-based cancer therapeutics.
Banwait, Jasjit K; Bastola, Dhundy R
2015-01-01
Despite enormous efforts, cancer remains one of the most lethal diseases in the world. With the advancement of high throughput technologies massive amounts of cancer data can be accessed and analyzed. Bioinformatics provides a platform to assist biologists in developing minimally invasive biomarkers to detect cancer, and in designing effective personalized therapies to treat cancer patients. Still, the early diagnosis, prognosis, and treatment of cancer are an open challenge for the research community. MicroRNAs (miRNAs) are small non-coding RNAs that serve to regulate gene expression. The discovery of deregulated miRNAs in cancer cells and tissues has led many to investigate the use of miRNAs as potential biomarkers for early detection, and as a therapeutic agent to treat cancer. Here we describe advancements in computational approaches to predict miRNAs and their targets, and discuss the role of bioinformatics in studying miRNAs in the context of human cancer. Published by Elsevier B.V.
Planning bioinformatics workflows using an expert system.
Chen, Xiaoling; Chang, Jeffrey T
2017-04-15
Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability: https://github.com/jefftc/changlab. Contact: jeffrey.t.chang@uth.tmc.edu. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
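Backward chaining over a knowledge base of tool capabilities can be sketched in a few lines of Python. This is only an illustration of the general technique, not BETSY's actual data model or inference engine; the rule table, the data type names and the function plan are hypothetical. Each rule says which tool produces a data type and which inputs it needs, and the planner works backwards from the requested result to the data already on hand.

```python
# Hypothetical knowledge base: output data type -> (tool name, required inputs).
RULES = {
    "trimmed_reads": ("trim_adapters", ["raw_reads"]),
    "aligned_reads": ("align", ["trimmed_reads", "reference_genome"]),
    "variant_calls": ("call_variants", ["aligned_reads", "reference_genome"]),
}


def plan(goal, available, rules):
    """Return an ordered list of tools that derives `goal` from `available`."""
    steps = []
    done = set(available)

    def require(item, stack):
        if item in done:
            return  # already available or already planned
        if item in stack:
            raise ValueError(f"circular dependency on {item!r}")
        if item not in rules:
            raise ValueError(f"no rule or input provides {item!r}")
        tool, inputs = rules[item]
        for needed in inputs:
            # Chain backwards: satisfy every input before scheduling the tool.
            require(needed, stack | {item})
        steps.append(tool)
        done.add(item)

    require(goal, frozenset())
    return steps
```

Calling plan("variant_calls", {"raw_reads", "reference_genome"}, RULES) yields the tools in execution order, with the pre-processing steps first. Changing the goal or the available data changes the workflow without editing any pipeline code, which is the exploratory use case the abstract describes.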
Planning bioinformatics workflows using an expert system
Chen, Xiaoling; Chang, Jeffrey T.
2017-01-01
Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928
SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.
Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...
Skate Genome Project: Cyber-Enabled Bioinformatics Collaboration
Vincent, J.
2011-01-01
The Skate Genome Project, a pilot project of the North East Cyberinfrastructure Consortium (NECC), aims to produce a draft genome sequence of Leucoraja erinacea, the Little Skate. The pilot project was also designed to develop expertise in large scale collaborations across the NECC region. An overview of the bioinformatics and infrastructure challenges faced during the first year of the project will be presented. Results to date and lessons learned from the perspective of a bioinformatics core will be highlighted.
Bioinformatics/biostatistics: microarray analysis.
Eichler, Gabriel S
2012-01-01
The quantity and complexity of the molecular-level data generated in both research and clinical settings require the use of sophisticated, powerful computational interpretation techniques. It is for this reason that bioinformatic analysis of complex molecular profiling data has become a fundamental technology in the development of personalized medicine. This chapter provides a high-level overview of the field of bioinformatics and outlines several classic bioinformatic approaches. The highlighted approaches can be aptly applied to nearly any sort of high-dimensional genomic, proteomic, or metabolomic experiments. Reviewed technologies in this chapter include traditional clustering analysis, the Gene Expression Dynamics Inspector (GEDI), GoMiner, Gene Set Enrichment Analysis (GSEA), and the Learner of Functional Enrichment (LeFE).
ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis
Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas
2016-01-01
Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475
Mulder, Nicola; Schwartz, Russell; Brazas, Michelle D; Brooksbank, Cath; Gaeta, Bruno; Morgan, Sarah L; Pauley, Mark A; Rosenwald, Anne; Rustici, Gabriella; Sierk, Michael; Warnow, Tandy; Welch, Lonnie
2018-02-01
Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans.
Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie
2018-01-01
Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004
González-Nilo, Fernando; Pérez-Acle, Tomás; Guínez-Molinos, Sergio; Geraldo, Daniela A; Sandoval, Claudia; Yévenes, Alejandro; Santos, Leonardo S; Laurie, V Felipe; Mendoza, Hegaly; Cachau, Raúl E
2011-01-01
After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.
Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.
Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru
2016-09-29
Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.
p3d--Python module for structural bioinformatics.
Fufezan, Christian; Specht, Michael
2009-08-21
High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write understandable source code. p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that allow a and b to be combined and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.
Bellman's GAP--a language and compiler for dynamic programming in sequence analysis.
Sauthoff, Georg; Möhl, Mathias; Janssen, Stefan; Giegerich, Robert
2013-03-01
Dynamic programming is ubiquitous in bioinformatics. Developing and implementing non-trivial dynamic programming algorithms is often error prone and tedious. Bellman's GAP is a new programming system, designed to ease the development of bioinformatics tools based on the dynamic programming technique. In Bellman's GAP, dynamic programming algorithms are described in a declarative style by tree grammars, evaluation algebras and products formed thereof. This bypasses the design of explicit dynamic programming recurrences and yields programs that are free of subscript errors, modular and easy to modify. The declarative modules are compiled into C++ code that is competitive to carefully hand-crafted implementations. This article introduces the Bellman's GAP system and its language, GAP-L. It then demonstrates the ease of development and the degree of re-use by creating variants of two common bioinformatics algorithms. Finally, it evaluates Bellman's GAP as an implementation platform of 'real-world' bioinformatics tools. Bellman's GAP is available under GPL license from http://bibiserv.cebitec.uni-bielefeld.de/bellmansgap. This Web site includes a repository of re-usable modules for RNA folding based on thermodynamics.
Bioinformatic pipelines in Python with Leaf
2013-01-01
Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315
Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes
Rashid, Mamunur; Robles-Espinoza, Carla Daniela; Rust, Alistair G.; Adams, David J.
2013-01-01
Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application. Availability: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23803469
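One simple way to integrate the output of several variant callers is a voting scheme, sketched below in Python. The abstract does not say how Cake combines its four algorithms, so the voting rule, the min_callers threshold, the caller names and the function consensus_variants are all assumptions for illustration only.

```python
from collections import Counter


def consensus_variants(call_sets, min_callers=2):
    """Keep variants reported by at least `min_callers` of the callers.

    `call_sets` maps a caller name to an iterable of variants, each given
    as a (chromosome, position, ref, alt) tuple. This layout is assumed.
    """
    votes = Counter()
    for calls in call_sets.values():
        votes.update(set(calls))  # each caller votes at most once per variant
    return sorted(v for v, n in votes.items() if n >= min_callers)


# Hypothetical calls from three placeholder callers.
example_calls = {
    "caller_a": [("chr1", 100, "A", "T"), ("chr2", 50, "G", "C")],
    "caller_b": [("chr1", 100, "A", "T")],
    "caller_c": [("chr1", 100, "A", "T"), ("chr3", 7, "C", "G")],
}
shared = consensus_variants(example_calls, min_callers=2)
```

Raising min_callers trades sensitivity for specificity. A variant seen by a single caller is dropped at the default threshold and kept when min_callers is 1.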
PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols.
Kanterakis, Alexandros; Kuiper, Joël; Potamias, George; Swertz, Morris A
2015-01-01
Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the Python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy to learn wiki syntax effectively lowers the barriers to bring expert programmers and less computer savvy researchers on the same page. PyPedia demonstrates how wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complement existing resources, useful for local and multi-center research teams. PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia Python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.
XML schemas for common bioinformatic data types and their application in workflow systems
Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert
2006-01-01
Background Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Results Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at , the BioDOM library can be obtained at . Conclusion The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios. PMID:17087823
Kim, Jihye; Vasu, Vihas T; Mishra, Rangnath; Singleton, Katherine R; Yoo, Minjae; Leach, Sonia M; Farias-Hesson, Eveline; Mason, Robert J; Kang, Jaewoo; Ramamoorthy, Preveen; Kern, Jeffrey A; Heasley, Lynn E; Finigan, James H; Tan, Aik Choon
2014-09-01
Non-small-cell lung cancer (NSCLC) is the leading cause of cancer death in the United States. Targeted tyrosine kinase inhibitors (TKIs) directed against the epidermal growth factor receptor (EGFR) have been widely and successfully used in treating NSCLC patients with activating EGFR mutations. Unfortunately, the duration of response is short-lived, and all patients eventually relapse by acquiring resistance mechanisms. We performed an integrative systems biology approach to determine essential kinases that drive EGFR-TKI resistance in cancer cell lines. We used a series of bioinformatics methods to analyze and integrate the functional genetics screen and RNA-seq data to identify a set of kinases that are critical in survival and proliferation in these TKI-resistant lines. By connecting the essential kinases to compounds using a novel kinase connectivity map (K-Map), we identified and validated bosutinib as an effective compound that could inhibit proliferation and induce apoptosis in TKI-resistant lines. A rational combination of bosutinib and gefitinib showed additive and synergistic effects in cancer cell lines resistant to EGFR TKI alone. We have demonstrated a bioinformatics-driven discovery roadmap for drug repurposing and development in overcoming resistance in EGFR-mutant NSCLC, which could be generalized to other cancer types in the era of personalized medicine. K-Map is accessible at: http://tanlab.ucdenver.edu/kMap. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
The Enzyme Portal: a case study in applying user-centred design methods in bioinformatics
2013-01-01
User-centred design (UCD) is a type of user interface design in which the needs and desires of users are taken into account at each stage of the design process for a service or product; often for software applications and websites. Its goal is to facilitate the design of software that is both useful and easy to use. To achieve this, you must characterise users’ requirements, design suitable interactions to meet their needs, and test your designs using prototypes and real life scenarios. For bioinformatics, there is little practical information available regarding how to carry out UCD in practice. To address this we describe a complete, multi-stage UCD process used for creating a new bioinformatics resource for integrating enzyme information, called the Enzyme Portal (http://www.ebi.ac.uk/enzymeportal). This freely-available service mines and displays data about proteins with enzymatic activity from public repositories via a single search, and includes biochemical reactions, biological pathways, small molecule chemistry, disease information, 3D protein structures and relevant scientific literature. We employed several UCD techniques, including: persona development, interviews, ‘canvas sort’ card sorting, user workflows, usability testing and others. Our hope is that this case study will motivate the reader to apply similar UCD approaches to their own software design for bioinformatics. Indeed, we found the benefits included more effective decision-making for design ideas and technologies; enhanced team-working and communication; cost effectiveness; and ultimately a service that more closely meets the needs of our target audience. PMID:23514033
Cellular automata and its applications in protein bioinformatics.
Xiao, Xuan; Wang, Pu; Chou, Kuo-Chen
2011-09-01
With the explosion of protein sequences generated in the postgenomic era, it is highly desirable to develop high-throughput tools for rapidly and reliably identifying various attributes of uncharacterized proteins based on their sequence information alone. The knowledge thus obtained can help us timely utilize these newly found protein sequences for both basic research and drug discovery. Many bioinformatics tools have been developed by means of machine learning methods. This review is focused on the applications of a new kind of science (cellular automata) in protein bioinformatics. A cellular automaton (CA) is an open, flexible and discrete dynamic model that holds enormous potentials in modeling complex systems, in spite of the simplicity of the model itself. Researchers, scientists and practitioners from different fields have utilized cellular automata for visualizing protein sequences, investigating their evolution processes, and predicting their various attributes. Owing to its impressive power, intuitiveness and relative simplicity, the CA approach has great potential for use as a tool for bioinformatics.
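The discrete dynamics of a cellular automaton can be shown with a short Python sketch of a one-dimensional, two-state automaton. The abstract does not specify which rules or sequence encodings the reviewed methods use, so the elementary-rule numbering, the wrap-around boundary and the function names step and evolve are illustrative assumptions, not the reviewed protein methods.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    `cells` is a list of 0/1 states and `rule` is a number from 0 to 255
    whose bits give the next state for each (left, centre, right)
    neighbourhood. The row wraps around at its ends (an assumed boundary).
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def evolve(cells, rule, generations):
    """Return the initial row followed by `generations` successive rows."""
    history = [list(cells)]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history
```

Stacking the rows returned by evolve gives a two-dimensional pattern. If a protein sequence were first encoded as a row of bits (an encoding this sketch leaves open), the resulting pattern could serve as a visual or numerical fingerprint of the kind the review discusses.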
Combining medical informatics and bioinformatics toward tools for personalized medicine.
Sarachan, B D; Simmons, M K; Subramanian, P; Temkin, J M
2003-01-01
Key bioinformatics and medical informatics research areas need to be identified to advance knowledge and understanding of disease risk factors and molecular disease pathology in the 21st century toward new diagnoses, prognoses, and treatments. Three high-impact informatics areas are identified: predictive medicine (to identify significant correlations within clinical data using statistical and artificial intelligence methods), along with pathway informatics and cellular simulations (that combine biological knowledge with advanced informatics to elucidate molecular disease pathology). Initial predictive models have been developed for a pilot study in Huntington's disease. An initial bioinformatics platform has been developed for the reconstruction and analysis of pathways, and work has begun on pathway simulation. A bioinformatics research program has been established at GE Global Research Center as an important technology toward next generation medical diagnostics. We anticipate that 21st century medical research will be a combination of informatics tools with traditional biology wet lab research, and that this will translate to increased use of informatics techniques in the clinic.
BioSmalltalk: a pure object system and library for bioinformatics.
Morales, Hernán F; Giovambattista, Guillermo
2013-09-15
We have developed BioSmalltalk, a new environment system for pure object-oriented bioinformatics programming. Adaptive end-user programming systems tend to become more important for discovering biological knowledge, as is demonstrated by the emergence of open-source programming toolkits for bioinformatics in the past years. Our software is intended to bridge the gap between bioscientists and rapid software prototyping while preserving the possibility of scaling to whole-system biology applications. BioSmalltalk performs better in terms of execution time and memory usage than Biopython and BioPerl for some classical situations. BioSmalltalk is cross-platform and freely available (MIT license) through the Google Project Hosting at http://code.google.com/p/biosmalltalk. Contact: hernan.morales@gmail.com. Supplementary data are available at Bioinformatics online.
A toolbox for developing bioinformatics software
Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.
2012-01-01
Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787
An integrative computational approach for prioritization of genomic variants
Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; ...
2014-12-15
An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. This study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.
Watson-Haigh, Nathan S; Shang, Catherine A; Haimel, Matthias; Kostadima, Myrto; Loos, Remco; Deshpande, Nandan; Duesing, Konsta; Li, Xi; McGrath, Annette; McWilliam, Sean; Michnowicz, Simon; Moolhuijzen, Paula; Quenette, Steve; Revote, Jerico Nico De Leon; Tyagi, Sonika; Schneider, Maria V
2013-09-01
The widespread adoption of high-throughput next-generation sequencing (NGS) technology among the Australian life science research community is highlighting an urgent need to up-skill biologists in tools required for handling and analysing their NGS data. There is currently a shortage of cutting-edge bioinformatics training courses in Australia as a consequence of a scarcity of skilled trainers with time and funding to develop and deliver training courses. To address this, a consortium of Australian research organizations, including Bioplatforms Australia, the Commonwealth Scientific and Industrial Research Organisation and the Australian Bioinformatics Network, have been collaborating with the EMBL-EBI training team. A group of Australian bioinformaticians attended the train-the-trainer workshop to improve training skills in developing and delivering bioinformatics workshop curriculum. A 2-day NGS workshop was jointly developed to provide hands-on knowledge and understanding of typical NGS data analysis workflows. The road show-style workshop was successfully delivered at five geographically distant venues in Australia using the newly established Australian NeCTAR Research Cloud. We highlight the challenges we had to overcome at different stages from design to delivery, including the establishment of an Australian bioinformatics training network and the computing infrastructure and resource development. A virtual machine image, workshop materials and scripts for configuring a machine with workshop contents have all been made available under a Creative Commons Attribution 3.0 Unported License. This means participants continue to have convenient access to an environment they had become familiar with, and bioinformatics trainers are able to access and reuse these resources.
Watson-Haigh, Nathan S.; Shang, Catherine A.; Haimel, Matthias; Kostadima, Myrto; Loos, Remco; Deshpande, Nandan; Duesing, Konsta; Li, Xi; McGrath, Annette; McWilliam, Sean; Michnowicz, Simon; Moolhuijzen, Paula; Quenette, Steve; Revote, Jerico Nico De Leon; Tyagi, Sonika; Schneider, Maria V.
2013-01-01
The widespread adoption of high-throughput next-generation sequencing (NGS) technology among the Australian life science research community is highlighting an urgent need to up-skill biologists in tools required for handling and analysing their NGS data. There is currently a shortage of cutting-edge bioinformatics training courses in Australia as a consequence of a scarcity of skilled trainers with time and funding to develop and deliver training courses. To address this, a consortium of Australian research organizations, including Bioplatforms Australia, the Commonwealth Scientific and Industrial Research Organisation and the Australian Bioinformatics Network, have been collaborating with the EMBL-EBI training team. A group of Australian bioinformaticians attended the train-the-trainer workshop to improve training skills in developing and delivering bioinformatics workshop curriculum. A 2-day NGS workshop was jointly developed to provide hands-on knowledge and understanding of typical NGS data analysis workflows. The road show–style workshop was successfully delivered at five geographically distant venues in Australia using the newly established Australian NeCTAR Research Cloud. We highlight the challenges we had to overcome at different stages from design to delivery, including the establishment of an Australian bioinformatics training network and the computing infrastructure and resource development. A virtual machine image, workshop materials and scripts for configuring a machine with workshop contents have all been made available under a Creative Commons Attribution 3.0 Unported License. This means participants continue to have convenient access to an environment they had become familiar with, and bioinformatics trainers are able to access and reuse these resources. PMID:23543352
FY02 CBNP Annual Report Input: Bioinformatics Support for CBNP Research and Deployments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Slezak, T; Wolinsky, M
2002-10-31
The events of FY01 dynamically reprogrammed the objectives of the CBNP bioinformatics support team, to meet rapidly-changing Homeland Defense needs and requests from other agencies for assistance: Use computational techniques to determine potential unique DNA signature candidates for microbial and viral pathogens of interest to CBNP researchers and to our collaborating partner agencies such as the Centers for Disease Control and Prevention (CDC), U.S. Department of Agriculture (USDA), Department of Defense (DOD), and Food and Drug Administration (FDA). Develop effective electronic screening measures for DNA signatures to reduce the cost and time of wet-bench screening. Build a comprehensive system for tracking the development and testing of DNA signatures. Build a chain-of-custody sample tracking system for field deployment of the DNA signatures as part of the BASIS project. Provide computational tools for use by CBNP Biological Foundations researchers.
A Bioinformatics Facility for NASA
NASA Technical Reports Server (NTRS)
Schweighofer, Karl; Pohorille, Andrew
2006-01-01
Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.
Chimusa, Emile R; Mbiyavanga, Mamana; Masilela, Velaphi; Kumuthini, Judit
2015-11-01
A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The "omics" fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.
Kang, Yuan; Dong, Xinran; Zhou, Qiongjie; Zhang, Ying; Cheng, Yan; Hu, Rong; Su, Cuihong; Jin, Hong; Liu, Xiaohui; Ma, Duan; Tian, Weidong; Li, Xiaotian
2012-03-01
This study aimed to identify candidate protein biomarkers from maternal serum for Down syndrome (DS) by integrated proteomic and bioinformatics analysis. A pregnancy DS group of 18 women and a control group with the same number were prepared, and the maternal serum proteins were analyzed by isobaric tags for relative and absolute quantitation and mass spectrometry, to identify DS differentially expressed maternal serum proteins (DS-DEMSPs). Comprehensive bioinformatics analysis was then employed to analyze DS-DEMSPs both in this paper and seven related publications. Down syndrome differentially expressed maternal serum proteins from different studies are significantly enriched with common Gene Ontology functions, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, transcription factor binding sites, and Pfam protein domains. However, the DS-DEMSPs are less functionally related to known DS-related genes. This evidence suggests that common molecular mechanisms induced by secondary effects may be present in DS pregnancies. A simple scoring scheme revealed the following as potential DS biomarkers: Alpha-2-macroglobulin; Apolipoprotein A1; Apolipoprotein E; Complement C1s subcomponent; Complement component 5; Complement component 8, alpha polypeptide; Complement component 8, beta polypeptide; and Fibronectin. The integration of proteomics and bioinformatics studies provides a novel approach to develop new prenatal screening methods for noninvasive yet accurate diagnosis of DS. Copyright © 2012 John Wiley & Sons, Ltd.
Food Safety in the Age of Next Generation Sequencing, Bioinformatics, and Open Data Access.
Taboada, Eduardo N; Graham, Morag R; Carriço, João A; Van Domselaar, Gary
2017-01-01
Public health labs and food regulatory agencies globally are embracing whole genome sequencing (WGS) as a revolutionary new method that is positioned to replace numerous existing diagnostic and microbial typing technologies with a single new target: the microbial draft genome. The ability to cheaply generate large amounts of microbial genome sequence data, combined with emerging policies of food regulatory and public health institutions making their microbial sequences increasingly available and public, has served to open up the field to the general scientific community. This open data access policy shift has resulted in a proliferation of data being deposited into sequence repositories and of novel bioinformatics software designed to analyze these vast datasets. There also has been a more recent drive for improved data sharing to achieve more effective global surveillance, public health and food safety. Such developments have heightened the need for enhanced analytical systems in order to process and interpret this new type of data in a timely fashion. In this review we outline the emergence of genomics, bioinformatics and open data in the context of food safety. We also survey major efforts to translate genomics and bioinformatics technologies out of the research lab and into routine use in modern food safety labs. We conclude by discussing the challenges and opportunities that remain, including those expected to play a major role in the future of food safety science.
Educational websites--Bioinformatics Tools II.
Lomberk, Gwen
2009-01-01
In this issue, the highlighted websites are a continuation of a series of educational websites; this one in particular from a couple of years ago, Bioinformatics Tools [Pancreatology 2005;5:314-315]. These include sites that are valuable resources for many research needs in genomics and proteomics. Bioinformatics has become a laboratory tool to map sequences to databases, develop models of molecular interactions, evaluate structural compatibilities, describe differences between normal and disease-associated DNA, identify conserved motifs within proteins, and chart extensive signaling networks, all in silico. Copyright 2008 S. Karger AG, Basel and IAP.
de la Calle, Guillermo; García-Remesal, Miguel; Chiesa, Stefano; de la Iglesia, Diana; Maojo, Victor
2009-10-07
The rapid evolution of Internet technologies and the collaborative approaches that dominate the field have stimulated the development of numerous bioinformatics resources. To address this new framework, several initiatives have tried to organize these services and resources. In this paper, we present the BioInformatics Resource Inventory (BIRI), a new approach for automatically discovering and indexing available public bioinformatics resources using information extracted from the scientific literature. The index generated can be automatically updated by adding additional manuscripts describing new resources. We have developed web services and applications to test and validate our approach. It has not been designed to replace current indexes but to extend their capabilities with richer functionalities. We developed a web service to provide a set of high-level query primitives to access the index. The web service can be used by third-party web services or web-based applications. To test the web service, we created a pilot web application to access a preliminary knowledge base of resources. We tested our tool using an initial set of 400 abstracts. Almost 90% of the resources described in the abstracts were correctly classified. More than 500 descriptions of functionalities were extracted. These experiments suggest the feasibility of our approach for automatically discovering and indexing current and future bioinformatics resources. Given the domain-independent characteristics of this tool, it is currently being applied by the authors in other areas, such as medical nanoinformatics. BIRI is available at http://edelman.dia.fi.upm.es/biri/.
A quick guide for building a successful bioinformatics community.
Budd, Aidan; Corpas, Manuel; Brazas, Michelle D; Fuller, Jonathan C; Goecks, Jeremy; Mulder, Nicola J; Michaut, Magali; Ouellette, B F Francis; Pawlik, Aleksandra; Blomberg, Niklas
2015-02-01
"Scientific community" refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop "The 'How To Guide' for Establishing a Successful Bioinformatics Network" at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB).
Agents in bioinformatics, computational and systems biology.
Merelli, Emanuela; Armano, Giuliano; Cannata, Nicola; Corradini, Flavio; d'Inverno, Mark; Doms, Andreas; Lord, Phillip; Martin, Andrew; Milanesi, Luciano; Möller, Steffen; Schroeder, Michael; Luck, Michael
2007-01-01
The adoption of agent technologies and multi-agent systems constitutes an emerging area in bioinformatics. In this article, we report on the activity of the Working Group on Agents in Bioinformatics (BIOAGENTS) founded during the first AgentLink III Technical Forum meeting on the 2nd of July, 2004, in Rome. The meeting provided an opportunity for seeding collaborations between the agent and bioinformatics communities to develop a different (agent-based) approach to computational frameworks both for data analysis and management in bioinformatics and for systems modelling and simulation in computational and systems biology. The collaborations gave rise to applications and integrated tools that we summarize and discuss in the context of the state of the art in this area. We investigate future challenges and argue that the field should still be explored from many perspectives ranging from bio-conceptual languages for agent-based simulation, to the definition of bio-ontology-based declarative languages to be used by information agents, and to the adoption of agents for computational grids.
Bioinformatics for Undergraduates: Steps toward a Quantitative Bioscience Curriculum
ERIC Educational Resources Information Center
Chapman, Barbara S.; Christmann, James L.; Thatcher, Eileen F.
2006-01-01
We describe an innovative bioinformatics course developed under grants from the National Science Foundation and the California State University Program in Research and Education in Biotechnology for undergraduate biology students. The project has been part of a continuing effort to offer students classroom experiences focused on principles and…
Incorporation of Bioinformatics Exercises into the Undergraduate Biochemistry Curriculum
ERIC Educational Resources Information Center
Feig, Andrew L.; Jabri, Evelyn
2002-01-01
The field of bioinformatics is developing faster than most biochemistry textbooks can adapt. Supplementing the undergraduate biochemistry curriculum with data-mining exercises is an ideal way to expose the students to the common databases and tools that take advantage of this vast repository of biochemical information. An integrated collection of…
Exploring DNA Structure with Cn3D
ERIC Educational Resources Information Center
Porter, Sandra G.; Day, Joseph; McCarty, Richard E.; Shearn, Allen; Shingles, Richard; Fletcher, Linnea; Murphy, Stephanie; Pearlman, Rebecca
2007-01-01
Researchers in the field of bioinformatics have developed a number of analytical programs and databases that are increasingly important for advancing biological research. Because bioinformatics programs are used to analyze, visualize, and/or compare biological data, it is likely that the use of these programs will have a positive impact on biology…
2005-01-01
The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of Gagne's Conditions of Learning instructional design theory. This theory, although first published in the early 1970s, is still fundamental in instructional design and instructional technology. First, top-level as well as prerequisite learning objectives for a microarray analysis workshop and a primer design workshop were defined. Then a hierarchy of objectives for each workshop was created. Hands-on tutorials were designed to meet these objectives. Finally, events of learning proposed by Gagne's theory were incorporated into the hands-on tutorials. The resultant manuals were tested on a small number of trainees, revised, and applied in 1-day bioinformatics workshops. Based on this experience and on observations made during the workshops, we conclude that Gagne's Conditions of Learning instructional design theory provides a useful framework for developing bioinformatics training, but may not be optimal as a method for teaching it. PMID:16220141
XML schemas for common bioinformatic data types and their application in workflow systems.
Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert
2006-11-06
Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data--therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.
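As a rough illustration of why a shared XML representation helps tools exchange data, the sketch below writes a sequence record to XML and reads it back using Python's standard library. The element and attribute names (`sequenceRecord`, `residues`, `id`, `alphabet`) are invented for this example; they are not the HOBIT schemas, and the BioDOM library itself is written in Java.

```python
import xml.etree.ElementTree as ET


def sequence_to_xml(seq_id, residues, alphabet="protein"):
    """Serialise one sequence into a small XML document (hypothetical layout)."""
    record = ET.Element("sequenceRecord", {"id": seq_id, "alphabet": alphabet})
    ET.SubElement(record, "residues").text = residues
    return ET.tostring(record, encoding="unicode")


def sequence_from_xml(xml_text):
    """Parse the XML produced above back into (id, residues, alphabet)."""
    record = ET.fromstring(xml_text)
    return record.get("id"), record.findtext("residues"), record.get("alphabet")


if __name__ == "__main__":
    xml_text = sequence_to_xml("demo1", "MKTAYIAK")
    print(xml_text)
    print(sequence_from_xml(xml_text))
```

Any tool that agrees on the same layout can consume the output of any other, which is the interoperability argument made above; a formal XML schema additionally lets each tool validate its input before processing it.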
EPIGEN-Brazil Initiative resources: a Latin American imputation panel and the Scientific Workflow.
Magalhães, Wagner C S; Araujo, Nathalia M; Leal, Thiago P; Araujo, Gilderlanio S; Viriato, Paula J S; Kehdy, Fernanda S; Costa, Gustavo N; Barreto, Mauricio L; Horta, Bernardo L; Lima-Costa, Maria Fernanda; Pereira, Alexandre C; Tarazona-Santos, Eduardo; Rodrigues, Maíra R
2018-06-14
EPIGEN-Brazil is one of the largest Latin American initiatives at the interface of human genomics, public health, and computational biology. Here, we present two resources to address two challenges to the global dissemination of precision medicine and the development of the bioinformatics know-how to support it. To address the underrepresentation of non-European individuals in human genome diversity studies, we present the EPIGEN-5M+1KGP imputation panel: the fusion of the public 1000 Genomes Project (1KGP) Phase 3 imputation panel with haplotypes derived from the EPIGEN-5M data set (a product of the genotyping of 4.3 million SNPs in 265 admixed individuals from the EPIGEN-Brazil Initiative). When we imputed a target SNPs data set (6487 admixed individuals genotyped for 2.2 million SNPs from the EPIGEN-Brazil project) with the EPIGEN-5M+1KGP panel, we gained 140,452 more SNPs in total than when using the 1KGP Phase 3 panel alone and 788,873 additional high confidence SNPs (info score ≥ 0.8). Thus, the major effect of the inclusion of the EPIGEN-5M data set in this new imputation panel is not only to gain more SNPs but also to improve the quality of imputation. To address the lack of transparency and reproducibility of bioinformatics protocols, we present a conceptual Scientific Workflow in the form of a website that models the scientific process (by including publications, flowcharts, masterscripts, documents, and bioinformatics protocols), making it accessible and interactive. Its applicability is shown in the context of the development of our EPIGEN-5M+1KGP imputation panel. The Scientific Workflow also serves as a repository of bioinformatics resources. © 2018 Magalhães et al.; Published by Cold Spring Harbor Laboratory Press.
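The comparison of two imputation panels by total SNPs and by high-confidence SNPs (info score ≥ 0.8) can be sketched as follows. This is only an illustration: the function names are hypothetical and the scores are invented toy values, not EPIGEN-Brazil data. The only element taken from the abstract is the 0.8 threshold.

```python
def high_confidence(info_by_snp, threshold=0.8):
    """Return the IDs of SNPs whose info score meets the threshold."""
    return {snp for snp, info in info_by_snp.items() if info >= threshold}


def compare_panels(reference, combined, threshold=0.8):
    """Count how many SNPs, and how many high-confidence SNPs, a panel adds."""
    ref_hc = high_confidence(reference, threshold)
    comb_hc = high_confidence(combined, threshold)
    return {
        "extra_snps": len(combined) - len(reference),
        "extra_high_confidence": len(comb_hc) - len(ref_hc),
    }


# Invented toy info scores, keyed by made-up SNP identifiers.
reference_panel = {"rs1": 0.95, "rs2": 0.62, "rs3": 0.70}
combined_panel = {"rs1": 0.97, "rs2": 0.85, "rs3": 0.90, "rs4": 0.40}

summary = compare_panels(reference_panel, combined_panel)
print(summary)
```

In this toy case the combined panel adds one SNP overall but two high-confidence SNPs, because existing SNPs cross the threshold. The abstract reports the same pattern at scale: the gain in high-confidence SNPs exceeds the gain in total SNPs.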
2009-01-01
Background: The rapid advancement of computer and information technology in recent years has resulted in the rise of e-learning technologies to enhance and complement traditional classroom teaching in many fields, including bioinformatics. This paper records the experience of implementing e-learning technology to support problem-based learning (PBL) in the teaching of two undergraduate bioinformatics classes in the National University of Singapore. Results: Survey results further established the efficiency and suitability of e-learning tools to supplement PBL in bioinformatics education. 63.16% of year three bioinformatics students showed a positive response regarding the usefulness of the Learning Activity Management System (LAMS) e-learning tool in guiding the learning and discussion process involved in PBL and in enhancing the learning experience by breaking down PBL activities into a sequential workflow. On the other hand, 89.81% of year two bioinformatics students indicated that their revision process was positively impacted with the use of LAMS for guiding the learning process, while 60.19% agreed that the breakdown of activities into a sequential step-by-step workflow by LAMS enhances the learning experience. Conclusion: We show that e-learning tools are useful for supplementing PBL in bioinformatics education. The results suggest that it is feasible to develop and adopt e-learning tools to supplement a variety of instructional strategies in the future. PMID:19958511
Promoting synergistic research and education in genomics and bioinformatics.
Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping
2008-01-01
Bioinformatics and Genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demand for developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine throughout today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 international conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, the United States of America on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in the genomics, biomedicine and bioinformatics.
Biocomp 2007 provides a common platform for the cross fertilization of ideas, and helps shape knowledge and scientific achievements by bridging these two very important disciplines in an interactive and attractive forum. Keeping this objective in mind, Biocomp 2007 aims to promote interdisciplinary and multidisciplinary education and research. 25 high quality peer-reviewed papers were selected from 400+ submissions for this supplementary issue of BMC Genomics. Those papers contributed to a wide range of important research fields including gene expression data analysis and applications, high-throughput genome mapping, sequence analysis, gene regulation, protein structure prediction, disease prediction by machine learning techniques, systems biology, database and biological software development. We always encourage participants to submit proposals for genomics sessions, special interest research sessions, workshops and tutorials to Professor Hamid R. Arabnia (hra@cs.uga.edu) in order to ensure that Biocomp continuously plays the leadership role in promoting inter/multidisciplinary research and education in the fields. Biocomp received top conference ranking with a high score of 0.95/1.00. Biocomp is academically co-sponsored by the International Society of Intelligent Biological Medicine and the Research Laboratories and Centers of Harvard University--Massachusetts Institute of Technology, Indiana University--Purdue University, Georgia Tech--Emory University, UIUC, UCLA, Columbia University, University of Texas at Austin, University of Iowa, etc. Biocomp--Worldcomp brings leading scientists together across the nation and all over the world and aims to promote synergistic components such as keynote lectures, special interest sessions, workshops and tutorials in response to the advances of cutting-edge research.
SOBA: sequence ontology bioinformatics analysis.
Moore, Barry; Fan, Guozhen; Eilbeck, Karen
2010-07-01
The advent of cheaper, faster sequencing technologies has pushed the task of sequence annotation from the exclusive domain of large-scale multi-national sequencing projects to that of research laboratories and small consortia. The bioinformatics burden placed on these laboratories, some with very little programming experience, can be daunting. Fortunately, there exist software libraries and pipelines designed with these groups in mind, to ease the transition from an assembled genome to an annotated and accessible genome resource. We have developed the Sequence Ontology Bioinformatics Analysis (SOBA) tool to provide a simple statistical and graphical summary of an annotated genome. We envisage its use during annotation jamborees, in genome comparison, and by developers for rapid feedback during annotation software development and testing. SOBA also provides annotation consistency feedback to ensure correct use of terminology within annotations, and guides users to add new terms to the Sequence Ontology when required. SOBA is available at http://www.sequenceontology.org/cgi-bin/soba.cgi.
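To make the idea of a "simple statistical summary of an annotated genome" concrete, the sketch below counts features by type in GFF3-style annotation lines. It is an illustration only and not SOBA's implementation: the function name `feature_type_counts` and the sample records are hypothetical, and the only assumption taken from the GFF3 format is that each feature line has nine tab-separated columns with the feature type in the third.

```python
from collections import Counter


def feature_type_counts(gff3_lines):
    """Count annotated features by type (hypothetical helper, not part of SOBA)."""
    counts = Counter()
    for line in gff3_lines:
        line = line.strip()
        # Skip blank lines and GFF3 directives/comments, which start with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1  # column 3 holds the feature type
    return counts


# Made-up example records, for illustration only.
example = [
    "##gff-version 3",
    "chr1\tsrc\tgene\t1\t1000\t.\t+\t.\tID=gene1",
    "chr1\tsrc\tmRNA\t1\t1000\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tsrc\texon\t1\t400\t.\t+\t.\tParent=mrna1",
    "chr1\tsrc\texon\t600\t1000\t.\t+\t.\tParent=mrna1",
]
summary = feature_type_counts(example)
```

A summary of this kind also hints at how consistency feedback could work: any type in `summary` that is not a recognized ontology term could be flagged for review.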
Katayama, Toshiaki; Arakawa, Kazuharu; Nakao, Mitsuteru; Ono, Keiichiro; Aoki-Kinoshita, Kiyoko F; Yamamoto, Yasunori; Yamaguchi, Atsuko; Kawashima, Shuichi; Chun, Hong-Woo; Aerts, Jan; Aranda, Bruno; Barboza, Lord Hendrix; Bonnal, Raoul Jp; Bruskiewich, Richard; Bryne, Jan C; Fernández, José M; Funahashi, Akira; Gordon, Paul Mk; Goto, Naohisa; Groscurth, Andreas; Gutteridge, Alex; Holland, Richard; Kano, Yoshinobu; Kawas, Edward A; Kerhornou, Arnaud; Kibukawa, Eri; Kinjo, Akira R; Kuhn, Michael; Lapp, Hilmar; Lehvaslaiho, Heikki; Nakamura, Hiroyuki; Nakamura, Yasukazu; Nishizawa, Tatsuya; Nobata, Chikashi; Noguchi, Tamotsu; Oinn, Thomas M; Okamoto, Shinobu; Owen, Stuart; Pafilis, Evangelos; Pocock, Matthew; Prins, Pjotr; Ranzinger, René; Reisinger, Florian; Salwinski, Lukasz; Schreiber, Mark; Senger, Martin; Shigemoto, Yasumasa; Standley, Daron M; Sugawara, Hideaki; Tashiro, Toshiyuki; Trelles, Oswaldo; Vos, Rutger A; Wilkinson, Mark D; York, William; Zmasek, Christian M; Asai, Kiyoshi; Takagi, Toshihisa
2010-08-21
Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.
2010-01-01
Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies. PMID:20727200
Implementation and Assessment of a Molecular Biology and Bioinformatics Undergraduate Degree Program
ERIC Educational Resources Information Center
Pham, Daphne Q. -D.; Higgs, David C.; Statham, Anne; Schleiter, Mary Kay
2008-01-01
The Department of Biological Sciences at the University of Wisconsin-Parkside has developed and implemented an innovative, multidisciplinary undergraduate curriculum in Molecular Biology and Bioinformatics (MBB). The objective of the MBB program is to give students a hands-on facility with molecular biology theories and laboratory techniques, an…
Learning Genetics through an Authentic Research Simulation in Bioinformatics
ERIC Educational Resources Information Center
Gelbart, Hadas; Yarden, Anat
2006-01-01
Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…
Ramping up to the Biology Workbench: A Multi-Stage Approach to Bioinformatics Education
ERIC Educational Resources Information Center
Greene, Kathleen; Donovan, Sam
2005-01-01
In the process of designing and field-testing bioinformatics curriculum materials, we have adopted a three-stage, progressive model that emphasizes collaborative scientific inquiry. The elements of the model include: (1) context setting, (2) introduction to concepts, processes, and tools, and (3) development of competent use of technologically…
Bioinformatics on the cloud computing platform Azure.
Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P
2014-01-01
We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development.
Bioinformatics on the Cloud Computing Platform Azure
Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.
2014-01-01
We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811
Teaching Bioinformatics in Concert
Goodman, Anya L.; Dekhtyar, Alex
2014-01-01
Can biology students without programming skills solve problems that require computational solutions? They can if they learn to cooperate effectively with computer science students. The goal of the in-concert teaching approach is to introduce biology students to computational thinking by engaging them in collaborative projects structured around the software development process. Our approach emphasizes development of interdisciplinary communication and collaboration skills for both life science and computer science students. PMID:25411792
"Extreme Programming" in a Bioinformatics Class
ERIC Educational Resources Information Center
Kelley, Scott; Alger, Christianna; Deutschman, Douglas
2009-01-01
The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…
Chattopadhyay, Ansuman; Tannery, Nancy Hrinya; Silverman, Deborah A. L.; Bergen, Phillip; Epstein, Barbara A.
2006-01-01
Setting: In summer 2002, the Health Sciences Library System (HSLS) at the University of Pittsburgh initiated an information service in molecular biology and genetics to assist researchers with identifying and utilizing bioinformatics tools. Program Components: This novel information service comprises hands-on training workshops and consultation on the use of bioinformatics tools. The HSLS also provides an electronic portal and networked access to public and commercial molecular biology databases and software packages. Evaluation Mechanisms: Researcher feedback gathered during the first three years of workshops and individual consultation indicate that the information service is meeting user needs. Next Steps/Future Directions: The service's workshop offerings will expand to include emerging bioinformatics topics. A frequently asked questions database is also being developed to reuse advice on complex bioinformatics questions. PMID:16888665
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, Ronald C.
Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
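The MapReduce programming style named in this abstract can be illustrated with a small in-memory sketch that counts k-mers across sequencing reads. This is plain single-process Python, not Hadoop code and not taken from the article; the function names and the toy reads are hypothetical. In a real Hadoop job the map and reduce steps would run in parallel across cluster nodes, and the framework would perform the grouping step itself.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: combine the values for one key into a total count."""
    return key, sum(values)


def kmer_count(reads, k=3):
    """Run map, shuffle and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_read(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Toy reads, made up for illustration.
counts = kmer_count(["ACGTAC", "GTACGT"], k=3)
```

Because each read is mapped independently and each key is reduced independently, both steps can be spread across machines, which is the property that lets this style scale to very large data sets.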
Clinical Bioinformatics: challenges and opportunities
2012-01-01
Background Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and to their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011 was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods In this editorial, the viewpoints and opinions of three world CBI leaders, who have been invited to participate in a panel discussion of the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools to implement efficient search computing solutions. Results Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, integrated search and retrieval of -omics information. Conclusions Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput "-omics" technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research. PMID:23095472
A Quick Guide for Building a Successful Bioinformatics Community
Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas
2015-01-01
“Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371
Biowep: a workflow enactment portal for bioinformatics applications.
Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano
2007-03-08
The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of researchers, who lack these skills. A portal enabling these researchers to benefit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. The main processing steps of workflows are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers.
The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics - LITBIO.
Biowep: a workflow enactment portal for bioinformatics applications
Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano
2007-01-01
Background The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools make the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable for the majority of researchers, who lack these skills. A portal enabling these researchers to benefit from new technologies is still missing. Results We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. The main processing steps of workflows are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports user authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion We developed a web system that supports the selection and execution of predefined workflows, thus simplifying access for all researchers.
The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of effective workflows can significantly improve automation of in-silico analysis. Biowep is available for interested researchers as a reference portal. They are invited to submit their workflows to the workflow repository. Biowep is further being developed in the sphere of the Laboratory of Interdisciplinary Technologies in Bioinformatics – LITBIO. PMID:17430563
The GMOD Drupal bioinformatic server framework.
Papanicolaou, Alexie; Heckel, David G
2010-12-15
Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.
Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai
2017-11-23
The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
Bioinformatic approaches to interrogating vitamin D receptor signaling.
Campbell, Moray J
2017-09-15
Bioinformatics applies unbiased approaches to develop statistically-robust insight into health and disease. At the global, or "20,000 foot" view, bioinformatic analyses of vitamin D receptor (NR1I1/VDR) signaling can measure where the VDR gene or protein exerts a genome-wide significant impact on biology; VDR is significantly implicated in bone biology and immune systems, but not in cancer. With a more VDR-centric, or "2000 foot" view, bioinformatic approaches can interrogate events downstream of VDR activity. Integrative approaches can combine VDR ChIP-Seq in cell systems where significant volumes of publicly available data exist. For example, VDR ChIP-Seq studies can be combined with genome-wide association studies to reveal significant associations to immune phenotypes. Similarly, VDR ChIP-Seq can be combined with data from The Cancer Genome Atlas (TCGA) to infer the impact of VDR target genes in cancer progression. Therefore, bioinformatic approaches can reveal what aspects of VDR downstream networks are significantly related to disease or phenotype. Copyright © 2017 The Author. Published by Elsevier B.V. All rights reserved.
Pitassi, Claudio; Gonçalves, Antonio Augusto; Moreno Júnior, Valter de Assis
2014-01-01
The scope of this article is to identify and analyze the factors that influence the adoption of ICT tools in experiments with bioinformatics at the Brazilian Cancer Institute (INCA). It involves a descriptive and exploratory qualitative field study. Evidence was collected mainly based on in-depth interviews with the management team at the Research Center and the IT Division. The answers were analyzed using the categorical content method. The categories were selected from the scientific literature and consolidated in the Technology-Organization-Environment (TOE) framework created for this study. The model proposed made it possible to demonstrate how the factors selected impacted INCA's adoption of bioinformatics systems and tools, contributing to the investigation of two critical areas for the development of the health industry in Brazil, namely technological innovation and bioinformatics. Based on the evidence collected, a research question was posed: to what extent can the alignment of the factors related to the adoption of ICT tools in experiments with bioinformatics increase the innovation capacity of a Brazilian biopharmaceutical organization?
ERIC Educational Resources Information Center
Grunwald, Sandra K.; Krueger, Katherine J.
2008-01-01
Laboratory exercises, which utilize alkaline phosphatase as a model enzyme, have been developed and used extensively in undergraduate biochemistry courses to illustrate enzyme steady-state kinetics. A bioinformatics laboratory exercise for the biochemistry laboratory, which complements the traditional alkaline phosphatase kinetics exercise, was…
Using Kepler for Tool Integration in Microarray Analysis Workflows.
Gan, Zhuohui; Stowe, Jennifer C; Altintas, Ilkay; McCulloch, Andrew D; Zambon, Alexander C
Increasing numbers of genomic technologies are leading to massive amounts of genomic data, all of which require complex analysis. More and more bioinformatics analysis tools are being developed by scientists to simplify these analyses. However, different pipelines have been developed using different software environments. This makes integration of these diverse bioinformatics tools difficult. Kepler provides an open source environment to integrate these disparate packages. Using Kepler, we integrated several external tools, including Bioconductor packages, AltAnalyze (a Python-based open source tool), and an R-based comparison tool, to build an automated workflow to meta-analyze both online and local microarray data. The automated workflow connects the integrated tools seamlessly, delivers data flow between the tools smoothly, and hence improves efficiency and accuracy of complex data analyses. Our workflow exemplifies the usage of Kepler as a scientific workflow platform for bioinformatics pipelines.
Tong, Weida; Harris, Stephen C; Fang, Hong; Shi, Leming; Perkins, Roger; Goodsaid, Federico; Frueh, Felix W
2007-01-01
Pharmacogenomics (PGx) is identified in the FDA Critical Path document as a major opportunity for advancing medical product development and personalized medicine. An integrated bioinformatics infrastructure for use in FDA data review is crucial to realize the benefits of PGx for public health. We have developed an integrated bioinformatics tool, called ArrayTrack, for managing, analyzing and interpreting genomic and other biomarker data (e.g. proteomic and metabolomic data). ArrayTrack is a highly flexible and robust software platform that can evolve with technological advances and changing user needs. ArrayTrack is used in the routine review of genomic data submitted to the FDA; here, three hypothetical examples of its use in the Voluntary eXploratory Data Submission (VXDS) program are illustrated. © Published by Elsevier Ltd.
Influenza Research Database: An integrated bioinformatics resource for influenza virus research.
Zhang, Yun; Aevermann, Brian D; Anderson, Tavis K; Burke, David F; Dauphin, Gwenaelle; Gu, Zhiping; He, Sherry; Kumar, Sanjeev; Larsen, Christopher N; Lee, Alexandra J; Li, Xiaomei; Macken, Catherine; Mahaffey, Colin; Pickett, Brett E; Reardon, Brian; Smith, Thomas; Stewart, Lucy; Suloway, Christian; Sun, Guangyu; Tong, Lei; Vincent, Amy L; Walters, Bryan; Zaremba, Sam; Zhao, Hongtao; Zhou, Liwei; Zmasek, Christian; Klem, Edward B; Scheuermann, Richard H
2017-01-04
The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics and therapeutics against influenza virus by providing a comprehensive collection of influenza-related data integrated from various sources, a growing suite of analysis and visualization tools for data mining and hypothesis generation, personal workbench spaces for data storage and sharing, and active user community support. Here, we describe the recent improvements in IRD including the use of cloud and high performance computing resources, analysis and visualization of user-provided sequence data with associated metadata, predictions of novel variant proteins, annotations of phenotype-associated sequence markers and their predicted phenotypic effects, hemagglutinin (HA) clade classifications, an automated tool for HA subtype numbering conversion, linkouts to disease event data and the addition of host factor and antiviral drug components. All data and tools are freely available without restriction from the IRD website at https://www.fludb.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Bioinformatics: perspectives for the future.
Costa, Luciano da Fontoura
2004-12-30
I give here a very personal perspective of Bioinformatics and its future, starting by discussing the origin of the term (and area) of bioinformatics and proceeding by trying to foresee the development of related issues, including pattern recognition/data mining, the need to reintegrate biology, the potential of complex networks as a powerful and flexible framework for bioinformatics and the interplay between bio- and neuroinformatics. Human resource formation and market perspective are also addressed. Given the complexity and vastness of these issues and concepts, as well as the limited size of a scientific article and finite patience of the reader, these perspectives are surely incomplete and biased. However, it is expected that some of the questions and trends that are identified will motivate discussions during the IcoBiCoBi round table (with the same name as this article) and perhaps provide a more ample perspective among the participants of that conference and the readers of this text.
ENFIN a network to enhance integrative systems biology.
Kahlem, Pascal; Birney, Ewan
2007-12-01
Integration of biological data of various types and development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing both an adapted infrastructure to connect databases and platforms to enable the generation of new bioinformatics tools as well as the experimental validation of computational predictions. We will give an overview of the projects tackled within ENFIN and discuss the challenges associated with integration for systems biology.
Nawrocki, Eric P.; Burge, Sarah W.
2013-01-01
The development of RNA bioinformatic tools began more than 30 y ago with the description of the Nussinov and Zuker dynamic programming algorithms for single sequence RNA secondary structure prediction. Since then, many tools have been developed for various RNA sequence analysis problems such as homology search, multiple sequence alignment, de novo RNA discovery, read-mapping, and many more. In this issue, we have collected a sampling of reviews and original research that demonstrate some of the many ways bioinformatics is integrated with current RNA biology research. PMID:23948768
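The Nussinov-style dynamic programming idea mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the cited paper: the function name `nussinov_max_pairs`, the minimum hairpin loop length `min_loop`, and the decision to allow G-U wobble pairs are all assumptions made for illustration. The sketch only counts the maximum number of nested base pairs and does not recover the structure itself.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence.

    Assumptions (not from the abstract): Watson-Crick plus G-U wobble
    pairs are allowed, and at least `min_loop` unpaired bases must
    separate the two partners of any pair.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(nussinov_max_pairs("GGGAAAUCC"))
```

The table is filled by increasing subsequence length, so every smaller interval is already solved when it is needed; the run time is cubic in the sequence length.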
Towards barcode markers in Fungi: an intron map of Ascomycota mitochondria.
Santamaria, Monica; Vicario, Saverio; Pappadà, Graziano; Scioscia, Gaetano; Scazzocchio, Claudio; Saccone, Cecilia
2009-06-16
A standardized and cost-effective molecular identification system is now an urgent need for Fungi owing to their wide involvement in human life quality. In particular, the potential use of mitochondrial DNA species markers has been taken into account. Unfortunately, a serious difficulty in the PCR and bioinformatic surveys is due to the presence of mobile introns in almost all the fungal mitochondrial genes. The aim of this work is to verify the incidence of this phenomenon in Ascomycota, testing, at the same time, a new bioinformatic tool for extracting and managing sequence database annotations, in order to identify the mitochondrial gene regions where introns are missing so as to propose them as species markers. The general trend towards a large occurrence of introns in the mitochondrial genome of Fungi has been confirmed in Ascomycota by an extensive bioinformatic analysis, performed on all the entries concerning 11 mitochondrial protein coding genes and 2 mitochondrial rRNA (ribosomal RNA) specifying genes, belonging to this phylum, available in public nucleotide sequence databases. A new query approach has been developed to effectively retrieve the intron information included in these entries. After comparing the new query-based approach with a BLAST-based procedure, with the aim of designing a faithful Ascomycota mitochondrial intron map, the first method appeared clearly the more accurate. Within this map, despite the large pervasiveness of introns, it is possible to distinguish specific regions comprised in several genes, including the full NADH dehydrogenase subunit 6 (ND6) gene, which could be considered as barcode candidates for Ascomycota due to their paucity of introns and to their length, above 400 bp, comparable to the lower end size of the length range of barcodes successfully used in animals.
The development of the new query system described here would answer the pressing requirement to improve drastically the bioinformatics support to the DNA Barcode Initiative. The large scale investigation of Ascomycota mitochondrial introns performed through this tool, allowing to exclude the introns-rich sequences from the barcode candidates exploration, could be the first step towards a mitochondrial barcoding strategy for these organisms, similar to the standard approach employed in metazoans.
Costa, José Hélio; Arnholdt-Schmitt, Birgit
2017-01-01
The alternative oxidase (AOX) gene family is a hot candidate for functional marker development that could help plant breeding on yield stability through more robust plants based on multi-stress tolerance. However, there is missing knowledge on the interplay between gene family members that might interfere with the efficiency of marker development. It is a common view that AOX1 and AOX2 have different physiological roles. Nevertheless, both family member groups act in terms of molecular-biochemical function as "typical" alternative oxidases, and co-regulation of AOX1 and AOX2 has been reported. Although conserved sequence differences have been identified, the basis for differential effects on physiology regulation is not sufficiently explored. This protocol gives instructions for a bioinformatics approach that supports discovering potential interaction of AOX family members in regulating growth and development. It further provides a strategy to elucidate the relevance of gene sequence diversity and copy number variation for final functionality in target tissues and finally the whole plant. Thus, overall this protocol provides the means for efficiently identifying plant AOX variants as functional marker candidates related to growth and development.
G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS
Hu, Rongdong; Liu, Guangming; Jiang, Jingfei; Wang, Lixin
2015-01-01
Cloud computing has started to change the way bioinformatics research is carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables applications to output results in real time. Its main purpose is to guarantee application performance while improving resource utilization. Real sequence searching data from BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees application performance while saving up to 20.14% of resources. PMID:26504488
ERIC Educational Resources Information Center
Honts, Jerry E.
2003-01-01
Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…
Strategies for Using Peer-Assisted Learning Effectively in an Undergraduate Bioinformatics Course
ERIC Educational Resources Information Center
Shapiro, Casey; Ayon, Carlos; Moberg-Parker, Jordan; Levis-Fitzgerald, Marc; Sanders, Erin R.
2013-01-01
This study used a mixed methods approach to evaluate hybrid peer-assisted learning approaches incorporated into a bioinformatics tutorial for a genome annotation research project. Quantitative and qualitative data were collected from undergraduates who enrolled in a research-based laboratory course during two different academic terms at UCLA.…
Kroll, Torsten; Schmidt, David; Schwanitz, Georg; Ahmad, Mubashir; Hamann, Jana; Schlosser, Corinne; Lin, Yu-Chieh; Böhm, Konrad J; Tuckermann, Jan; Ploubidou, Aspasia
2016-07-01
High-content analysis (HCA) converts raw light microscopy images to quantitative data through the automated extraction, multiparametric analysis, and classification of the relevant information content. Combined with automated high-throughput image acquisition, HCA applied to the screening of chemicals or RNAi-reagents is termed high-content screening (HCS). Its power in quantifying cell phenotypes makes HCA applicable also to routine microscopy. However, developing effective HCA and bioinformatic analysis pipelines for acquisition of biologically meaningful data in HCS is challenging. Here, the step-by-step development of an HCA assay protocol and an HCS bioinformatics analysis pipeline are described. The protocol's power is demonstrated by application to focal adhesion (FA) detection, quantitative analysis of multiple FA features, and functional annotation of signaling pathways regulating FA size, using primary data of a published RNAi screen. The assay and the underlying strategy are aimed at researchers performing microscopy-based quantitative analysis of subcellular features, on a small scale or in large HCS experiments. Copyright © 2016 John Wiley & Sons, Inc.
Bioinformatics: indispensable, yet hidden in plain sight?
Bartlett, Andrew; Penders, Bart; Lewis, Jamie
2017-06-21
Bioinformatics has multitudinous identities, organisational alignments and disciplinary links. This variety allows bioinformaticians and bioinformatic work to contribute to much (if not most) of life science research in profound ways. The multitude of bioinformatic work also translates into a multitude of credit-distribution arrangements, apparently dismissing that work. We report on the epistemic and social arrangements that characterise the relationship between bioinformatics and life science. We describe, in sociological terms, the character, power and future of bioinformatic work. The character of bioinformatic work is such that its cultural, institutional and technical structures allow for it to be black-boxed easily. The result is that bioinformatic expertise and contributions travel easily and quickly, yet remain largely uncredited. The power of bioinformatic work is shaped by its dependency on life science work, which combined with the black-boxed character of bioinformatic expertise further contributes to situating bioinformatics on the periphery of the life sciences. Finally, the imagined futures of bioinformatic work suggest that bioinformatics will become ever more indispensable without necessarily becoming more visible, forcing bioinformaticians into difficult professional and career choices. Bioinformatic expertise and labour is epistemically central but often institutionally peripheral. In part, this is a result of the ways in which the character, power distribution and potential futures of bioinformatics are constituted. However, alternative paths can be imagined.
ERIC Educational Resources Information Center
Alyuruk, Hakan; Cavas, Levent
2014-01-01
Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…
Development of a Web-Enabled Informatics Platform for Manipulation of Gene Expression Data
2004-12-01
genomic platforms such as metabolomics and proteomics, and to federated databases for knowledge management. A successful SBIR Phase I completed ... measurements that require sophisticated bioinformatic platforms for data archival, management, integration, and analysis if researchers are to derive ... web-enabled bioinformatic platform consisting of a Laboratory Information Management System (LIMS), an Analysis Information Management System (AIMS
ERIC Educational Resources Information Center
Mello, Luciane V.; Tregilgas, Luke; Cowley, Gwen; Gupta, Anshul; Makki, Fatima; Jhutty, Anjeet; Shanmugasundram, Achchuthan
2017-01-01
Teaching bioinformatics is a longstanding challenge for educators who need to demonstrate to students how skills developed in the classroom may be applied to real world research. This study employed an action research methodology which utilised student-staff partnership and peer-learning. It was centred on the experiences of peer-facilitators,…
VLSI Microsystem for Rapid Bioinformatic Pattern Recognition
NASA Technical Reports Server (NTRS)
Fang, Wai-Chi; Lue, Jaw-Chyng
2009-01-01
A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).
ISMB 2016 offers outstanding science, networking, and celebration
Fogg, Christiana
2016-01-01
The annual international conference on Intelligent Systems for Molecular Biology (ISMB) is the major meeting of the International Society for Computational Biology (ISCB). Over the past 23 years the ISMB conference has grown to become the world's largest bioinformatics/computational biology conference. ISMB 2016 will be the year's most important computational biology event globally. The conferences provide a multidisciplinary forum for disseminating the latest developments in bioinformatics/computational biology. ISMB brings together scientists from computer science, molecular biology, mathematics, statistics and related fields. Its principal focus is on the development and application of advanced computational methods for biological problems. ISMB 2016 offers the strongest scientific program and the broadest scope of any international bioinformatics/computational biology conference. Building on past successes, the conference is designed to cater to a variety of disciplines within the bioinformatics/computational biology community. ISMB 2016 takes place July 8-12 at the Swan and Dolphin Hotel in Orlando, Florida, United States. For two days preceding the conference, additional opportunities including Satellite Meetings, Student Council Symposium, and a selection of Special Interest Group Meetings and Applied Knowledge Exchange Sessions (AKES) are all offered to enable registered participants to learn more about the latest methods and tools within specialty research areas. PMID:27347392
Jones, Bethan M; Edwards, Richard J; Skipp, Paul J; O'Connor, C David; Iglesias-Rodriguez, M Debora
2011-06-01
Emiliania huxleyi is a unicellular marine phytoplankton species known to play a significant role in global biogeochemistry. Through the dual roles of photosynthesis and production of calcium carbonate (calcification), carbon is transferred from the atmosphere to ocean sediments. Almost nothing is known about the molecular mechanisms that control calcification, a process that is tightly regulated within the cell. To initiate proteomic studies on this important and phylogenetically remote organism, we have devised efficient protein extraction protocols and developed a bioinformatics pipeline that allows the statistically robust assignment of proteins from MS/MS data using preexisting EST sequences. The bioinformatics tool, termed BUDAPEST (Bioinformatics Utility for Data Analysis of Proteomics using ESTs), is fully automated and was used to search against data generated from three strains. BUDAPEST increased the number of identifications over standard protein database searches from 37 to 99 proteins when data were amalgamated. Proteins involved in diverse cellular processes were uncovered. For example, experimental evidence was obtained for a novel type I polyketide synthase and for various photosystem components. The proteomic and bioinformatic approaches developed in this study are of wider applicability, particularly to the oceanographic community where genomic sequence data for species of interest are currently scarce.
The GMOD Drupal Bioinformatic Server Framework
Papanicolaou, Alexie; Heckel, David G.
2010-01-01
Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988
PCDDB: new developments at the Protein Circular Dichroism Data Bank.
Whitmore, Lee; Miles, Andrew John; Mavridis, Lazaros; Janes, Robert W; Wallace, B A
2017-01-04
The Protein Circular Dichroism Data Bank (PCDDB) has been in operation for more than 5 years as a public repository for archiving circular dichroism spectroscopic data and associated bioinformatics and experimental metadata. Since its inception, many improvements and new developments have been made in data display, searching algorithms, data formats, data content, auxiliary information, and validation techniques, as well as, of course, an increase in the number of holdings. It provides a site (http://pcddb.cryst.bbk.ac.uk) for authors to deposit experimental data as well as detailed information on methods and calculations associated with published work. It also includes links for each entry to bioinformatics databases. The data are freely available to accessors either as single files or as complete data bank downloads. The PCDDB has found broad usage by the structural biology, bioinformatics, analytical and pharmaceutical communities, and has formed the basis for new software and methods developments. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Minie, Mark; Bowers, Stuart; Tarczy-Hornoch, Peter; Roberts, Edward; James, Rose A.; Rambo, Neil; Fuller, Sherrilynne
2006-01-01
Setting: The University of Washington Health Sciences Libraries and Information Center BioCommons serves the bioinformatics needs of researchers at the university and in the vibrant for-profit and not-for-profit biomedical research sector in the Washington area and region. Program Components: The BioCommons comprises services addressing internal University of Washington, not-for-profit, for-profit, and regional and global clientele. The BioCommons is maintained and administered by the BioResearcher Liaison Team. The BioCommons architecture provides a highly flexible structure for adapting to rapidly changing resources and needs. Evaluation Mechanisms: BioCommons uses Web-based pre- and post-course evaluations and periodic user surveys to assess service effectiveness. Recent surveys indicate substantial usage of BioCommons services and a high level of effectiveness and user satisfaction. Next Steps/Future Directions: BioCommons is developing novel collaborative Web resources to distribute bioinformatics tools and is experimenting with Web-based competency training in bioinformation resource use. PMID:16888667
A vision for collaborative training infrastructure for bioinformatics.
Williams, Jason J; Teal, Tracy K
2017-01-01
In biology, a missing link connecting data generation and data-driven discovery is the training that prepares researchers to effectively manage and analyze data. National and international cyberinfrastructure along with evolving private sector resources place biologists and students within reach of the tools needed for data-intensive biology, but training is still required to make effective use of them. In this concept paper, we review a number of opportunities and challenges that can inform the creation of a national bioinformatics training infrastructure capable of servicing the large number of emerging and existing life scientists. While college curricula are slower to adapt, grassroots startup-spirited organizations, such as Software and Data Carpentry, have made impressive inroads in training on the best practices of software use, development, and data analysis. Given the transformative potential of biology and medicine as full-fledged data sciences, more support is needed to organize, amplify, and assess these efforts and their impacts. © 2016 New York Academy of Sciences.
Yu, J; Blom, J; Glaeser, S P; Jaenicke, S; Juhre, T; Rupp, O; Schwengers, O; Spänig, S; Goesmann, A
2017-11-10
The rapid development of next generation sequencing technology has greatly increased the amount of available microbial genomes. As a result of this development, there is a rising demand for fast and automated approaches in analyzing these genomes in a comparative way. Whole genome sequencing also bears a huge potential for obtaining a higher resolution in phylogenetic and taxonomic classification. During the last decade, several software tools and platforms have been developed in the field of comparative genomics. In this manuscript, we review the most commonly used platforms and approaches for ortholog group analyses with a focus on their potential for phylogenetic and taxonomic research. Furthermore, we describe the latest improvements of the EDGAR platform for comparative genome analyses and present recent examples of its application for the phylogenomic analysis of different taxa. Finally, we illustrate the role of the EDGAR platform as part of the BiGi Center for Microbial Bioinformatics within the German network on Bioinformatics Infrastructure (de.NBI). Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Implementation of Cloud based next generation sequencing data analysis in a clinical laboratory.
Onsongo, Getiria; Erdmann, Jesse; Spears, Michael D; Chilton, John; Beckman, Kenneth B; Hauge, Adam; Yohe, Sophia; Schomaker, Matthew; Bower, Matthew; Silverstein, Kevin A T; Thyagarajan, Bharat
2014-05-23
The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, though several challenges remain that limit the widespread adoption of NGS testing in clinical practice. One such difficulty is the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive for most clinical diagnostics laboratories. To address this challenge, our institution has developed a Galaxy-based data analysis pipeline which relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides the additional flexibility needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the use of an EBS disk to run a sample. We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants are confirmed using Sanger sequencing, further validating the software.
Mathematics and evolutionary biology make bioinformatics education comprehensible.
Jungck, John R; Weisstein, Anton E
2013-09-01
The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes-the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software-the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.
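As a small illustration of the tree enumeration topic named in this abstract, the sketch below counts fully bifurcating tree topologies for a given number of labelled taxa using the standard double-factorial results, (2n-5)!! for unrooted and (2n-3)!! for rooted trees. The function names are hypothetical and the code is not drawn from the BioQUEST tools the abstract mentions.

```python
def double_factorial(m):
    """Product m * (m - 2) * (m - 4) * ... down to 1 or 2 (1 if m <= 1)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def count_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating topologies: (2n - 5)!! for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa):
    """Number of rooted bifurcating topologies: (2n - 3)!! for n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted tree")
    return double_factorial(2 * n_taxa - 3)


for n in (4, 5, 10):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

The rapid growth of these counts (already more than two million unrooted topologies for ten taxa) is the usual argument for why exhaustive tree search is impractical and heuristic strategies are needed.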
Mathematics and evolutionary biology make bioinformatics education comprehensible
Weisstein, Anton E.
2013-01-01
The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621
Patel, Ashokkumar A; Gilbertson, John R; Showe, Louise C; London, Jack W; Ross, Eric; Ochs, Michael F; Carver, Joseph; Lazarus, Andrea; Parwani, Anil V; Dhir, Rajiv; Beck, J Robert; Liebman, Michael; Garcia, Fernando U; Prichard, Jeff; Wilkerson, Myra; Herberman, Ronald B; Becich, Michael J
2007-06-08
The Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC, http://www.pcabc.upmc.edu) is one of the first major project-based initiatives stemming from the Pennsylvania Cancer Alliance that was funded for four years by the Department of Health of the Commonwealth of Pennsylvania. The objective was to initiate a prototype biorepository and bioinformatics infrastructure with a robust data warehouse by developing (1) a statewide data model for bioinformatics and a repository of serum and tissue samples; (2) a data model for biomarker data storage; and (3) a public access website for disseminating research results and bioinformatics tools. The members of the Consortium cooperate closely, exploring the opportunity for sharing clinical, genomic and other bioinformatics data on patient samples in oncology, for the purpose of developing collaborative research programs across cancer research institutions in Pennsylvania. The Consortium's intention was to establish a virtual repository of many clinical specimens residing in various centers across the state, in order to make them available for research. One of our primary goals was to facilitate the identification of cancer-specific biomarkers and encourage collaborative research efforts among the participating centers. The PCABC has developed unique partnerships so that every region of the state can effectively contribute and participate. It includes over 80 individuals from 14 organizations, and plans to expand to partners outside the State. This has created a network of researchers, clinicians, bioinformaticians, cancer registrars, program directors, and executives from academic and community health systems, as well as external corporate partners - all working together to accomplish a common mission. 
The various sub-committees have developed a common IRB protocol template, common data elements for standardizing data collections for three organ sites, intellectual property/tech transfer agreements, and material transfer agreements that have been approved by each of the member institutions. This was the foundational work that has led to the development of a centralized data warehouse that has met each of the institutions' IRB/HIPAA standards. Currently, this "virtual biorepository" has over 58,000 annotated samples from 11,467 cancer patients available for research purposes. The clinical annotation of tissue samples is either done manually over the internet or semi-automated batch modes through mapping of local data elements with PCABC common data elements. The database currently holds information on 7188 cases (associated with 9278 specimens and 46,666 annotated blocks and blood samples) of prostate cancer, 2736 cases (associated with 3796 specimens and 9336 annotated blocks and blood samples) of breast cancer and 1543 cases (including 1334 specimens and 2671 annotated blocks and blood samples) of melanoma. These numbers continue to grow, and plans to integrate new tumor sites are in progress. Furthermore, the group has also developed a central web-based tool that allows investigators to share their translational (genomics/proteomics) experiment data on research evaluating potential biomarkers via a central location on the Consortium's web site. The technological achievements and the statewide informatics infrastructure that have been established by the Consortium will enable robust and efficient studies of biomarkers and their relevance to the clinical course of cancer. 
Studies resulting from the creation of the Consortium may allow for better classification of cancer types, more accurate assessment of disease prognosis, a better ability to identify the most appropriate individuals for clinical trial participation, and better surrogate markers of disease progression and/or response to therapy.
Patel, Ashokkumar A.; Gilbertson, John R.; Showe, Louise C.; London, Jack W.; Ross, Eric; Ochs, Michael F.; Carver, Joseph; Lazarus, Andrea; Parwani, Anil V.; Dhir, Rajiv; Beck, J. Robert; Liebman, Michael; Garcia, Fernando U.; Prichard, Jeff; Wilkerson, Myra; Herberman, Ronald B.; Becich, Michael J.
2007-01-01
Background: The Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC, http://www.pcabc.upmc.edu) is one of the first major project-based initiatives stemming from the Pennsylvania Cancer Alliance that was funded for four years by the Department of Health of the Commonwealth of Pennsylvania. The objective was to initiate a prototype biorepository and bioinformatics infrastructure with a robust data warehouse by developing (1) a statewide data model for bioinformatics and a repository of serum and tissue samples; (2) a data model for biomarker data storage; and (3) a public access website for disseminating research results and bioinformatics tools. The members of the Consortium cooperate closely, exploring the opportunity for sharing clinical, genomic and other bioinformatics data on patient samples in oncology, for the purpose of developing collaborative research programs across cancer research institutions in Pennsylvania. The Consortium’s intention was to establish a virtual repository of many clinical specimens residing in various centers across the state, in order to make them available for research. One of our primary goals was to facilitate the identification of cancer-specific biomarkers and encourage collaborative research efforts among the participating centers. Methods: The PCABC has developed unique partnerships so that every region of the state can effectively contribute and participate. It includes over 80 individuals from 14 organizations, and plans to expand to partners outside the State. This has created a network of researchers, clinicians, bioinformaticians, cancer registrars, program directors, and executives from academic and community health systems, as well as external corporate partners - all working together to accomplish a common mission. 
The various sub-committees have developed a common IRB protocol template, common data elements for standardizing data collections for three organ sites, intellectual property/tech transfer agreements, and material transfer agreements that have been approved by each of the member institutions. This was the foundational work that has led to the development of a centralized data warehouse that has met each of the institutions’ IRB/HIPAA standards. Results: Currently, this “virtual biorepository” has over 58,000 annotated samples from 11,467 cancer patients available for research purposes. The clinical annotation of tissue samples is either done manually over the internet or semi-automated batch modes through mapping of local data elements with PCABC common data elements. The database currently holds information on 7188 cases (associated with 9278 specimens and 46,666 annotated blocks and blood samples) of prostate cancer, 2736 cases (associated with 3796 specimens and 9336 annotated blocks and blood samples) of breast cancer and 1543 cases (including 1334 specimens and 2671 annotated blocks and blood samples) of melanoma. These numbers continue to grow, and plans to integrate new tumor sites are in progress. Furthermore, the group has also developed a central web-based tool that allows investigators to share their translational (genomics/proteomics) experiment data on research evaluating potential biomarkers via a central location on the Consortium’s web site. Conclusions: The technological achievements and the statewide informatics infrastructure that have been established by the Consortium will enable robust and efficient studies of biomarkers and their relevance to the clinical course of cancer. 
Studies resulting from the creation of the Consortium may allow for better classification of cancer types, more accurate assessment of disease prognosis, a better ability to identify the most appropriate individuals for clinical trial participation, and better surrogate markers of disease progression and/or response to therapy. PMID:19455246
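The per-site figures quoted in this abstract can be cross-checked against the stated totals. The following is only an arithmetic sketch: the numbers are copied from the abstract, while the dictionary layout and variable names are illustrative assumptions and do not reflect the actual PCABC data model.

```python
# Per-site counts as quoted in the abstract:
# (cases, specimens, annotated blocks and blood samples)
site_counts = {
    "prostate": (7188, 9278, 46666),
    "breast": (2736, 3796, 9336),
    "melanoma": (1543, 1334, 2671),
}

# Sum each column across the three organ sites.
total_cases = sum(cases for cases, _, _ in site_counts.values())
total_specimens = sum(specimens for _, specimens, _ in site_counts.values())
total_blocks = sum(blocks for _, _, blocks in site_counts.values())

print("cases:", total_cases)
print("specimens:", total_specimens)
print("annotated blocks and blood samples:", total_blocks)
```

The case counts sum to the 11,467 patients reported, and the annotated blocks and blood samples sum to 58,673, consistent with the "over 58,000 annotated samples" in the abstract.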
Eisenhaber, Frank
2014-06-01
Remarkably, Singapore, one of today's hotspots for bioinformatics and computational biology research, emerged de novo from the pioneering efforts of engaged local individuals in the early 1990s; supported by increasing public funds from 1996 on, these efforts grew into the present vibrant research community. This article recalls the pioneers, their first successes and early institutional developments.
BIOINFORMATICS IN THE K-8 CLASSROOM: DESIGNING INNOVATIVE ACTIVITIES FOR TEACHER IMPLEMENTATION
Shuster, Michele; Claussen, Kira; Locke, Melly; Glazewski, Krista
2016-01-01
At the intersection of biology and computer science is the growing field of bioinformatics—the analysis of complex datasets of biological relevance. Despite the increasing importance of bioinformatics and associated practical applications, these are not standard topics in elementary and middle school classrooms. We report on a pilot project and its evolution to support implementation of bioinformatics-based activities in elementary and middle school classrooms. Specifically, we ultimately designed a multi-day summer teacher professional development workshop, in which teachers design innovative classroom activities. By focusing on teachers, our design leverages enhanced teacher knowledge and confidence to integrate innovative instructional materials into K-8 classrooms and contributes to capacity building in STEM instruction. PMID:27429860
Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity
NASA Astrophysics Data System (ADS)
Widodo
2015-02-01
Indonesia has enormous biodiversity, which may contain many biomaterials for pharmaceutical application. The potential of these resources should be explored to discover new drugs for human welfare. However, bioactivity screening using conventional methods is very expensive and time-consuming. Therefore, we developed a bioinformatics-based methodology for screening the potential of natural resources. The method rests on the fact that organisms in the same taxon tend to have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potential of biomaterials from Indonesian biodiversity by comparing species with a well-known taxon containing the active compound, using published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compound based on its molecular structure. The interactions of the target protein with other proteins in the cell are examined to determine the mechanism of action of the active compound at the cellular level, as well as to predict its side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anticancer agent from a marine invertebrate. The anti-cancer activity was explored based on compounds isolated from the marine invertebrate, as reported in published articles and databases; we then identified the protein target and performed molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate and examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells. 
Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity as a source of drugs and other pharmaceutical materials.
Microbial bioinformatics for food safety and production
Alkema, Wynand; Boekhorst, Jos; Wels, Michiel
2016-01-01
In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput ‘omics’ technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety. PMID:26082168
Navigating the changing learning landscape: perspective from bioinformatics.ca
Ouellette, B. F. Francis
2013-01-01
With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468
Bioinformatic training needs at a health sciences campus.
Oliver, Jeffrey C
2017-01-01
Health sciences research is increasingly focusing on big data applications, such as genomic technologies and precision medicine, to address key issues in human health. These approaches rely on biological data repositories and bioinformatic analyses, both of which are growing rapidly in size and scope. Libraries play a key role in supporting researchers in navigating these and other information resources. With the goal of supporting bioinformatics research in the health sciences, the University of Arizona Health Sciences Library established a Bioinformation program. To shape the support provided by the library, I developed and administered a needs assessment survey to the University of Arizona Health Sciences campus in Tucson, Arizona. The survey was designed to identify the training topics of interest to health sciences researchers and the preferred modes of training. Survey respondents expressed an interest in a broad array of potential training topics, including "traditional" information seeking as well as interest in analytical training. Of particular interest were training in transcriptomic tools and the use of databases linking genotypes and phenotypes. Staff were most interested in bioinformatics training topics, while faculty were the least interested. Hands-on workshops were significantly preferred over any other mode of training. The University of Arizona Health Sciences Library is meeting those needs through internal programming and external partnerships. The results of the survey demonstrate a keen interest in a variety of bioinformatic resources; the challenge to the library is how to address those training needs. The mode of support depends largely on library staff expertise in the numerous subject-specific databases and tools. Librarian-led bioinformatic training sessions provide opportunities for engagement with researchers at multiple points of the research life cycle. 
When training needs exceed library capacity, partnering with intramural and extramural units will be crucial in library support of health sciences bioinformatic research.
Towards a career in bioinformatics
2009-01-01
The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation (established 1998), was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the life sciences and systems biology, as well as clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in an undergraduate bioinformatics curriculum provides information on how to impart targeted education to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508
Towards a career in bioinformatics.
Ranganathan, Shoba
2009-12-03
The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation (established 1998), was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the life sciences and systems biology, as well as clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in an undergraduate bioinformatics curriculum provides information on how to impart targeted education to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010.
NASA Astrophysics Data System (ADS)
Feodorova, Valentina A.; Saltykov, Yury V.; Zaytsev, Sergey S.; Ulyanov, Sergey S.; Ulianova, Onega V.
2018-04-01
The method of phase-shifting speckle interferometry has been used as a new, highly promising tool for modern bioinformatics. Virtual phase-shifting speckle interferometry has been applied to the detection of polymorphism in the Chlamydia trachomatis omp1 gene. It has been shown that the suggested method is very sensitive to natural genetic mutations such as single nucleotide polymorphisms (SNPs). The effectiveness of the proposed method has been compared with that of the newest bioinformatic tools based on nucleotide sequence alignment.
Lipidomics informatics for life-science.
Schwudke, D; Shevchenko, A; Hoffmann, N; Ahrends, R
2017-11-10
Lipidomics encompasses analytical approaches that aim to identify and quantify the complete set of lipids, defined as the lipidome, in a given cell, tissue or organism, as well as their interactions with other molecules. The majority of lipidomics workflows are based on mass spectrometry, which has proven to be a powerful tool in systems biology in concert with other omics disciplines. Unfortunately, bioinformatics infrastructures for this relatively young discipline are accessible only to some specialists. Search engines, quantification algorithms, visualization tools and databases developed by the 'Lipidomics Informatics for Life-Science' (LIFS) partners will be restructured and standardized to provide broad access to these specialized bioinformatics pipelines. There are many medical challenges related to lipid metabolic alterations whose investigation will be fostered by the capacity building proposed by LIFS. LIFS, as a member of the 'German Network for Bioinformatics' (de.NBI) node for 'Bioinformatics for Proteomics' (BioInfra.Prot), will provide access to the described software as well as to tutorials and consulting services via a unified web portal.
Application of bioinformatics in chronobiology research.
Lopes, Robson da Silva; Resende, Nathalia Maria; Honorio-França, Adenilda Cristina; França, Eduardo Luzía
2013-01-01
Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through "omics" projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.
Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.
Krampis, Konstantinos; Booth, Tim; Chapman, Brad; Tiwari, Bela; Bicak, Mesude; Field, Dawn; Nelson, Karen E
2012-03-19
A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. 
An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.
Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community
2012-01-01
Background: A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Results: Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Conclusions: Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. 
An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them. PMID:22429538
Wilson, Justin; Dai, Manhong; Jakupovic, Elvis; Watson, Stanley; Meng, Fan
2007-01-01
Modern video cards and game consoles typically have much better performance to price ratios than that of general purpose CPUs. The parallel processing capabilities of game hardware are well-suited for high throughput biomedical data analysis. Our initial results suggest that game hardware is a cost-effective platform for some computationally demanding bioinformatics problems.
The Web as an educational tool for/in learning/teaching bioinformatics statistics.
Oliver, J; Pisano, M E; Alonso, T; Roca, P
2005-12-01
Statistics provides essential tools in bioinformatics for interpreting the results of a database search and for managing the enormous amounts of information produced by genomics, proteomics and metabolomics. The goal of this project was to develop a software tool, as simple as possible, that demonstrates the use of bioinformatics statistics. Computer Simulation Methods (CSMs) developed using Microsoft Excel were chosen for their broad range of applications, immediate and easy formula calculation, immediate testing, easy graphical representation, and general use and acceptance by the scientific community. The result of these endeavours is a set of utilities which can be accessed from the following URL: http://gmein.uib.es/bioinformatica/statistics. When the tool was tested on students who had previously completed coursework using traditional statistical teaching methods, the overall consensus was that Web-based instruction had numerous advantages, but that traditional methods with manual calculations were also needed for theory and practice. Once students had mastered the basic statistical formulas, Excel spreadsheets and graphics proved very useful for trying many parameters rapidly without having to perform tedious calculations. CSMs will be of great importance for the training of students and professionals in the field of bioinformatics, and for upcoming applications in self-learning and continuing education.
Suh, K. Stephen; Sarojini, Sreeja; Youssif, Maher; Nalley, Kip; Milinovikj, Natasha; Elloumi, Fathi; Russell, Steven; Pecora, Andrew; Schecter, Elyssa; Goy, Andre
2013-01-01
Personalized medicine promises patient-tailored treatments that enhance patient care and decrease overall treatment costs by focusing on genetics and “-omics” data obtained from patient biospecimens and records to guide therapy choices that generate good clinical outcomes. The approach relies on diagnostic and prognostic use of novel biomarkers discovered through combinations of tissue banking, bioinformatics, and electronic medical records (EMRs). The analytical power of bioinformatic platforms combined with patient clinical data from EMRs can reveal potential biomarkers and clinical phenotypes that allow researchers to develop experimental strategies using selected patient biospecimens stored in tissue banks. For cancer, high-quality biospecimens collected at diagnosis, first relapse, and various treatment stages provide crucial resources for study designs. To enlarge biospecimen collections, patient education regarding the value of specimen donation is vital. One approach for increasing consent is to offer publically available illustrations and game-like engagements demonstrating how wider sample availability facilitates development of novel therapies. The critical value of tissue bank samples, bioinformatics, and EMR in the early stages of the biomarker discovery process for personalized medicine is often overlooked. The data obtained also require cross-disciplinary collaborations to translate experimental results into clinical practice and diagnostic and prognostic use in personalized medicine. PMID:23818899
Composable languages for bioinformatics: the NYoSh experiment
Simi, Manuele; Campagne, Fabien
2014-01-01
Language WorkBenches (LWBs) are software engineering tools that help domain experts develop solutions to various classes of problems. Some of these tools focus on non-technical users and provide languages to help organize knowledge while other workbenches provide means to create new programming languages. A key advantage of language workbenches is that they support the seamless composition of independently developed languages. This capability is useful when developing programs that can benefit from different levels of abstraction. We reasoned that language workbenches could be useful to develop bioinformatics software solutions. In order to evaluate the potential of language workbenches in bioinformatics, we tested a prominent workbench by developing an alternative to shell scripting. To illustrate what LWBs and Language Composition can bring to bioinformatics, we report on our design and development of NYoSh (Not Your ordinary Shell). NYoSh was implemented as a collection of languages that can be composed to write programs as expressive and concise as shell scripts. This manuscript offers a concrete illustration of the advantages and current minor drawbacks of using the MPS LWB. For instance, we found that we could implement an environment-aware editor for NYoSh that can assist the programmers when developing scripts for specific execution environments. This editor further provides semantic error detection and can be compiled interactively with an automatic build and deployment system. In contrast to shell scripts, NYoSh scripts can be written in a modern development environment, supporting context dependent intentions and can be extended seamlessly by end-users with new abstractions and language constructs. We further illustrate language extension and composition with LWBs by presenting a tight integration of NYoSh scripts with the GobyWeb system.
The NYoSh Workbench prototype, which implements a fully featured integrated development environment for NYoSh is distributed at http://nyosh.campagnelab.org. PMID:24482760
Application of bioinformatics tools and databases in microbial dehalogenation research (a review).
Satpathy, R; Konkimalla, V B; Ratha, J
2015-01-01
Microbial dehalogenation is a biochemical process in which halogenated substances are enzymatically converted into their non-halogenated forms. Microorganisms have a wide range of organohalogen degradation abilities, both specific and non-specific in nature. Most of these halogenated organic compounds are pollutants that need to be remediated; therefore, current approaches explore the potential of microbes at a molecular level for effective biodegradation of these substances. Several microorganisms with dehalogenation activity have been identified and characterized. In this respect, bioinformatics plays a key role in gaining deeper knowledge of dehalogenation. To facilitate data mining, many tools have been developed to annotate these data from databases. Therefore, with the discovery of a microorganism one can predict genes/proteins and perform sequence analysis, structural modelling, metabolic pathway analysis, biodegradation studies and so on. This review highlights various bioinformatics methods and describes the application of various databases and specific tools in microbial dehalogenation, with special focus on dehalogenase enzymes. Attempts have also been made to decipher some recent applications of in silico modelling methods, comprising gene finding, protein modelling, Quantitative Structure Biodegradability Relationship (QSBR) studies and reconstruction of metabolic pathways, employed in dehalogenation research.
SeqHound: biological sequence and structure database as a platform for bioinformatics research
2002-01-01
Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit. PMID:12401134
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Po-E; Lo, Chien -Chi; Anderson, Joseph J.
Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. As a result, this bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research.
Li, Po-E; Lo, Chien-Chi; Anderson, Joseph J.; Davenport, Karen W.; Bishop-Lilly, Kimberly A.; Xu, Yan; Ahmed, Sanaa; Feng, Shihai; Mokashi, Vishwesh P.; Chain, Patrick S.G.
2017-01-01
Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. This bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research. PMID:27899609
Li, Po-E; Lo, Chien -Chi; Anderson, Joseph J.; ...
2016-11-24
Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. As a result, this bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research.
Using Next-Generation Sequencing to Explore Genetics and Race in the High School Classroom
Yang, Xinmiao; Hartman, Mark R.; Harrington, Kristin T.; Etson, Candice M.; Fierman, Matthew B.; Slonim, Donna K.; Walt, David R.
2017-01-01
With the development of new sequencing and bioinformatics technologies, concepts relating to personal genomics play an increasingly important role in our society. To promote interest and understanding of sequencing and bioinformatics in the high school classroom, we developed and implemented a laboratory-based teaching module called “The Genetics of Race.” This module uses the topic of race to engage students with sequencing and genetics. In the experimental portion of this module, students isolate their own mitochondrial DNA using standard biotechnology techniques and collect next-generation sequencing data to determine which of their classmates are most and least genetically similar to themselves. We evaluated the efficacy of this module by administering a pretest/posttest evaluation to measure student knowledge related to sequencing and bioinformatics, and we also conducted a survey at the conclusion of the module to assess student attitudes. Upon completion of our Genetics of Race module, students demonstrated significant learning gains, with lower-performing students obtaining the highest gains, and developed more positive attitudes toward scientific research. PMID:28408407
Bioinformatic approaches to augment study of epithelial-to-mesenchymal transition in lung cancer
Beck, Tim N.; Chikwem, Adaeze J.; Solanki, Nehal R.
2014-01-01
Bioinformatic approaches are intended to provide systems level insight into the complex biological processes that underlie serious diseases such as cancer. In this review we describe current bioinformatic resources, and illustrate how they have been used to study a clinically important example: epithelial-to-mesenchymal transition (EMT) in lung cancer. Lung cancer is the leading cause of cancer-related deaths and is often diagnosed at advanced stages, leading to limited therapeutic success. While EMT is essential during development and wound healing, pathological reactivation of this program by cancer cells contributes to metastasis and drug resistance, both major causes of death from lung cancer. Challenges of studying EMT include its transient nature, its molecular and phenotypic heterogeneity, and the complicated networks of rewired signaling cascades. Given the biology of lung cancer and the role of EMT, it is critical to better align the two in order to advance the impact of precision oncology. This task relies heavily on the application of bioinformatic resources. Besides summarizing recent work in this area, we use four EMT-associated genes, TGF-β (TGFB1), NEDD9/HEF1, β-catenin (CTNNB1) and E-cadherin (CDH1), as exemplars to demonstrate the current capacities and limitations of probing bioinformatic resources to inform hypothesis-driven studies with therapeutic goals. PMID:25096367
A label distance maximum-based classifier for multi-label learning.
Liu, Xiaoli; Bao, Hang; Zhao, Dazhe; Cao, Peng
2015-01-01
Multi-label classification is useful in many bioinformatics tasks such as gene function prediction and protein site localization. This paper presents an improved neural network algorithm, Max Label Distance Back Propagation Algorithm for Multi-Label Classification. The method was formulated by modifying the total error function of the standard BP by adding a penalty term, which was realized by maximizing the distance between the positive and negative labels. Extensive experiments were conducted to compare this method against state-of-the-art multi-label methods on three popular bioinformatic benchmark datasets. The results illustrated that this proposed method is more effective for bioinformatic multi-label classification compared to commonly used techniques.
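The abstract above describes the modified error function only in words: a standard back-propagation error plus a penalty term that pushes positive-label and negative-label outputs apart. The following is a minimal sketch of that idea, not the authors' formulation. The function name `label_distance_loss`, the `penalty_weight` parameter, and the choice of "mean positive output minus mean negative output" as the distance are all assumptions made for illustration.

```python
import numpy as np


def label_distance_loss(outputs, targets, penalty_weight=0.1):
    """Squared error minus a weighted gap between positive- and negative-label outputs.

    Hypothetical illustration: the distance measure and weighting are assumed,
    not taken from the paper.
    """
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)

    # Standard sum-of-squares error used in plain back propagation.
    squared_error = 0.5 * np.sum((outputs - targets) ** 2)

    positive = outputs[targets == 1]
    negative = outputs[targets == 0]
    if positive.size == 0 or negative.size == 0:
        # No gap can be measured when one label group is empty.
        return squared_error

    # Subtracting the gap means that minimising the loss maximises the distance.
    label_distance = positive.mean() - negative.mean()
    return squared_error - penalty_weight * label_distance
```

Under these assumptions, a prediction that separates the two label groups more widely receives a lower loss than one with the same ranking but a smaller gap.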
Challenges of Identifying Clinically Actionable Genetic Variants for Precision Medicine
2016-01-01
Advances in genomic medicine have the potential to change the way we treat human disease, but translating these advances into reality for improving healthcare outcomes depends essentially on our ability to discover disease- and/or drug-associated clinically actionable genetic mutations. Integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a big data infrastructure can provide an efficient and effective way to identify clinically actionable genetic variants for personalized treatments and reduce healthcare costs. We review bioinformatics processing of next-generation sequencing (NGS) data, bioinformatics infrastructures for implementing precision medicine, and bioinformatics approaches for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs. PMID:27195526
Pathway mapping and development of disease-specific biomarkers: protein-based network biomarkers
Chen, Hao; Zhu, Zhitu; Zhu, Yichun; Wang, Jian; Mei, Yunqing; Cheng, Yunfeng
2015-01-01
It is known that a disease is rarely a consequence of an abnormality of a single gene, but reflects the interactions of various processes in a complex network. Annotated molecular networks offer new opportunities to understand diseases within a systems biology framework and provide an excellent substrate for network-based identification of biomarkers. The network biomarkers and dynamic network biomarkers (DNBs) represent new types of biomarkers with protein–protein or gene–gene interactions that can be monitored and evaluated at different stages and time-points during development of disease. Clinical bioinformatics as a new way to combine clinical measurements and signs with human tissue-generated bioinformatics is crucial to translate biomarkers into clinical application, validate the disease specificity, and understand the role of biomarkers in clinical settings. In this article, the recent advances and developments on network biomarkers and DNBs are comprehensively reviewed. Also discussed are how network biomarkers improve understanding of the molecular mechanisms of diseases, the advantages and constraints of network biomarkers for clinical application, clinical bioinformatics as a bridge to the development of disease-specific, stage-specific, severity-specific and therapy-predictive biomarkers, and the potential of network biomarkers. PMID:25560835
Seto, Jason; Walsh, Michael P.; Mahadevan, Padmanabhan; Zhang, Qiwei; Seto, Donald
2010-01-01
Technological advances and increasingly cost-effective methodologies in DNA sequencing and computational analysis are providing genome and proteome data for human adenovirus research. Applying these tools, data and derived knowledge to the development of vaccines against these pathogens will provide effective prophylactics. The same data and approaches can be applied to vector development for gene delivery in gene therapy and vaccine delivery protocols. Examination of several field strain genomes and their analyses provide examples of data that are available using these approaches. An example of the development of HAdV-B3 both as a vaccine and also as a vector is presented. PMID:21994597
Stevens, David Cole; Conway, Kyle R.; Pearce, Nelson; Villegas-Peñaranda, Luis Roberto; Garza, Anthony G.; Boddy, Christopher N.
2013-01-01
Background Heterologous expression of bacterial biosynthetic gene clusters is currently an indispensable tool for characterizing biosynthetic pathways. Development of an effective, general heterologous expression system that can be applied to bioprospecting from metagenomic DNA will enable the discovery of a wealth of new natural products. Methodology We have developed a new Escherichia coli-based heterologous expression system for polyketide biosynthetic gene clusters. We have demonstrated that over-expression of the alternative sigma factor σ54 directly and positively regulates heterologous expression of the oxytetracycline biosynthetic gene cluster in E. coli. Bioinformatics analysis indicates that σ54 promoters are present in nearly 70% of polyketide and non-ribosomal peptide biosynthetic pathways. Conclusions We have demonstrated a new mechanism for heterologous expression of the oxytetracycline polyketide biosynthetic pathway, where high-level pleiotropic sigma factors from the heterologous host directly and positively regulate transcription of the non-native biosynthetic gene cluster. Our bioinformatics analysis is consistent with the hypothesis that heterologous expression mediated by the alternative sigma factor σ54 may be a viable method for the production of additional polyketide products. PMID:23724102
The carbohydrate sequence markup language (CabosML): an XML description of carbohydrate structures.
Kikuchi, Norihiro; Kameyama, Akihiko; Nakaya, Shuuichi; Ito, Hiromi; Sato, Takashi; Shikanai, Toshihide; Takahashi, Yoriko; Narimatsu, Hisashi
2005-04-15
Bioinformatics resources for glycomics are very poor as compared with those for genomics and proteomics. The complexity of carbohydrate sequences makes it difficult to define a common language to represent them, and the development of bioinformatics tools for glycomics has not progressed. In this study, we developed a carbohydrate sequence markup language (CabosML), an XML description of carbohydrate structures. The language definition (XML Schema) and an experimental database of carbohydrate structures using an XML database management system are available at http://www.phoenix.hydra.mki.co.jp/CabosDemo.html. Contact: kikuchi@hydra.mki.co.jp.
Building international genomics collaboration for global health security
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cui, Helen H.; Erkkila, Tracy; Chain, Patrick S. G.; ...
2015-12-07
Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement, and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries.
Soulet, Fabienne; Kilarski, Witold W.; Roux-Dalvai, Florence; Herbert, John M. J.; Sacewicz, Izabela; Mouton-Barbosa, Emmanuelle; Bicknell, Roy; Lalor, Patricia; Monsarrat, Bernard; Bikfalvi, Andreas
2013-01-01
In order to map the extracellular or membrane proteome associated with the vasculature and the stroma in an embryonic organism in vivo, we developed a biotinylation technique for chicken embryo and combined it with mass spectrometry and bioinformatic analysis. We also applied this procedure to implanted tumors growing on the chorioallantoic membrane or after the induction of granulation tissue. Membrane and extracellular matrix proteins were the most abundant components identified. Relative quantitative analysis revealed differential protein expression patterns in several tissues. Through a bioinformatic approach, we determined endothelial cell protein expression signatures, which allowed us to identify several proteins not yet reported to be associated with endothelial cells or the vasculature. This is the first study reported so far that applies in vivo biotinylation, in combination with robust label-free quantitative proteomics approaches and bioinformatic analysis, to an embryonic organism. It also provides the first description of the vascular and matrix proteome of the embryo that might constitute the starting point for further developments. PMID:23674615
Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba
2016-12-22
The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzhen, China, September 20-22, 2017.
NASA Astrophysics Data System (ADS)
Serve, Kinta M.
Part I. Pleural fibrosis, a non-malignant, asbestos-related respiratory disease characterized by excessive collagen deposition, is progressive, debilitating, and potentially fatal. Disease severity may be influenced by the type of asbestos fiber inhaled, with Libby amphibole (LA) a seemingly more potent mediator of pleural fibrosis than chrysotile (CH) asbestos. This difference in severity may be due to the reported immunological component associated with LA but not CH related diseases. Here, we report the potential mechanisms by which asbestos-associated mesothelial cell autoantibodies (MCAAs) contribute to pleural fibrosis development. MCAAs are shown to bind cultured human pleural mesothelial cells and induce the deposition of type I collagen proteins in the absence of phenotypic changes typically associated with fibrosis development. However, additional extracellular proteins seem to differentially contribute to LA and CH MCAA-associated collagen deposition. Our data also suggest that IgG subclass distributions differ between LA and CH MCAAs, potentially altering the antibody effector functions. Differences in MCAA mechanisms of action and effector functions may help explain the disparate clinical disease phenotypes noted between LA and CH-exposed populations and may provide insights for development of novel therapeutic strategies. Part II. As scientific research becomes increasingly reliant on computational tools, it is more important than ever before to train students to use these tools. While educators agree that biology students should gain experience with bioinformatics, there exists no consensus as to how to integrate these concepts into the already demanding undergraduate curriculum. The Portal-21 project offers a solution by utilizing on-line learning case studies to allow flexibility for classroom integration. 
Presented here are the results from two field tests of a case study developed to introduce the common bioinformatics tools pBLAST and PubMed to undergraduate students while reinforcing concepts of protein function. Data suggest positive gains in student learning and confidence with using bioinformatics tools following use of the case study. These results indicate that on-line case studies are a useful tool for introducing bioinformatics into undergraduate classrooms.
Open discovery: An integrated live Linux platform of Bioinformatics tools.
Vetrivel, Umashankar; Pilla, Kalabharath
2008-01-01
Historically, live Linux distributions for bioinformatics have paved the way for portability of the bioinformatics workbench in a platform-independent manner. However, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, Open Discovery, a live Linux distribution, has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with a complete sequence analysis environment and is capable of running Windows executable programs in a Linux environment. Open Discovery is an advanced customizable configuration of Fedora, with data persistence accessible via USB drive or DVD. Open Discovery is distributed free under the Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in.
Trace Elements and Healthcare: A Bioinformatics Perspective.
Zhang, Yan
2017-01-01
Biological trace elements are essential for human health. Imbalance in trace element metabolism and homeostasis may play an important role in a variety of diseases and disorders. While the majority of previous research has focused on experimental verification of genes involved in trace element metabolism and those encoding trace element-dependent proteins, bioinformatics studies of trace elements are relatively rare and still at an early stage. This chapter offers an overview of recent progress in bioinformatics analyses of trace element utilization, metabolism, and function, especially comparative genomics of several important metals. The relationship between individual elements and several diseases based on recent large-scale systematic studies such as genome-wide association studies and case-control studies is discussed. Lastly, developments in ionomics and its recent applications in human health are also introduced.
Advances in Omics and Bioinformatics Tools for Systems Analyses of Plant Functions
Mochida, Keiichi; Shinozaki, Kazuo
2011-01-01
Omics and bioinformatics are essential to understanding the molecular systems that underlie various plant functions. Recent game-changing sequencing technologies have revitalized sequencing approaches in genomics and have produced opportunities for various emerging analytical applications. Driven by technological advances, several new omics layers such as the interactome, epigenome and hormonome have emerged. Furthermore, in several plant species, the development of omics resources has progressed to address particular biological properties of individual species. Integration of knowledge from omics-based research is an emerging issue as researchers seek to identify significance, gain biological insights and promote translational research. From these perspectives, we provide this review of the emerging aspects of plant systems research based on omics and bioinformatics analyses together with their associated resources and technological advances. PMID:22156726
Personalized medicine: challenges and opportunities for translational bioinformatics
Overby, Casey Lynnette; Tarczy-Hornoch, Peter
2013-01-01
Personalized medicine can be defined broadly as a model of healthcare that is predictive, personalized, preventive and participatory. Two US President’s Council of Advisors on Science and Technology reports illustrate challenges in personalized medicine (in a 2008 report) and in use of health information technology (in a 2010 report). Translational bioinformatics is a field that can help address these challenges and is defined by the American Medical Informatics Association as “the development of storage, analytic and interpretive methods to optimize the transformation of increasing voluminous biomedical data into proactive, predictive, preventative and participatory health.” This article discusses barriers to implementing genomics applications and current progress toward overcoming barriers, describes lessons learned from early experiences of institutions engaged in personalized medicine and provides example areas for translational bioinformatics research inquiry. PMID:24039624
A case study of tuning MapReduce for efficient Bioinformatics in the cloud
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Lizhen; Wang, Zhong; Yu, Weikuan
The combination of the Hadoop MapReduce programming model and cloud computing allows biological scientists to analyze next-generation sequencing (NGS) data in a timely and cost-effective manner. Cloud computing platforms remove the burden of IT facility procurement and management from end users and provide ease of access to Hadoop clusters. However, biological scientists are still expected to choose appropriate Hadoop parameters for running their jobs. More importantly, the available Hadoop tuning guidelines are either obsolete or too general to capture the particular characteristics of bioinformatics applications. In this paper, we aim to minimize the cloud computing cost spent on bioinformatics data analysis by optimizing the extracted significant Hadoop parameters. When using MapReduce-based bioinformatics tools in the cloud, the default settings often lead to resource underutilization and wasteful expenses. We choose k-mer counting, a representative application used in a large number of NGS data analysis tools, as our study case. Experimental results show that, with the fine-tuned parameters, we achieve a total of 4× speedup compared with the original performance (using the default settings). Finally, this paper presents an exemplary case for tuning MapReduce-based bioinformatics applications in the cloud, and documents the key parameters that could lead to significant performance benefits.
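The study case named in the abstract above, k-mer counting, maps naturally onto the MapReduce model. The sketch below shows the map, shuffle and reduce phases in plain single-process Python rather than on Hadoop, so it illustrates the programming model only and says nothing about the tuning parameters the paper studies. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map phase: emit a (k-mer, 1) pair for every window of length k in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for kmer, value in pairs:
        grouped[kmer].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce phase: sum the values collected for each k-mer."""
    return {kmer: sum(values) for kmer, values in grouped.items()}


def count_kmers(reads, k):
    """Run the three phases in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))
```

For example, `count_kmers(["ACGT", "CGTA"], 2)` counts `CG` and `GT` twice and `AC` and `TA` once. In a real cluster the map and reduce calls would run on separate workers, which is where parameter choices start to affect cost.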
A primer to frequent itemset mining for bioinformatics
Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart
2015-01-01
Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences. PMID:24162173
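As a rough illustration of the co-occurrence idea described above, the sketch below performs a level-wise search in the spirit of the Apriori algorithm. It is not taken from the primer: the function name, the support threshold and the toy "gene" transactions are assumptions chosen for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions.

    Level-wise search: candidates of size k+1 are built only from
    frequent itemsets of size k.
    """
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge pairs of frequent k-itemsets that differ by one item.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == len(a) + 1}
        # Prune step: keep a candidate only if all of its k-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, len(c) - 1))
        }
    return frequent


# Hypothetical data: each set lists genes observed together in one sample.
samples = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneB"},
    {"geneA", "geneC"},
    {"geneB", "geneC"},
    {"geneA", "geneB", "geneC"},
]
result = frequent_itemsets(samples, min_support=3)
```

With these toy samples every single gene and every pair reaches the threshold, while the full triple appears in only two samples and is therefore excluded.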
Best practices in bioinformatics training for life scientists.
Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K
2013-09-01
The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.
Phylogenetic trees in bioinformatics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burr, Tom L
2008-01-01
Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
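The remark above about the huge number of possible trees can be made concrete with the standard counting formulas for strictly bifurcating, leaf-labeled trees: (2n-5)!! unrooted topologies and (2n-3)!! rooted topologies for n OTUs. The sketch below simply evaluates those formulas; the function names are chosen for this illustration and do not come from the reviewed paper.

```python
def double_factorial(m):
    """Product m * (m - 2) * (m - 4) * ... down to 1 or 2; equals 1 for m <= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def count_unrooted_trees(n_otus):
    """Number of unrooted binary tree topologies on n labeled leaves: (2n - 5)!!."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs for an unrooted binary tree")
    return double_factorial(2 * n_otus - 5)


def count_rooted_trees(n_otus):
    """Number of rooted binary tree topologies on n labeled leaves: (2n - 3)!!."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs for a rooted binary tree")
    return double_factorial(2 * n_otus - 3)


unrooted_10 = count_unrooted_trees(10)  # 15!! = 2,027,025
rooted_10 = count_rooted_trees(10)      # 17!! = 34,459,425
```

Even ten OTUs already give more than two million unrooted topologies, which is why exhaustive search is impractical and heuristic tree search is the norm.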
Best practices in bioinformatics training for life scientists
Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D.; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L.; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C.; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrönen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K.
2013-01-01
The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists. PMID:23803301
Web-based services for drug design and discovery.
Frey, Jeremy G; Bird, Colin L
2011-09-01
Reviews of the development of drug discovery through the 20th century recognised the importance of chemistry and increasingly bioinformatics, but had relatively little to say about the importance of computing and networked computing in particular. However, the design and discovery of new drugs is arguably the most significant single application of bioinformatics and cheminformatics to have benefitted from the increases in the range and power of the computational techniques since the emergence of the World Wide Web, commonly now referred to as simply 'the Web'. Web services have enabled researchers to access shared resources and to deploy standardized calculations in their search for new drugs. This article first considers the fundamental principles of Web services and workflows, and then explores the facilities and resources that have evolved to meet the specific needs of chem- and bio-informatics. This strategy leads to a more detailed examination of the basic components that characterise molecules and the essential predictive techniques, followed by a discussion of the emerging networked services that transcend the basic provisions, and the growing trend towards embracing modern techniques, in particular the Semantic Web. In the opinion of the authors, the issues that require community action are: increasing the amount of chemical data available for open access; validating the data as provided; and developing more efficient links between the worlds of cheminformatics and bioinformatics. The goal is to create ever better drug design services.
Rahpeyma, Mehdi; Fotouhi, Fatemeh; Makvandi, Manouchehr; Ghadiri, Ata; Samarbaf-Zadeh, Alireza
2015-11-01
Crimean-Congo hemorrhagic fever virus (CCHFV) is a member of the genus Nairovirus in the family Bunyaviridae and causes a life-threatening disease in humans. Currently, there is no vaccine against CCHFV, and the detailed structure of CCHFV proteins remains undefined. The CCHFV M RNA segment encodes two viral surface glycoproteins known as Gn and Gc. Viral glycoproteins can be considered key targets for vaccine development. The current study aimed to investigate the structural bioinformatics of the CCHFV Gn protein and to design a construct for generating a recombinant bacmid for expression in the baculovirus system, so that the Gn protein can be expressed in insect cells and used as an antigen in animal-model vaccine studies. Bioinformatic analysis of the CCHFV Gn protein was performed, and a construct was designed, cloned into the pFastBacHTb vector, and used to generate a recombinant Gn-bacmid with the Bac-to-Bac system. The primary, secondary, and 3D structures of CCHFV Gn were obtained, and PCR with M13 forward and reverse primers confirmed the generation of recombinant bacmid DNA harboring the Gn coding region under the polyhedrin promoter. Characterization of the detailed structure of CCHFV Gn by bioinformatics software provides the basis for the development of new experiments and for the construction of a recombinant bacmid harboring CCHFV Gn, which is valuable for designing a recombinant vaccine against deadly pathogens like CCHFV.
A Survey of Bioinformatics Database and Software Usage through Mining the Literature.
Duck, Geraint; Nenadic, Goran; Filannino, Michele; Brass, Andy; Robertson, David L; Stevens, Robert
2016-01-01
Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differ between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT), though some are instead seeing rapid growth (e.g., the GO, R). We find a striking imbalance in resource usage, with the top 5% of resource names (133 names) accounting for 47% of total usage, and over 70% of resources extracted being mentioned only once each. While these results highlight the dynamic and creative nature of bioinformatics research, they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.
BioXSD: the common data-exchange format for everyday bioinformatics web services
Kalaš, Matúš; Puntervoll, Pål; Joseph, Alexandre; Bartaševičiūtė, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge
2010-01-01
Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. Results: BioXSD has been developed as a candidate for standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source codes in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community. Contact: matus.kalas@bccs.uib.no; developers@bioxsd.org; support@bioxsd.org PMID:20823319
KDE Bioscience: platform for bioinformatics analysis workflows.
Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue
2006-08-01
Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards, combined with unfriendly interfaces, makes it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a Java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or third-party viewers are built in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism, other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research.
Medical libraries, bioinformatics, and networked information: a coming convergence?
Lynch, C
1999-10-01
Libraries will be changed by technological and social developments that are fueled by information technology, bioinformatics, and networked information. Libraries in highly focused settings such as the health sciences are at a pivotal point in their development as the synthesis of historically diverse and independent information sources transforms health care institutions. Boundaries are breaking down between published literature and research data, between research databases and clinical patient data, and between consumer health information and professional literature. This paper focuses on the dynamics that are occurring with networked information sources and the roles that libraries will need to play in the world of medical informatics in the early twenty-first century.
Open source tools and toolkits for bioinformatics: significance, and where are we?
Stajich, Jason E; Lapp, Hilmar
2006-09-01
This review summarizes important work in open-source bioinformatics software that has occurred over the past couple of years. The survey is intended to illustrate how programs and toolkits whose source code has been developed or released under an Open Source license have changed informatics-heavy areas of life science research. Rather than creating a comprehensive list of all tools developed over the last 2-3 years, we use a few selected projects encompassing toolkit libraries, analysis tools, data analysis environments and interoperability standards to show how freely available and modifiable open-source software can serve as the foundation for building important applications, analysis workflows and resources.
Li, Po-E; Lo, Chien-Chi; Anderson, Joseph J; Davenport, Karen W; Bishop-Lilly, Kimberly A; Xu, Yan; Ahmed, Sanaa; Feng, Shihai; Mokashi, Vishwesh P; Chain, Patrick S G
2017-01-09
Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. This bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Macedo, Rita; Nunes, Alexandra; Portugal, Isabel; Duarte, Sílvia; Vieira, Luís; Gomes, João Paulo
2018-05-01
Whole-genome sequencing (WGS)-based bioinformatics platforms for the rapid prediction of resistance will soon be implemented in the Tuberculosis (TB) laboratory, but their accuracy assessment still needs to be strengthened. Here, we fully-sequenced a total of 54 multidrug-resistant (MDR) and five susceptible TB strains and performed, for the first time, a simultaneous evaluation of the major four free online platforms (TB Profiler, PhyResSE, Mykrobe Predictor and TGS-TB). Overall, the sensitivity of resistance prediction ranged from 84.3% using Mykrobe predictor to 95.2% using TB profiler, while specificity was higher and homogeneous among platforms. TB profiler revealed the best performance robustness (sensitivity, specificity, PPV and NPV above 95%), followed by TGS-TB (all parameters above 90%). We also observed a few discrepancies between phenotype and genotype, where, in some cases, it was possible to pin-point some "candidate" mutations (e.g., in the rpsL promoter region) highlighting the need for their confirmation through mutagenesis assays and potential review of the anti-TB genetic databases. The rampant development of the bioinformatics algorithms and the tremendously reduced time-frame until the clinician may decide for a definitive and most effective treatment will certainly trigger the technological transition where WGS-based bioinformatics platforms could replace phenotypic drug susceptibility testing for TB. Copyright © 2018 Elsevier Ltd. All rights reserved.
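The four performance measures quoted above (sensitivity, specificity, PPV and NPV) all derive from a 2×2 comparison of predicted against phenotypic resistance. The sketch below shows the standard definitions; the function name and the counts in the example are invented for illustration and are not data from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic accuracy measures from confusion-matrix counts.

    tp: resistant by phenotype and predicted resistant
    fp: susceptible by phenotype but predicted resistant
    tn: susceptible by phenotype and predicted susceptible
    fn: resistant by phenotype but predicted susceptible
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true resistance detected
        "specificity": tn / (tn + fp),  # share of true susceptibility detected
        "ppv": tp / (tp + fp),          # reliability of a "resistant" call
        "npv": tn / (tn + fn),          # reliability of a "susceptible" call
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

The function assumes every denominator is non-zero; a platform that makes no "resistant" calls at all, for instance, would need separate handling for PPV.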
Karim, Md Rezaul; Michel, Audrey; Zappa, Achille; Baranov, Pavel; Sahay, Ratnesh; Rebholz-Schuhmann, Dietrich
2017-04-16
Data workflow systems (DWFSs) enable bioinformatics researchers to combine components for data access and data analytics, and to share the final data analytics approach with their collaborators. Increasingly, such systems have to cope with large-scale data, such as full genomes (about 200 GB each), public fact repositories (about 100 TB of data) and 3D imaging data at even larger scales. As moving the data becomes cumbersome, the DWFS needs to embed its processes into a cloud infrastructure, where the data are already hosted. As the standardized public data play an increasingly important role, the DWFS needs to comply with Semantic Web technologies. This advancement to DWFS would reduce overhead costs and accelerate the progress in bioinformatics research based on large-scale data and public resources, as researchers would require less specialized IT knowledge for the implementation. Furthermore, the high data growth rates in bioinformatics research drive the demand for parallel and distributed computing, which then imposes a need for scalability and high-throughput capabilities onto the DWFS. As a result, requirements for data sharing and access to public knowledge bases suggest that compliance of the DWFS with Semantic Web standards is necessary. In this article, we will analyze the existing DWFS with regard to their capabilities toward public open data use as well as large-scale computational and human interface requirements. We untangle the parameters for selecting a preferable solution for bioinformatics research with particular consideration to using cloud services and Semantic Web technologies. Our analysis leads to research guidelines and recommendations toward the development of future DWFS for the bioinformatics research community. © The Author 2017. Published by Oxford University Press.
2007-03-08
with CD3D 50848 PAR1/UBE3A Prader–Willi syndrome chromosome region 1, GMCSFRalpha precursor, IL3Ralpha precursor (CD123) Brain development...intervention programs justifiable? Emerg. Infect. Dis. 3, 83–94. Liebel, U., Kindler, B., Pepperkok, R., 2004. ‘Harvester’: a fast meta search engine of human...protein resources. Bioinformatics 20, 1962–1963. Liebel, U., Kindler, B., Pepperkok, R., 2005. Bioinformatic “Harvester”: a search engine for genome
Developing sustainable software solutions for bioinformatics by the “Butterfly” paradigm
Ahmed, Zeeshan; Zeeshan, Saman; Dandekar, Thomas
2014-01-01
Software design and sustainable software engineering are essential for the long-term development of bioinformatics software. Typical challenges in an academic environment are short-term contracts, island solutions, pragmatic approaches and loose documentation. Upcoming new challenges are big data, complex data sets, software compatibility and rapid changes in data representation. Our approach to cope with these challenges consists of iterative intertwined cycles of development (“Butterfly” paradigm) for key steps in scientific software engineering. User feedback is valued as well as software planning in a sustainable and interoperable way. Tool usage should be easy and intuitive. A middleware supports a user-friendly Graphical User Interface (GUI) as well as independent database/tool development. We validated the approach on our own software development and compared the different design paradigms in various software solutions. PMID:25383181
Open discovery: An integrated live Linux platform of Bioinformatics tools
Vetrivel, Umashankar; Pilla, Kalabharath
2008-01-01
Historically, live Linux distributions for bioinformatics have paved the way for a portable, platform-independent bioinformatics workbench. However, most existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and lack data persistence. Hence, Open Discovery, a live Linux distribution, has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with a complete sequence analysis environment and is capable of running Windows executable programs in a Linux environment. Open Discovery offers an advanced, customizable configuration of Fedora, with data persistence accessible via USB drive or DVD. Availability: Open Discovery is distributed free under the Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in PMID:19238235
Crowdsourcing for bioinformatics
Good, Benjamin M.; Su, Andrew I.
2013-01-01
Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614
Tao, Yuan; Liu, Juan
2005-01-01
The Internet has shrunk the world in which we work and live, giving rise to the concept of an Earth Village in which people can communicate and collaborate even when thousands of miles apart. This paper describes a prototype, akin to an Earth Lab for bioinformatics, that uses a Web services framework to build a network architecture for bioinformatics research. The architecture lets biologists worldwide easily implement large, complex processes and effectively share and access computing resources and data, regardless of how heterogeneous the data formats are and how decentralized and distributed the resources are around the world. A small, simplified example scenario is then given to demonstrate the prototype.
Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers
Brazas, Michelle D.; Ouellette, B. F. Francis
2016-01-01
Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025
Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.
Brazas, Michelle D; Ouellette, B F Francis
2016-06-01
Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.
Bioinformatics research in the Asia Pacific: a 2007 update.
Ranganathan, Shoba; Gribskov, Michael; Tan, Tin Wee
2008-01-01
We provide a 2007 update on bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, set up in 1998. In 2002, APBioNet organized the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27-30, 2007 in Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides the scientific meeting in Hong Kong, the satellite events are a pre-conference training workshop in Hanoi, Vietnam and a post-conference workshop in Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region to global bioinformatics endeavours.
blastjs: a BLAST+ wrapper for Node.js.
Page, Martin; MacLean, Dan; Schudoma, Christian
2016-02-27
To cope with the ever-increasing amount of sequence data generated in the field of genomics, the demand for efficient and fast database searches that drive functional and structural annotation in both large- and small-scale genome projects is on the rise. The tools of the BLAST+ suite are the most widely employed bioinformatic method for these database searches. Recent trends in bioinformatics application development show an increasing number of JavaScript apps that are based on modern frameworks such as Node.js. Until now, there has been no way of using database searches with the BLAST+ suite from a Node.js codebase. We developed blastjs, a Node.js library that wraps the search tools of the BLAST+ suite and thus makes it easy to add significant functionality to any Node.js-based application. blastjs is a library that allows the incorporation of BLAST+ functionality into bioinformatics applications based on JavaScript and Node.js. The library was designed to be as user-friendly as possible and therefore requires only a minimal amount of code in the client application. The library is freely available under the MIT license at https://github.com/teammaclean/blastjs.
Tools and collaborative environments for bioinformatics research
Giugno, Rosalba; Pulvirenti, Alfredo
2011-01-01
Advanced research requires intensive interaction among a multitude of actors, often possessing different expertise and usually working at a distance from each other. The field of collaborative research aims to establish suitable models and technologies to properly support these interactions. In this article, we first present the reasons for an interest of Bioinformatics in this context by also suggesting some research domains that could benefit from collaborative research. We then review the principles and some of the most relevant applications of social networking, with a special attention to networks supporting scientific collaboration, by also highlighting some critical issues, such as identification of users and standardization of formats. We then introduce some systems for collaborative document creation, including wiki systems and tools for ontology development, and review some of the most interesting biological wikis. We also review the principles of Collaborative Development Environments for software and show some examples in Bioinformatics. Finally, we present the principles and some examples of Learning Management Systems. In conclusion, we try to devise some of the goals to be achieved in the short term for the exploitation of these technologies. PMID:21984743
AncestrySNPminer: A bioinformatics tool to retrieve and develop ancestry informative SNP panels
Amirisetty, Sushil; Khurana Hershey, Gurjit K.; Baye, Tesfaye M.
2012-01-01
A wealth of genomic information is available in public and private databases. However, this information is underutilized for uncovering population specific and functionally relevant markers underlying complex human traits. Given the huge amount of SNP data available from the annotation of human genetic variation, data mining is a faster and cost effective approach for investigating the number of SNPs that are informative for ancestry. In this study, we present AncestrySNPminer, the first web-based bioinformatics tool specifically designed to retrieve Ancestry Informative Markers (AIMs) from genomic data sets and link these informative markers to genes and ontological annotation classes. The tool includes an automated and simple “scripting at the click of a button” functionality that enables researchers to perform various population genomics statistical analyses methods with user friendly querying and filtering of data sets across various populations through a single web interface. AncestrySNPminer can be freely accessed at https://research.cchmc.org/mershalab/AncestrySNPminer/login.php. PMID:22584067
Emerging strengths in Asia Pacific bioinformatics.
Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee
2008-12-12
The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.
Emerging strengths in Asia Pacific bioinformatics
Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee
2008-01-01
The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20–23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts. PMID:19091008
Bio-Docklets: virtualization containers for single-step execution of NGS pipelines.
Kim, Baekdoo; Ali, Thahmina; Lijeron, Carlos; Afgan, Enis; Krampis, Konstantinos
2017-08-01
Processing of next-generation sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized postanalysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers toward seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. We present an approach for abstracting the complex data operations of multistep bioinformatics pipelines for NGS data analysis. As examples, we have deployed 2 pipelines for RNA sequencing and chromatin immunoprecipitation sequencing, preconfigured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint, and from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved using a "meta-script" that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface. The pipeline output is postprocessed by integration with the Visual Omics Explorer framework, providing interactive data visualizations that users can access through a web browser. Our goal is to enable easy access to NGS data analysis pipelines for nonbioinformatics experts on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Beyond end users, Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets. © The Authors 2017. Published by Oxford University Press.
Bio-Docklets: virtualization containers for single-step execution of NGS pipelines
Kim, Baekdoo; Ali, Thahmina; Lijeron, Carlos; Afgan, Enis
2017-01-01
Processing of next-generation sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized postanalysis visualization and data mining software. In order to address some of these challenges, developers have leveraged virtualization containers toward seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. We present an approach for abstracting the complex data operations of multistep bioinformatics pipelines for NGS data analysis. As examples, we have deployed 2 pipelines for RNA sequencing and chromatin immunoprecipitation sequencing, preconfigured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint, and from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved using a “meta-script” that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface. The pipeline output is postprocessed by integration with the Visual Omics Explorer framework, providing interactive data visualizations that users can access through a web browser. Our goal is to enable easy access to NGS data analysis pipelines for nonbioinformatics experts on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Beyond end users, Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets. PMID:28854616
Rahpeyma, Mehdi; Fotouhi, Fatemeh; Makvandi, Manouchehr; Ghadiri, Ata; Samarbaf-Zadeh, Alireza
2015-01-01
Background Crimean-Congo hemorrhagic fever virus (CCHFV) is a member of the genus Nairovirus in the Bunyaviridae family and causes a life-threatening disease in humans. Currently, there is no vaccine against CCHFV, and the detailed structure of CCHFV proteins remains undefined. The CCHFV M RNA segment encodes two viral surface glycoproteins known as Gn and Gc. Viral glycoproteins can be considered key targets for vaccine development. Objectives The current study aimed to investigate the structural bioinformatics of the CCHFV Gn protein and to design a construct for generating a recombinant bacmid for expression in the baculovirus system. Materials and Methods To express the Gn protein in insect cells for use as an antigen in animal-model vaccine studies, bioinformatic analysis of the CCHFV Gn protein was performed, and a construct was designed and cloned into the pFastBacHTb vector. A recombinant Gn-bacmid was then generated with the Bac-to-Bac system. Results The primary, secondary, and 3D structures of CCHFV Gn were obtained, and PCR with M13 forward and reverse primers confirmed the generation of recombinant bacmid DNA harboring the Gn coding region under the polyhedrin promoter. Conclusions Characterization of the detailed structure of CCHFV Gn by bioinformatics software provides the basis for the development of new experiments and the construction of a recombinant bacmid harboring CCHFV Gn, which is valuable for designing a recombinant vaccine against deadly pathogens like CCHFV. PMID:26862379
BioXSD: the common data-exchange format for everyday bioinformatics web services.
Kalas, Matús; Puntervoll, Pål; Joseph, Alexandre; Bartaseviciūte, Edita; Töpfer, Armin; Venkataraman, Prabakar; Pettifer, Steve; Bryne, Jan Christian; Ison, Jon; Blanchet, Christophe; Rapacki, Kristoffer; Jonassen, Inge
2010-09-15
The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer a programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. BioXSD has been developed as a candidate for a standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source code in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community.
Extending Asia Pacific bioinformatics into new realms in the "-omics" era.
Ranganathan, Shoba; Eisenhaber, Frank; Tong, Joo Chuan; Tan, Tin Wee
2009-12-03
The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation dating back to 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 7-11, 2009 at Biopolis, Singapore. Besides bringing together scientists from the field of bioinformatics in this region, InCoB has actively engaged clinicians and researchers from the area of systems biology, to facilitate greater synergy between these two groups. InCoB2009 followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India), Hong Kong and Taipei (Taiwan), with InCoB2010 scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. The Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and symposia on Clinical Bioinformatics (CBAS), the Singapore Symposium on Computational Biology (SYMBIO) and training tutorials were scheduled prior to the scientific meeting, and provided ample opportunity for in-depth learning and special interest meetings for educators, clinicians and students. We provide a brief overview of the peer-reviewed bioinformatics manuscripts accepted for publication in this supplement, grouped into thematic areas. In order to facilitate scientific reproducibility and accountability, we have, for the first time, introduced minimum information criteria for our publications, including compliance with a Minimum Information about a Bioinformatics Investigation (MIABi). As the regional research expertise in bioinformatics matures, we have delineated a minimum set of bioinformatics skills required for addressing the computational challenges of the "-omics" era.
NASA Astrophysics Data System (ADS)
Roche-Lima, Abiel; Thulasiram, Ruppa K.
2012-02-01
Finite automata in which each transition is augmented with an output label, in addition to the familiar input label, are called finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignment of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on computing conditional probabilities, using techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically costly to compute, and even more so when applied to bioinformatics, because the databases are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were carried out using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). The results show that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. A second experiment varied the precision parameter; in this case, too, the parallel algorithm gave smaller execution times. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster was varied. In this last experiment, speedup increased considerably as more threads were used; however, it converged for 16 or more threads.
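The definition that opens the abstract above (an automaton whose transitions carry an output label as well as an input label) can be made concrete with a minimal sketch. It is a deterministic, unweighted toy, not the conditional transducer or the parallel learning algorithm of the paper; the function name transduce and the complement example are illustrative assumptions.

```python
def transduce(transitions, start, accepting, symbols):
    """Run a deterministic finite-state transducer over an input sequence.

    transitions maps (state, input_symbol) -> (next_state, output_label).
    Returns the concatenated output labels, or None if the input is rejected.
    """
    state, output = start, []
    for sym in symbols:
        if (state, sym) not in transitions:
            return None  # no transition defined: reject
        state, label = transitions[(state, sym)]
        output.append(label)
    return "".join(output) if state in accepting else None


# Toy single-state transducer that maps each DNA base to its complement.
COMPLEMENT = {("q0", a): ("q0", b) for a, b in zip("ACGT", "TGCA")}

print(transduce(COMPLEMENT, "q0", {"q0"}, "GATTACA"))  # CTAATGT
```

A weighted transducer would additionally attach a probability or cost to each transition; learning those weights is the part the paper parallelizes.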
Lin, Jing; Bruni, Francesca M.; Fu, Zhiyan; Maloney, Jennifer; Bardina, Ludmilla; Boner, Attilio L.; Gimenez, Gustavo; Sampson, Hugh A.
2013-01-01
Background Peanut allergy is relatively common, typically permanent, and often severe. Double-blind, placebo-controlled food challenge is considered the gold standard for the diagnosis of food allergy–related disorders. However, the complexity and potential of double-blind, placebo-controlled food challenge to cause life-threatening allergic reactions affects its clinical application. A laboratory test that could accurately diagnose symptomatic peanut allergy would greatly facilitate clinical practice. Objective We sought to develop an allergy diagnostic method that could correctly predict symptomatic peanut allergy by using peptide microarray immunoassays and bioinformatic methods. Methods Microarray immunoassays were performed by using the sera from 62 patients (31 with symptomatic peanut allergy and 31 who had outgrown their peanut allergy or were sensitized but were clinically tolerant to peanut). Specific IgE and IgG4 binding to 419 overlapping peptides (15 mers, 3 offset) covering the amino acid sequences of Ara h 1, Ara h 2, and Ara h 3 were measured by using a peptide microarray immunoassay. Bioinformatic methods were applied for data analysis. Results Individuals with peanut allergy showed significantly greater IgE binding and broader epitope diversity than did peanut-tolerant individuals. No significant difference in IgG4 binding was found between groups. By using machine learning methods, 4 peptide biomarkers were identified and prediction models that can predict the outcome of double-blind, placebo-controlled food challenges with high accuracy were developed by using a combination of the biomarkers. Conclusions In this study, we developed a novel diagnostic approach that can predict peanut allergy with high accuracy by combining the results of a peptide microarray immunoassay and bioinformatic methods. Further studies are needed to validate the efficacy of this assay in clinical practice. PMID:22444503
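The "overlapping peptides (15 mers, 3 offset)" design in the Methods above can be sketched as a sliding window over a protein sequence. This is only an illustration under stated assumptions: the function name tile_peptides and the toy sequence are hypothetical, the sequence is not an Ara h allergen, and any residues left over at the C-terminus are simply not covered here, which may differ from how the actual array was designed.

```python
def tile_peptides(sequence, length=15, offset=3):
    """Return overlapping peptides of a fixed length, stepping by `offset` residues."""
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, offset)]


# Toy 24-residue sequence (hypothetical, for illustration only).
toy = "ACDEFGHIKLMNPQRSTVWYACDE"
for peptide in tile_peptides(toy):
    print(peptide)
```

With a 24-residue input, windows start at positions 0, 3, 6 and 9, so four 15-mers are produced and consecutive peptides share 12 residues.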
A comparison of common programming languages used in bioinformatics.
Fourment, Mathieu; Gillings, Michael R
2008-02-05
The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.
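For readers unfamiliar with the first benchmarked method, the following is a minimal Python sketch of the Sellers approximate-matching recurrence. It is not the benchmark code from the paper; the function name and the unit edit costs are assumptions. The dynamic programme is the usual edit-distance one, except that the cost of an empty pattern prefix is zero at every text position, so a match may start anywhere in the text.

```python
def sellers_best_match(pattern, text):
    """Return (distance, end) for the best approximate occurrence of pattern in text.

    distance is the minimum edit distance between pattern and any substring
    of text; end is the exclusive end index in text of the first such match.
    """
    m = len(pattern)
    prev = list(range(m + 1))  # column for the empty text prefix
    best = (prev[m], 0)
    for j, t in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] == 0: a match may start at any position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == t else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra text character
                         cur[i - 1] + 1)      # missing pattern character
        if cur[m] < best[0]:
            best = (cur[m], j)
        prev = cur
    return best


print(sellers_best_match("ACGT", "TTACGTTT"))  # (0, 6)
```

Only two columns are kept in memory, so the sketch uses space proportional to the pattern length and time proportional to the product of pattern and text lengths.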
Interoperability of GADU in using heterogeneous Grid resources for bioinformatics applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sulakhe, D.; Rodriguez, A.; Wilde, M.
2008-03-01
Bioinformatics tools used for efficient and computationally intensive analysis of genetic sequences require large-scale computational resources to accommodate the growing data. Grid computational resources such as the Open Science Grid and TeraGrid have proved useful for scientific discovery. The genome analysis and database update system (GADU) is a high-throughput computational system developed to automate the steps involved in accessing the Grid resources for running bioinformatics applications. This paper describes the requirements for building an automated scalable system such as GADU that can run jobs on different Grids. The paper describes the resource-independent configuration of GADU using the Pegasus-based virtual data system that makes high-throughput computational tools interoperable on heterogeneous Grid resources. The paper also highlights the features implemented to make GADU a gateway to computationally intensive bioinformatics applications on the Grid. The paper will not go into the details of problems involved or the lessons learned in using individual Grid resources as it has already been published in our paper on genome analysis research environment (GNARE) and will focus primarily on the architecture that makes GADU resource independent and interoperable across heterogeneous Grid resources.
Models@Home: distributed computing in bioinformatics using a screensaver based approach.
Krieger, Elmar; Vriend, Gert
2002-02-01
Due to the steadily growing computational demands in bioinformatics and related scientific disciplines, one is forced to make optimal use of the available resources. A straightforward solution is to build a network of idle computers and let each of them work on a small piece of a scientific challenge, as done by Seti@Home (http://setiathome.berkeley.edu), the world's largest distributed computing project. We developed a generally applicable distributed computing solution that uses a screensaver system similar to Seti@Home. The software exploits the coarse-grained nature of typical bioinformatics projects. Three major considerations for the design were: (1) often, many different programs are needed, while the time is lacking to parallelize them. Models@Home can run any program in parallel without modifications to the source code; (2) in contrast to the Seti project, bioinformatics applications are normally more sensitive to lost jobs. Models@Home therefore includes stringent control over job scheduling; (3) to allow use in heterogeneous environments, Linux and Windows based workstations can be combined with dedicated PCs to build a homogeneous cluster. We present three practical applications of Models@Home, running the modeling programs WHAT IF and YASARA on 30 PCs: force field parameterization, molecular dynamics docking, and database maintenance.
Robust enzyme design: bioinformatic tools for improved protein stability.
Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas
2015-03-01
The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Accessing and integrating data and knowledge for biomedical research.
Burgun, A; Bodenreider, O
2008-01-01
To review the issues that have arisen with the advent of translational research in terms of integration of data and knowledge, and survey current efforts to address these issues. Using examples from the biomedical literature, we identified new trends in biomedical research and their impact on bioinformatics. We analyzed the requirements for effective knowledge repositories and studied issues in the integration of biomedical knowledge. New diagnostic and therapeutic approaches based on gene expression patterns have brought about new issues in the statistical analysis of data, and new workflows are needed to support translational research. Interoperable data repositories based on standard annotations, infrastructures and services are needed to support the pooling and meta-analysis of data, as well as their comparison to earlier experiments. High-quality, integrated ontologies and knowledge bases serve as a source of prior knowledge used in combination with traditional data mining techniques and contribute to the development of more effective data analysis strategies. As biomedical research evolves from traditional clinical and biological investigations towards omics sciences and translational research, specific needs have emerged, including integrating data collected in research studies with patient clinical data, linking omics knowledge with medical knowledge, modeling the molecular basis of diseases, and developing tools that support in-depth analysis of research data. As such, translational research illustrates the need to bridge the gap between bioinformatics and medical informatics, and opens new avenues for biomedical informatics research.
Mertz, Pamela; Streu, Craig
2015-01-01
This article describes a synergistic two-semester writing sequence for biochemistry courses. In the first semester, students select a putative protein and are tasked with researching their protein largely through bioinformatics resources. In the second semester, students develop original ideas and present them in the form of a research grant proposal. Both projects involve multiple drafts and peer review. The complementarity of the projects increases student exposure to bioinformatics and literature resources, fosters higher-order thinking skills, and develops teamwork and communication skills. Student feedback and responses on perception surveys demonstrated that the students viewed both projects as favorable learning experiences. © 2015 The International Union of Biochemistry and Molecular Biology.
Bioinformatics Approaches for Fetal DNA Fraction Estimation in Noninvasive Prenatal Testing
Peng, Xianlu Laura; Jiang, Peiyong
2017-01-01
The discovery of cell-free fetal DNA molecules in plasma of pregnant women has created a paradigm shift in noninvasive prenatal testing (NIPT). Circulating cell-free DNA in maternal plasma has been increasingly recognized as an important proxy to detect fetal abnormalities in a noninvasive manner. A variety of approaches for NIPT using next-generation sequencing have been developed, which have been rapidly transforming clinical practices nowadays. In such approaches, the fetal DNA fraction is a pivotal parameter governing the overall performance and guaranteeing the proper clinical interpretation of testing results. In this review, we describe the current bioinformatics approaches developed for estimating the fetal DNA fraction and discuss their pros and cons. PMID:28230760
Bioinformatics Approaches for Fetal DNA Fraction Estimation in Noninvasive Prenatal Testing.
Peng, Xianlu Laura; Jiang, Peiyong
2017-02-20
The discovery of cell-free fetal DNA molecules in plasma of pregnant women has created a paradigm shift in noninvasive prenatal testing (NIPT). Circulating cell-free DNA in maternal plasma has been increasingly recognized as an important proxy to detect fetal abnormalities in a noninvasive manner. A variety of approaches for NIPT using next-generation sequencing have been developed, which have been rapidly transforming clinical practices nowadays. In such approaches, the fetal DNA fraction is a pivotal parameter governing the overall performance and guaranteeing the proper clinical interpretation of testing results. In this review, we describe the current bioinformatics approaches developed for estimating the fetal DNA fraction and discuss their pros and cons.
The Instituto Gulbenkian de Ciência and its Outreach
Leão, Maria João; Godinho, Ana; Fernandes, Pedro
2012-01-01
The Instituto Gulbenkian de Ciência (IGC) is a biomedical research institute that acts as a host institution for small research groups in Portugal. Most of its activities reach out to the scientific community in several ways. The IGC organizes regular series of seminars with invited international speakers, workshops, courses and conferences, and an in-house PhD programme. Specific outreach needs had to be met in the two instances that are described here. GTPB The Gulbenkian Training Programme in Bioinformatics (GTPB) started as a regular activity in 1999 in response to the demand of users seeking opportunities to acquire hands-on practical skills in Bioinformatics in an effective way. Training provision in Bioinformatics requires the conciliation of a variety of interests into a series of highly effective training events, in which scientists can acquire skills and a high degree of independence in their usage. The GTPB programme currently offers more than 30 themes, of which 15 to 20 are chosen for single events in each year. The GTPB has provided training to more than 2000 researchers and students so far. IGC Outreach A dedicated outreach programme targets science education and public engagement in science, for different audience groups. The aim of the outreach programme is to promote scientific literacy, foster careers in science and empower citizens to engage in cutting-edge biomedical research. Activities include Open Days, seminars and laboratory workshops for teachers, development of online, multimedia and hard-copy resources and experimental protocols to be used in schools, visits to schools with hands-on experiments and career talks by researchers and facility staff. Less conventional outreach activities, including direct participation in venues for the general public (such as a music festival), have created unexpected opportunities for fundraising and direct financial support for students engaged in research projects.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lo, Chien-Chi
2015-08-03
Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance
Workflows for microarray data processing in the Kepler environment.
Stropp, Thomas; McPhillips, Timothy; Ludäscher, Bertram; Bieda, Mark
2012-05-17
Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. 
These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R/BioConductor scripting approaches to pipeline design. Finally, we suggest that microarray data processing task workflows may provide a basis for future example-based comparison of different workflow systems. We provide a set of tools and complete workflows for microarray data analysis in the Kepler environment, which has the advantages of offering graphical, clear display of conceptual steps and parameters and the ability to easily integrate other resources such as remote data and web services.
GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.
Davis, Sean; Meltzer, Paul S
2007-07-15
Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. GEOquery is available as part of the BioConductor project.
ERIC Educational Resources Information Center
Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany
2007-01-01
We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…
ERIC Educational Resources Information Center
Shachak, Aviv; Ophir, Ron; Rubin, Eitan
2005-01-01
The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…
ERIC Educational Resources Information Center
Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.
2009-01-01
Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…
Lawlor, Brendan; Walsh, Paul
2015-01-01
There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054
regSNPs: a strategy for prioritizing regulatory single nucleotide substitutions
Teng, Mingxiang; Ichikawa, Shoji; Padgett, Leah R.; Wang, Yadong; Mort, Matthew; Cooper, David N.; Koller, Daniel L.; Foroud, Tatiana; Edenberg, Howard J.; Econs, Michael J.; Liu, Yunlong
2012-01-01
Motivation: One of the fundamental questions in genetic studies is how to identify functional DNA variants that are responsible for a disease or phenotype of interest. Results from large-scale genetics studies, such as genome-wide association studies (GWAS), and the availability of high-throughput sequencing technologies provide opportunities for identifying causal variants. Despite the technical advances, informatics methodologies need to be developed to prioritize thousands of variants for potential causative effects. Results: We present regSNPs, an informatics strategy that integrates several established bioinformatics tools, for prioritizing regulatory SNPs, i.e. the SNPs in the promoter regions that potentially affect phenotype through changing transcription of downstream genes. Compared with existing tools, regSNPs has two distinct features. It considers degenerative features of binding motifs by calculating the differences in binding affinity caused by the candidate variants, and it integrates potential phenotypic effects of various transcription factors. When tested by using the disease-causing variants documented in the Human Gene Mutation Database, regSNPs showed mixed performance on various diseases. regSNPs predicted three SNPs that can potentially affect bone density in a region detected in an earlier linkage study. Potential effects of one of the variants were validated using a luciferase reporter assay. Contact: yunliu@iupui.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22611130
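The idea of scoring a variant by the change in binding affinity it causes can be illustrated with a small position-weight-matrix sketch. This is not the regSNPs implementation: the 4-bp motif, its frequencies, the uniform background, and the function names are all hypothetical, and the log-odds score is only a common stand-in for binding affinity.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif whose consensus is TGAC.
# Every number here is invented for illustration.
PFM = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base composition


def log_odds_score(site):
    """Sum of per-position log2(frequency / background) for one site."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PFM, site))


def best_window_score(seq):
    """Highest log-odds score over all motif-length windows of seq."""
    width = len(PFM)
    return max(log_odds_score(seq[i:i + width]) for i in range(len(seq) - width + 1))


def affinity_change(ref_seq, alt_seq):
    """Score difference (alternate minus reference); negative means weaker predicted binding."""
    return best_window_score(alt_seq) - best_window_score(ref_seq)


# A single-base substitution (A -> T) inside the motif occurrence.
delta = affinity_change("ATGACA", "ATGTCA")
print(f"change in log-odds score: {delta:.3f}")
```

Because the score is computed from the full frequency matrix rather than from a consensus string, a substitution at a weakly constrained position changes the score less than one at a strongly constrained position, which is one way to account for motif degeneracy.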
Menegidio, Fabiano B; Jabes, Daniela L; Costa de Oliveira, Regina; Nunes, Luiz R
2018-02-01
This manuscript introduces and describes Dugong, a Docker image based on Ubuntu 16.04, which automates installation of more than 3500 bioinformatics tools (along with their respective libraries and dependencies), in alternative computational environments. The software operates through a user-friendly XFCE4 graphic interface that allows software management and installation by users not fully familiarized with the Linux command line and provides the Jupyter Notebook to assist in the delivery and exchange of consistent and reproducible protocols and results across laboratories, assisting in the development of open science projects. Source code and instructions for local installation are available at https://github.com/DugongBioinformatics, under the MIT open source license. Contact: Luiz.nunes@ufabc.edu.br. © The Author (2017). Published by Oxford University Press.
Bioinformatics Meets Virology: The European Virus Bioinformatics Center's Second Annual Meeting.
Ibrahim, Bashar; Arkhipova, Ksenia; Andeweg, Arno C; Posada-Céspedes, Susana; Enault, François; Gruber, Arthur; Koonin, Eugene V; Kupczok, Anne; Lemey, Philippe; McHardy, Alice C; McMahon, Dino P; Pickett, Brett E; Robertson, David L; Scheuermann, Richard H; Zhernakova, Alexandra; Zwart, Mark P; Schönhuth, Alexander; Dutilh, Bas E; Marz, Manja
2018-05-14
The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new developments and novel research findings that emerged during the meeting.
Lan, D; Hu, Y D; Zhu, Q; Li, D Y; Liu, Y P
2015-07-28
The direction of production for indigenous chicken breeds is currently unknown and this knowledge, combined with the development of chicken genome-wide association studies, led us to investigate differences in specific loci between broiler and layer chicken using bioinformatic methods. In addition, we analyzed the distribution of these seven identified loci in four Chinese indigenous chicken breeds, Caoke chicken, Jiuyuan chicken, Sichuan mountain chicken, and Tibetan chicken, using DNA direct sequencing methods, and analyzed the data using bioinformatic methods. Based on the results, we suggest that Caoke chicken could be developed for meat production, while Jiuyuan chicken could be developed for egg production. As Sichuan mountain chicken and Tibetan chicken exhibited large polymorphisms, these breeds could be improved by changing their living environment.
d'Acierno, Antonio; Facchiano, Angelo; Marabotti, Anna
2009-06-01
We describe the GALT-Prot database and its related web-based application, which have been developed to collect information about the structural and functional effects of mutations on the human enzyme galactose-1-phosphate uridyltransferase (GALT), involved in the genetic disease named galactosemia type I. Besides a list of missense mutations at gene and protein sequence levels, GALT-Prot reports the analysis results of mutant GALT structures. In addition to the structural information about the wild-type enzyme, the database also includes structures of over 100 single point mutants simulated by means of a computational procedure, and each mutant was analyzed with several bioinformatics programs in order to investigate the effect of the mutations. The web-based interface allows querying of the database, and several links are also provided in order to guarantee a high integration with other resources already present on the web. Moreover, the architecture of the database and the web application is flexible and can be easily adapted to store data related to other proteins with point mutations. GALT-Prot is freely available at http://bioinformatica.isa.cnr.it/GALT/.
NASA Astrophysics Data System (ADS)
Symeonidis, Iphigenia Sofia
This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty in regards to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinformatics educational needs and goals as expressed by different authorities, five case studies of undergraduate bioinformatics degrees, educational implications of bioinformatics as a technoscience, and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tadmor and Tidor (2005) for graduate studies.
Translational Biomedical Informatics in the Cloud: Present and Future
Chen, Jiajia; Qian, Fuliang; Yan, Wenying; Shen, Bairong
2013-01-01
Next generation sequencing and other high-throughput experimental techniques of recent decades have driven the exponential growth in publicly available molecular and clinical data. This information explosion has prepared the ground for the development of translational bioinformatics. The scale and dimensionality of data, however, pose obvious challenges in data mining, storage, and integration. In this paper we demonstrate the utility and promise of cloud computing for tackling these big data problems. We also outline our vision that cloud computing could be an enabling tool to facilitate translational bioinformatics research. PMID:23586054
Biopython: freely available Python tools for computational molecular biology and bioinformatics.
Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T; Chapman, Brad A; Cox, Cymon J; Dalke, Andrew; Friedberg, Iddo; Hamelryck, Thomas; Kauff, Frank; Wilczynski, Bartek; de Hoon, Michiel J L
2009-06-01
The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. Biopython is freely available, with documentation and source code at www.biopython.org, under the Biopython license.
Modern Computational Techniques for the HMMER Sequence Analysis
2013-01-01
This paper focuses on the latest research and critical reviews on modern computing architectures, software and hardware accelerated algorithms for bioinformatics data analysis with an emphasis on one of the most important sequence analysis applications—hidden Markov models (HMM). We show a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The characteristics of sequence analysis, such as its data- and compute-intensive nature, make it very attractive to optimize and parallelize by using both traditional software approaches and innovative hardware acceleration technologies. PMID:25937944
Integrating grant-funded research into the undergraduate biology curriculum using IMG-ACT.
Ditty, Jayna L; Williams, Kayla M; Keller, Megan M; Chen, Grischa Y; Liu, Xianxian; Parales, Rebecca E
2013-01-01
It has become clear in current scientific pedagogy that the immersion of students in the scientific process in terms of designing, implementing, and analyzing experiments is imperative for their education; as such, it has been our goal to model this active learning process in the classroom and laboratory in the context of a genuine scientific question. Toward this objective, the National Science Foundation funded a collaborative research grant between a primarily undergraduate institution and a research-intensive institution to study the chemotactic responses of the bacterium Pseudomonas putida F1. As part of the project, a new Bioinformatics course was developed in which undergraduates annotate relevant regions of the P. putida F1 genome using the Integrated Microbial Genomes Annotation Collaboration Toolkit, a bioinformatics interface specifically developed for undergraduate programs by the Department of Energy Joint Genome Institute. Based on annotations of putative chemotaxis genes in P. putida F1 and comparative genomics studies, undergraduate students from both institutions developed functional genomics research projects that evolved from the annotations. The purpose of this study is to describe the nature of the NSF grant, the development of the Bioinformatics lecture and wet laboratory course, and how undergraduate student involvement in the project that was initiated in the classroom has served as a springboard for independent undergraduate research projects. Copyright © 2012 International Union of Biochemistry and Molecular Biology, Inc.
Rising Strengths Hong Kong SAR in Bioinformatics.
Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy
2017-06-01
Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities, the achievements of bioinformatics companies, and the role of the Hong Kong government in the establishment of bioinformatics as a strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.
Ferraro Petrillo, Umberto; Roscigno, Gianluca; Cattaneo, Giuseppe; Giancarlo, Raffaele
2018-06-01
Information theoretic and compositional/linguistic analysis of genomes have a central role in bioinformatics, even more so since the associated methodologies are becoming very valuable also for epigenomic and meta-genomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data now available in applications demands parallel and distributed computing. Indeed, algorithms of that type have been developed to collect k-mer statistics in the realm of genome assembly. However, they are so specialized to this domain that they do not extend easily to the computation of informational and linguistic indices, concurrently on sets of genomes. Following the approach, well established in many disciplines and increasingly successful in bioinformatics, of resorting to MapReduce and Hadoop to deal with 'Big Data' problems, we present KCH, the first set of MapReduce algorithms able to perform concurrently informational and linguistic analysis of large collections of genomic sequences on a Hadoop cluster. The benchmarking of KCH that we provide indicates that it is quite effective and versatile. It is also competitive with respect to the parallel and distributed algorithms highly specialized to k-mer statistics collection for genome assembly problems. In conclusion, KCH is a much needed addition to the growing number of algorithms and tools that use MapReduce for bioinformatics core applications. The software, including instructions for running it over Amazon AWS, as well as the datasets are available at http://www.di-srv.unisa.it/KCH. Contact: umberto.ferraro@uniroma1.it. Supplementary data are available at Bioinformatics online.
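The k-mer statistics problem described in this abstract can be sketched on a single machine in a map-then-reduce shape. This is only an illustration of the counting task, not the KCH algorithms and not Hadoop code; the function names and the choice to skip windows containing non-ACGT characters are assumptions made here.

```python
from collections import Counter
from functools import reduce

DNA_ALPHABET = set("ACGT")


def map_kmers(sequence, k):
    """Map step: count every k-mer in one sequence (case-insensitive)."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows with ambiguity codes such as N are ignored.
        if set(kmer) <= DNA_ALPHABET:
            counts[kmer] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right


def kmer_statistics(sequences, k):
    """Count k-mers over a collection of sequences by merging per-sequence tables."""
    partial_tables = (map_kmers(s, k) for s in sequences)
    return reduce(reduce_counts, partial_tables, Counter())


counts = kmer_statistics(["ACGTACGT", "ttacgn"], 2)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

In a distributed setting the map step would run independently on each sequence or chunk, and the reduce step would merge the partial tables by key; the sketch keeps both steps as plain functions so that structure stays visible.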
Xie, Qingjun; Tzfadia, Oren; Levy, Matan; Weithorn, Efrat; Peled-Zehavi, Hadas; Van Parys, Thomas; Van de Peer, Yves; Galili, Gad
2016-01-01
Most of the proteins that are specifically turned over by selective autophagy are recognized by the presence of short Atg8 interacting motifs (AIMs) that facilitate their association with the autophagy apparatus. Such AIMs can be identified by bioinformatics methods based on their defined degenerate consensus F/W/Y-X-X-L/I/V sequences in which X represents any amino acid. Achieving reliability and/or fidelity of the prediction of such AIMs on a genome-wide scale represents a major challenge. Here, we present a bioinformatics approach, high fidelity AIM (hfAIM), which uses additional sequence requirements—the presence of acidic amino acids and the absence of positively charged amino acids in certain positions—to reliably identify AIMs in proteins. We demonstrate that the use of the hfAIM method allows for in silico high fidelity prediction of AIMs in AIM-containing proteins (ACPs) on a genome-wide scale in various organisms. Furthermore, by using hfAIM to identify putative AIMs in the Arabidopsis proteome, we illustrate a potential contribution of selective autophagy to various biological processes. More specifically, we identified 9 peroxisomal PEX proteins that contain hfAIM motifs, among which AtPEX1, AtPEX6 and AtPEX10 possess evolutionary-conserved AIMs. Bimolecular fluorescence complementation (BiFC) results verified that AtPEX6 and AtPEX10 indeed interact with Atg8 in planta. In addition, we show that mutations occurring within or nearby hfAIMs in PEX1, PEX6 and PEX10 caused defects in the growth and development of various organisms. Taken together, the above results suggest that the hfAIM tool can be used to effectively perform genome-wide in silico screens of proteins that are potentially regulated by selective autophagy. The hfAIM system is a web tool that can be accessed at http://bioinformatics.psb.ugent.be/hfAIM/. PMID:27071037
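The degenerate core consensus quoted in this abstract, F/W/Y-X-X-L/I/V, maps directly onto a regular expression. The sketch below scans only for that core pattern; it does not implement hfAIM, because the abstract does not say which positions must carry acidic residues or lack positively charged ones. The function name and the example peptides are made up for illustration.

```python
import re

# Core consensus from the abstract: F/W/Y, any two residues, then L/I/V.
# The zero-width lookahead lets overlapping candidates be reported as well.
CORE_AIM = re.compile(r"(?=([FWY][A-Z]{2}[LIV]))")


def find_core_aims(protein):
    """Return (1-based start position, motif) for each core-consensus match."""
    return [(m.start() + 1, m.group(1)) for m in CORE_AIM.finditer(protein.upper())]


# Invented peptide, not a real protein sequence.
hits = find_core_aims("MDEFDDLWEEIKK")
for position, motif in hits:
    print(position, motif)
```

A four-residue pattern this loose matches very often by chance in any proteome, which is consistent with the abstract's point that extra sequence requirements are needed for reliable genome-wide prediction.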
Zarei, Neda; Fazeli, Mehdi; Mohammadi, Mozafar; Nejatollahi, Foroogh
2018-06-01
FZD7 has a critical role as a surface receptor of Wnt/β-catenin signaling in cancer cells. Suppressing Wnt signaling through blocking FZD7 has been shown to decrease cell viability, metastasis and invasion. Bioinformatic methods have been a powerful tool in epitope design studies. The small size, high affinity and human origin of scFv antibodies give these recombinant antibodies unique advantages. Two epitopes from the extracellular domain of FZD7 were designed using bioinformatic methods. Specific anti-FZD7 scFvs were selected against these epitopes through a panning process. The specificity of the scFvs was assessed by phage ELISA, and the ability to bind to an FZD7-expressing cell line (MDA-MB-231) was determined by flow cytometry. Antiproliferative and apoptotic effects of the scFvs were evaluated by MTT and Annexin V/PI assays. The effects of the selected scFvs on the expression levels of the Survivin, c-Myc and Dvl genes were also evaluated by real-time PCR. Results demonstrated selection of two specific scFvs (scFv-I and scFv-II) with frequencies of 35 and 20%. Both antibodies bound to the corresponding peptides and cell surface receptors as shown by phage ELISA and flow cytometry, respectively. The scFvs inhibited cell growth of MDA-MB-231 cells significantly as compared to untreated cells. Growth inhibition of 58.6 and 53.1% was detected for scFv-I and scFv-II, respectively. No significant growth inhibition was detected for SKBR-3 negative control cells. The scFvs induced apoptotic effects in the MDA-MB-231 treated cells after 48 h, which were 81.6 and 74.9% for scFv-I and scFv-II, respectively. Downregulation of the Survivin, c-Myc and Dvl genes was also shown after 48 h treatment of cells with either of the scFvs (59.3-93.8%). ScFv-I showed significantly higher antiproliferative and apoptotic effects than scFv-II.
Bioinformatic methods could effectively select potential epitopes of FZD7 protein and suggest that epitope designing by bioinformatic methods could contribute to the selection of key antigens for cancer immunotherapy. The selected scFvs, especially scFv-I, with high antiproliferative and apoptotic effects could be considered as effective agents for immunotherapy of cancers expressing FZD7 receptor including triple negative breast cancer.
Lau, Joann M; Robinson, David L
2009-01-01
With rapid advances in biotechnology and molecular biology, instructors are challenged to not only provide undergraduate students with hands-on experiences in these disciplines but also to engage them in the "real-world" scientific process. Two common topics covered in biotechnology or molecular biology courses are gene-cloning and bioinformatics, but to provide students with a continuous laboratory-based research experience in these techniques is difficult. To meet these challenges, we have partnered with Bio-Rad Laboratories in the development of the "Cloning and Sequencing Explorer Series," which combines wet-lab experiences (e.g., DNA extraction, polymerase chain reaction, ligation, transformation, and restriction digestion) with bioinformatics analysis (e.g., evaluation of DNA sequence quality, sequence editing, Basic Local Alignment Search Tool searches, contig construction, intron identification, and six-frame translation) to produce a sequence publishable in the National Center for Biotechnology Information GenBank. This 6- to 8-wk project-based exercise focuses on a pivotal gene of glycolysis (glyceraldehyde-3-phosphate dehydrogenase), in which students isolate, sequence, and characterize the gene from a plant species or cultivar not yet published in GenBank. Student achievement was evaluated using pre-, mid-, and final-test assessments, as well as with a survey to assess student perceptions. Student confidence with basic laboratory techniques and knowledge of bioinformatics tools were significantly increased upon completion of this hands-on exercise.
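One of the bioinformatics steps listed in this abstract, six-frame translation, can be sketched in a few lines. The code below assumes the standard genetic code and an unambiguous DNA input; the function names and the frame labels (+1 to +3, -1 to -3) are choices made here, not part of the course materials.

```python
# Standard genetic code, laid out in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three forward and three reverse-complement reading frames."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Stop codons are shown as `*`, so a long stretch without `*` in one frame is the usual first hint of which frame carries the coding sequence.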
A Web-based assessment of bioinformatics end-user support services at US universities.
Messersmith, Donna J; Benson, Dennis A; Geer, Renata C
2006-07-01
This study was conducted to gauge the availability of bioinformatics end-user support services at US universities and to identify the providers of those services. The study primarily focused on the availability of short-term workshops that introduce users to molecular biology databases and analysis software. Websites of selected US universities were reviewed to determine if bioinformatics educational workshops were offered, and, if so, what organizational units in the universities provided them. Of 239 reviewed universities, 72 (30%) offered bioinformatics educational workshops. These workshops were located at libraries (N = 15), bioinformatics centers (N = 38), or other facilities (N = 35). No such training was noted on the sites of 167 universities (70%). Of the 115 bioinformatics centers identified, two-thirds did not offer workshops. This analysis of university Websites indicates that a gap may exist in the availability of workshops and related training to assist researchers in the use of bioinformatics resources, representing a potential opportunity for libraries and other facilities to provide training and assistance for this growing user group.
Application of pharmacogenomics to vaccines
Poland, Gregory A; Ovsyannikova, Inna G; Jacobson, Robert M
2009-01-01
The field of pharmacogenomics and pharmacogenetics provides a promising science base for vaccine research and development. A broad range of phenotype/genotype data combined with high-throughput genetic sequencing and bioinformatics are increasingly being integrated into this emerging field of vaccinomics. This paper discusses the hypothesis of the ‘immune response gene network’ and genetic (and bioinformatic) strategies to study associations between immune response gene polymorphisms and variations in humoral and cellular immune responses to prophylactic viral vaccines, such as measles–mumps–rubella, influenza, HIV, hepatitis B and smallpox. Immunogenetic studies reveal promising new vaccine targets by providing a better understanding of the mechanisms by which gene polymorphisms may influence innate and adaptive immune responses to vaccines, including vaccine failure and vaccine-associated adverse events. Additional benefits from vaccinomic studies include the development of personalized vaccines, the development of novel vaccines and the development of novel vaccine adjuvants. PMID:19450131
Bioinformatics clouds for big data manipulation.
Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang
2012-11-28
As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.
Malaria vaccines: high-throughput tools for antigens discovery with potential for their development
Céspedes, Nora; Vallejo, Andrés; Arévalo-Herrera, Myriam
2013-01-01
Malaria is a disease caused by parasites of the Plasmodium genus, which are transmitted by Anopheles mosquitoes, and it represents a great socio-economic burden worldwide. Plasmodium vivax is the second most prevalent malaria species worldwide, but it is the most prevalent in Latin America and other regions of the planet. It is currently considered that vaccines represent a cost-effective strategy for controlling transmissible diseases and could complement other malaria control measures; however, the chemical and immunological complexity of the parasite has hindered development of effective vaccines. The recent availability of several genomes of Plasmodium species, together with bioinformatic tools, is allowing the selection of large numbers of proteins and analysis of their immune potential. Herein, we review recently developed strategies for discovery of novel antigens with potential for malaria vaccine development. PMID:24892459
Simple re-instantiation of small databases using cloud computing.
Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M
2013-01-01
Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.
Simple re-instantiation of small databases using cloud computing
2013-01-01
Background: Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. Results: We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Conclusions: Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear. PMID:24564380
A review of bioinformatic methods for forensic DNA analyses.
Liu, Yao-Yuan; Harbison, SallyAnn
2018-03-01
Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals new information such as insights into the complexity and variability of the markers that were previously unseen, along with amounts of data too immense for analyses by manual means. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing is being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.
A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research
Campbell, Chad E.; Nehm, Ross H.
2013-01-01
The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (<10%) of studies provided any reliability or validity evidence, and only one study out of the 95 sampled mentioned both validity and reliability. Our findings raise concerns about the quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE. PMID:24006400
A bioinformatics roadmap for the human vaccines project.
Scheuermann, Richard H; Sinkovits, Robert S; Schenkelberg, Theodore; Koff, Wayne C
2017-06-01
Biomedical research has become a data intensive science in which high throughput experimentation is producing comprehensive data about biological systems at an ever-increasing pace. The Human Vaccines Project is a new public-private partnership, with the goal of accelerating development of improved vaccines and immunotherapies for global infectious diseases and cancers by decoding the human immune system. To achieve its mission, the Project is developing a Bioinformatics Hub as an open-source, multidisciplinary effort with the overarching goal of providing an enabling infrastructure to support the data processing, analysis and knowledge extraction procedures required to translate high throughput, high complexity human immunology research data into biomedical knowledge, to determine the core principles driving specific and durable protective immune responses.
Xie, Bing; Huang, Yu; Baumann, Kate; Fry, Bryan Grieg; Shi, Qiong
2017-01-01
The potential of marine natural products to become new drugs is vast; however, research is still in its infancy. The chemical and biological diversity of marine toxins is immeasurable and as such an extraordinary resource for the discovery of new drugs. With the rapid development of next-generation sequencing (NGS) and liquid chromatography–tandem mass spectrometry (LC-MS/MS), it has become much easier and faster to identify more toxins and predict their functions with bioinformatics pipelines, which paves the way for novel drug development. Here we provide an overview of related bioinformatics pipelines that have been supported by a combination of transcriptomics and proteomics for identification and function prediction of novel marine toxins. PMID:28358320
Xie, Bing; Huang, Yu; Baumann, Kate; Fry, Bryan Grieg; Shi, Qiong
2017-03-30
The potential of marine natural products to become new drugs is vast; however, research is still in its infancy. The chemical and biological diversity of marine toxins is immeasurable and as such an extraordinary resource for the discovery of new drugs. With the rapid development of next-generation sequencing (NGS) and liquid chromatography-tandem mass spectrometry (LC-MS/MS), it has become much easier and faster to identify more toxins and predict their functions with bioinformatics pipelines, which paves the way for novel drug development. Here we provide an overview of related bioinformatics pipelines that have been supported by a combination of transcriptomics and proteomics for identification and function prediction of novel marine toxins.
NASA Astrophysics Data System (ADS)
Wefer, Stephen H.
The proliferation of bioinformatics in modern Biology marks a new revolution in science, which promises to influence science education at all levels. This thesis examined state standards for content that articulated bioinformatics, and explored secondary students' affective and cognitive perceptions of, and performance in, a bioinformatics mini-unit. The results are presented as three studies. The first study analyzed secondary science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics at the introductory high school biology level. The bioinformatics content of each state's Biology standards was categorized into nine areas and the prevalence of each area documented. The nine areas were: The Human Genome Project, Forensics, Evolution, Classification, Nucleotide Variations, Medicine, Computer Use, Agriculture/Food Technology, and Science Technology and Society/Socioscientific Issues (STS/SSI). Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas. Recommendations are made for reworking existing standards to incorporate bioinformatics and to facilitate the goal of promoting science literacy in this emerging new field among secondary school students. The second study examined thirty-two students' affective responses to, and content mastery of, a two-week bioinformatics mini-unit. The findings indicate that the students generally were positive relative to their interest level, the usefulness of the lessons, the difficulty level of the lessons, and their likelihood of engaging in additional bioinformatics, and were overall successful on the assessments. A discussion of the results and significance is followed by suggestions for future research and implementation for transferability.
The third study presents a case study of individual differences among ten secondary school students, whose cognitive and affective percepts were analyzed in relation to their experience in learning a bioinformatics mini-unit. There were distinct individual differences among the participants, especially in the way they processed information and integrated procedural and analytical thought during bioinformatics learning. These differences may provide insights into some of the specific needs of students that educators and curriculum designers should consider when designing bioinformatics learning experiences. Implications for teacher education and curriculum design are presented in addition to some suggestions for further research.
Novel approaches for bioinformatic analysis of salivary RNA sequencing data for development.
Kaczor-Urbanowicz, Karolina Elzbieta; Kim, Yong; Li, Feng; Galeev, Timur; Kitchen, Rob R; Gerstein, Mark; Koyano, Kikuye; Jeong, Sung-Hee; Wang, Xiaoyan; Elashoff, David; Kang, So Young; Kim, Su Mi; Kim, Kyoung; Kim, Sung; Chia, David; Xiao, Xinshu; Rozowsky, Joel; Wong, David T W
2018-01-01
Analysis of RNA sequencing (RNA-Seq) data in human saliva is challenging. Lack of standardization and unification of the bioinformatic procedures undermines saliva's diagnostic potential, which motivated this study. We applied principal pipelines for bioinformatic analysis of small RNA-Seq data of saliva of 98 healthy Korean volunteers including either direct or indirect mapping of the reads to the human genome using Bowtie1. Analysis of alignments to exogenous genomes by another pipeline revealed that almost all of the reads map to bacterial genomes. Thus, salivary exRNA has fundamental properties that warrant the design of unique additional steps while performing the bioinformatic analysis. Our pipelines can serve as potential guidelines for processing of RNA-Seq data of human saliva. Processing and analysis results of the experimental data generated by the exceRpt (v4.6.3) small RNA-seq pipeline (github.gersteinlab.org/exceRpt) are available from exRNA atlas (exrna-atlas.org). Alignment to exogenous genomes and their quantification results were used in this paper for the analyses of small RNAs of exogenous origin. Contact: dtww@ucla.edu. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Morgan, Sarah L; Palagi, Patricia M; Fernandes, Pedro L; Koperlainen, Eija; Dimec, Jure; Marek, Diana; Larcombe, Lee; Rustici, Gabriella; Attwood, Teresa K; Via, Allegra
2017-01-01
One of the main goals of the ELIXIR-EXCELERATE project from the European Union's Horizon 2020 programme is to support a pan-European training programme to increase bioinformatics capacity and competency across ELIXIR Nodes. To this end, a Train-the-Trainer (TtT) programme has been developed by the TtT subtask of EXCELERATE's Training Platform, to try to expose bioinformatics instructors to aspects of pedagogy and evidence-based learning principles, to help them better design, develop and deliver high-quality training in future. As a first step towards such a programme, an ELIXIR-EXCELERATE TtT (EE-TtT) pilot was developed, drawing on existing 'instructor training' models, using input both from experienced instructors and from experts in bioinformatics, the cognitive sciences and educational psychology. This manuscript describes the process of defining the pilot programme, illustrates its goals, structure and contents, and discusses its outcomes. From Jan 2016 to Jan 2017, we carried out seven pilot EE-TtT courses (training more than sixty new instructors), collaboratively drafted the training materials, and started establishing a network of trainers and instructors within the ELIXIR community. The EE-TtT pilot represents an essential step towards the development of a sustainable and scalable ELIXIR TtT programme. Indeed, the lessons learned from the pilot, the experience gained, the materials developed, and the analysis of the feedback collected throughout the seven pilot courses have both positioned us to consolidate the programme in the coming years, and contributed to the development of an enthusiastic and expanding ELIXIR community of instructors and trainers.
Morgan, Sarah L; Koperlainen, Eija; Dimec, Jure; Marek, Diana; Larcombe, Lee; Rustici, Gabriella; Attwood, Teresa K; Via, Allegra
2017-01-01
One of the main goals of the ELIXIR-EXCELERATE project from the European Union’s Horizon 2020 programme is to support a pan-European training programme to increase bioinformatics capacity and competency across ELIXIR Nodes. To this end, a Train-the-Trainer (TtT) programme has been developed by the TtT subtask of EXCELERATE’s Training Platform, to try to expose bioinformatics instructors to aspects of pedagogy and evidence-based learning principles, to help them better design, develop and deliver high-quality training in future. As a first step towards such a programme, an ELIXIR-EXCELERATE TtT (EE-TtT) pilot was developed, drawing on existing ‘instructor training’ models, using input both from experienced instructors and from experts in bioinformatics, the cognitive sciences and educational psychology. This manuscript describes the process of defining the pilot programme, illustrates its goals, structure and contents, and discusses its outcomes. From Jan 2016 to Jan 2017, we carried out seven pilot EE-TtT courses (training more than sixty new instructors), collaboratively drafted the training materials, and started establishing a network of trainers and instructors within the ELIXIR community. The EE-TtT pilot represents an essential step towards the development of a sustainable and scalable ELIXIR TtT programme. Indeed, the lessons learned from the pilot, the experience gained, the materials developed, and the analysis of the feedback collected throughout the seven pilot courses have both positioned us to consolidate the programme in the coming years, and contributed to the development of an enthusiastic and expanding ELIXIR community of instructors and trainers. PMID:28928938
Bioinformatics clouds for big data manipulation
2012-01-01
Abstract: As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475
Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba
2018-01-19
The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.
Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen
2015-04-15
In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated from both built-in and user-defined properties using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. Contact: bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
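The kmer-based composition features in the first group can be illustrated with a short sketch in plain Python. This is not the repDNA API: the function name `kmer_composition` and its parameters are hypothetical, and windows containing characters other than A, C, G or T are assumed to be skipped.

```python
from itertools import product


def kmer_composition(seq, k=2, normalize=True):
    """Return a fixed-length k-mer feature vector for a DNA sequence.

    The vector has 4**k entries, ordered lexicographically over "ACGT"
    (AA, AC, AG, AT, CA, ... for k=2).
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values())
    if normalize and total:
        return [counts[w] / total for w in kmers]
    return [counts[w] for w in kmers]


if __name__ == "__main__":
    print(kmer_composition("ACGTAC", k=2, normalize=False))
```

Because the vector length depends only on k and not on the sequence length, sequences of different lengths map to comparable inputs for a downstream predictor.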
The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications
2011-01-01
Background: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency.
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework. PMID:21806842
The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications.
Katayama, Toshiaki; Wilkinson, Mark D; Vos, Rutger; Kawashima, Takeshi; Kawashima, Shuichi; Nakao, Mitsuteru; Yamamoto, Yasunori; Chun, Hong-Woo; Yamaguchi, Atsuko; Kawano, Shin; Aerts, Jan; Aoki-Kinoshita, Kiyoko F; Arakawa, Kazuharu; Aranda, Bruno; Bonnal, Raoul Jp; Fernández, José M; Fujisawa, Takatomo; Gordon, Paul Mk; Goto, Naohisa; Haider, Syed; Harris, Todd; Hatakeyama, Takashi; Ho, Isaac; Itoh, Masumi; Kasprzyk, Arek; Kido, Nobuhiro; Kim, Young-Joo; Kinjo, Akira R; Konishi, Fumikazu; Kovarskaya, Yulia; von Kuster, Greg; Labarga, Alberto; Limviphuvadh, Vachiranee; McCarthy, Luke; Nakamura, Yasukazu; Nam, Yunsun; Nishida, Kozo; Nishimura, Kunihiro; Nishizawa, Tatsuya; Ogishima, Soichi; Oinn, Tom; Okamoto, Shinobu; Okuda, Shujiro; Ono, Keiichiro; Oshita, Kazuki; Park, Keun-Joon; Putnam, Nicholas; Senger, Martin; Severin, Jessica; Shigemoto, Yasumasa; Sugawara, Hideaki; Taylor, James; Trelles, Oswaldo; Yamasaki, Chisato; Yamashita, Riu; Satoh, Noriyuki; Takagi, Toshihisa
2011-08-02
The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. 
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.
Developing eThread pipeline using SAGA-pilot abstraction for large-scale structural bioinformatics.
Ragothaman, Anjani; Boddu, Sairam Chowdary; Kim, Nayong; Feinstein, Wei; Brylinski, Michal; Jha, Shantenu; Kim, Joohyun
2014-01-01
While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because of predicted structural information that could uncover the underlying function. However, threading tools are generally compute-intensive and the number of protein sequences from even small genomes such as prokaryotes is large, typically many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize computational complexity of eThread and EC2 infrastructure. Based on results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and is particularly amenable to small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.
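The trade-off between time-to-solution and cost-to-solution named in this abstract can be made concrete with a toy model. The sketch below is not the eThread pipeline or the SAGA-Pilot API: the function `schedule` and its parameters are hypothetical, tasks are assigned greedily (longest first) to the least-loaded worker, and every worker is assumed to be billed for the full wall-clock time until the last task finishes.

```python
import heapq


def schedule(task_hours, n_workers, price_per_worker_hour):
    """Estimate (time_to_solution, cost_to_solution) for a batch of tasks.

    task_hours: estimated runtime of each task, in hours.
    n_workers: number of identical workers held for the whole run.
    price_per_worker_hour: assumed flat price per worker per hour.
    """
    loads = [0.0] * n_workers  # accumulated hours per worker (min-heap)
    heapq.heapify(loads)
    # Longest-processing-time-first: place big tasks before small ones.
    for hours in sorted(task_hours, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + hours)
    time_to_solution = max(loads)
    # Assumption: all workers stay allocated until the slowest one finishes.
    cost_to_solution = n_workers * time_to_solution * price_per_worker_hour
    return time_to_solution, cost_to_solution


if __name__ == "__main__":
    tasks = [4, 3, 2, 1]
    for workers in (1, 2, 4):
        print(workers, schedule(tasks, workers, price_per_worker_hour=1.0))
```

In this toy example, going from two to four workers shortens the run from 5 to 4 hours but raises the cost from 10 to 16 units, because idle workers are still billed; this is the kind of imbalance a resource-selection step would try to limit when task runtimes vary widely.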
Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics
Ragothaman, Anjani; Feinstein, Wei; Jha, Shantenu; Kim, Joohyun
2014-01-01
While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because of predicted structural information that could uncover the underlying function. However, threading tools are generally compute-intensive and the number of protein sequences from even small genomes such as prokaryotes is large, typically many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize computational complexity of eThread and EC2 infrastructure. Based on results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, and is particularly amenable to small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure. PMID:24995285
Fang, Xiang; Li, Ning-qiu; Fu, Xiao-zhe; Li, Kai-bin; Lin, Qiang; Liu, Li-hui; Shi, Cun-bin; Wu, Shu-qin
2015-07-01
As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches and GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes in system temperature, total energy, root mean square deviation and loop conformation during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.
Buying in to bioinformatics: an introduction to commercial sequence analysis software
2015-01-01
Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. PMID:25183247
Buying in to bioinformatics: an introduction to commercial sequence analysis software.
Smith, David Roy
2015-07-01
Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. © The Author 2014. Published by Oxford University Press.
Bioinformatics of prokaryotic RNAs
Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F
2014-01-01
The genome of most prokaryotes gives rise to a surprisingly complex transcriptome, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880
bioalcidae, samjs and vcffilterjs: object-oriented formatters and filters for bioinformatics files.
Lindenbaum, Pierre; Redon, Richard
2018-04-01
Reformatting and filtering bioinformatics files are common tasks for bioinformaticians. Standard Linux tools and specific programs are usually used to perform such tasks, but there is still a gap between using these tools and the programming interface of some existing libraries. In this study, we developed a set of tools, namely bioalcidae, samjs and vcffilterjs, that reformat or filter files using a JavaScript engine or a pure Java expression, taking advantage of the Java API for high-throughput sequencing data (htsjdk). Availability: https://github.com/lindenb/jvarkit. Contact: pierre.lindenbaum@univ-nantes.fr.
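The idea of filtering records with a user-supplied expression can be illustrated with a minimal Python sketch. This is not the jvarkit or htsjdk API: the `Variant` class, `parse_vcf_line` and `filter_variants` are hypothetical names invented for this example, only the first six VCF columns are parsed, and the predicate is a Python callable rather than a JavaScript expression.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class Variant:
    """Hypothetical minimal record; real VCF records carry many more fields."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float


def parse_vcf_line(line: str) -> Variant:
    """Parse the first six tab-separated columns of a VCF data line."""
    chrom, pos, _id, ref, alt, qual = line.rstrip("\n").split("\t")[:6]
    # VCF uses "." for a missing quality; represent it as NaN here.
    quality = float("nan") if qual == "." else float(qual)
    return Variant(chrom, int(pos), ref, alt, quality)


def filter_variants(lines: Iterable[str],
                    keep: Callable[[Variant], bool]) -> Iterator[str]:
    """Yield header lines unchanged, plus data lines whose record satisfies keep."""
    for line in lines:
        if line.startswith("#") or keep(parse_vcf_line(line)):
            yield line


if __name__ == "__main__":
    vcf = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tG\t50\tPASS\t.",
        "chr1\t200\t.\tAT\tA\t60\tPASS\t.",
        "chr2\t300\t.\tC\tT\t10\tPASS\t.",
    ]
    # Keep single-nucleotide variants with quality of at least 30.
    for kept in filter_variants(vcf, lambda v: v.qual >= 30 and len(v.ref) == len(v.alt) == 1):
        print(kept)
```

Passing the predicate as a callable keeps the parsing code fixed while the filtering rule changes per run, which is the same separation the expression-based tools aim for.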
Deep learning in bioinformatics.
Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh
2017-09-01
In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
A Web-based assessment of bioinformatics end-user support services at US universities
Messersmith, Donna J.; Benson, Dennis A.; Geer, Renata C.
2006-01-01
Objectives: This study was conducted to gauge the availability of bioinformatics end-user support services at US universities and to identify the providers of those services. The study primarily focused on the availability of short-term workshops that introduce users to molecular biology databases and analysis software. Methods: Websites of selected US universities were reviewed to determine if bioinformatics educational workshops were offered, and, if so, what organizational units in the universities provided them. Results: Of 239 reviewed universities, 72 (30%) offered bioinformatics educational workshops. These workshops were located at libraries (N = 15), bioinformatics centers (N = 38), or other facilities (N = 35). No such training was noted on the sites of 167 universities (70%). Of the 115 bioinformatics centers identified, two-thirds did not offer workshops. Conclusions: This analysis of university Websites indicates that a gap may exist in the availability of workshops and related training to assist researchers in the use of bioinformatics resources, representing a potential opportunity for libraries and other facilities to provide training and assistance for this growing user group. PMID:16888663
Accessing and Integrating Data and Knowledge for Biomedical Research
Burgun, A.; Bodenreider, O.
2008-01-01
Objectives: To review the issues that have arisen with the advent of translational research in terms of integration of data and knowledge, and survey current efforts to address these issues. Methods: Using examples from the biomedical literature, we identified new trends in biomedical research and their impact on bioinformatics. We analyzed the requirements for effective knowledge repositories and studied issues in the integration of biomedical knowledge. Results: New diagnostic and therapeutic approaches based on gene expression patterns have brought about new issues in the statistical analysis of data, and new workflows are needed to support translational research. Interoperable data repositories based on standard annotations, infrastructures and services are needed to support the pooling and meta-analysis of data, as well as their comparison to earlier experiments. High-quality, integrated ontologies and knowledge bases serve as a source of prior knowledge used in combination with traditional data mining techniques and contribute to the development of more effective data analysis strategies. Conclusion: As biomedical research evolves from traditional clinical and biological investigations towards omics sciences and translational research, specific needs have emerged, including integrating data collected in research studies with patient clinical data, linking omics knowledge with medical knowledge, modeling the molecular basis of diseases, and developing tools that support in-depth analysis of research data. As such, translational research illustrates the need to bridge the gap between bioinformatics and medical informatics, and opens new avenues for biomedical informatics research. PMID:18660883
Development and application of an algorithm to compute weighted multiple glycan alignments.
Hosoda, Masae; Akune, Yukie; Aoki-Kinoshita, Kiyoko F
2017-05-01
A glycan consists of monosaccharides linked by glycosidic bonds, has branches and forms complex molecular structures. Databases have been developed to store large amounts of data from glycan-binding experiments, including glycan arrays with glycan-binding proteins. However, there are few bioinformatics techniques to analyze large amounts of data for glycans because there are few tools that can handle the complexity of glycan structures. Thus, we have developed the MCAW (Multiple Carbohydrate Alignment with Weights) tool, which can align multiple glycan structures, to aid in the understanding of their function as binding recognition molecules. We have described in detail the first algorithm to perform multiple glycan alignments by modeling glycans as trees. To test our tool, we prepared several data sets, and as a result, we found that the glycan motif could be successfully aligned without any prior knowledge applied to the tool, and the known recognition binding sites of glycans could be aligned at a high rate amongst all our datasets tested. We thus claim that our tool is able to find meaningful glycan recognition and binding patterns using data obtained by glycan-binding experiments. The development and availability of an effective multiple glycan alignment tool opens possibilities for many other glycoinformatics analyses, making this work a big step towards furthering glycomics analysis. Availability: http://www.rings.t.soka.ac.jp. Contact: kkiyoko@soka.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
LXtoo: an integrated live Linux distribution for the bioinformatics community
2012-01-01
Background: Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings: Unlike most existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions: LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356
LXtoo: an integrated live Linux distribution for the bioinformatics community.
Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu
2012-07-19
Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.
4273π: Bioinformatics education on low cost ARM hardware
2013-01-01
Background: Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. Results: We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012–2013. Conclusions: 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost. PMID:23937194
4273π: bioinformatics education on low cost ARM hardware.
Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D
2013-08-12
Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.
A decade of Web Server updates at the Bioinformatics Links Directory: 2003-2012.
Brazas, Michelle D; Yim, David; Yeung, Winston; Ouellette, B F Francis
2012-07-01
The 2012 Bioinformatics Links Directory update marks the 10th special Web Server issue from Nucleic Acids Research. Beginning with content from their 2003 publication, the Bioinformatics Links Directory in collaboration with Nucleic Acids Research has compiled and published a comprehensive list of freely accessible, online tools, databases and resource materials for the bioinformatics and life science research communities. The past decade has exhibited significant growth and change in the types of tools, databases and resources being put forth, reflecting both technology changes and the nature of research over that time. With the addition of 90 web server tools and 12 updates from the July 2012 Web Server issue of Nucleic Acids Research, the Bioinformatics Links Directory at http://bioinformatics.ca/links_directory/ now contains an impressive 134 resources, 455 databases and 1205 web server tools, mirroring the continued activity and efforts of our field.
BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine.
Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu
2016-02-16
Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide, and TCM-based new drug development, especially for the treatment of complex diseases, is promising. However, owing to TCM's diverse ingredients and their complex interaction with the human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the modernization and internationalization of TCM. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients' target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined it with subsequent experimental validation to reveal, for the first time, the functions of the renin-angiotensin system responsible for QSYQ's cardioprotective effects. BATMAN-TCM will contribute to the understanding of the "multi-component, multi-target and multi-pathway" combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM's molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.
Computational Studies of Snake Venom Toxins
Ojeda, Paola G.; Caballero, Julio; Kaas, Quentin; González, Wendy
2017-01-01
Most snake venom toxins are proteins, and participate in envenomation through a diverse array of bioactivities, such as bleeding, inflammation, and pain, as well as cytotoxic, cardiotoxic or neurotoxic effects. The venom of a single snake species contains hundreds of toxins, and the venoms of the 725 species of venomous snakes represent a large pool of potentially bioactive proteins. Despite considerable discovery efforts, most of the snake venom toxins are still uncharacterized. Modern bioinformatics tools have been recently developed to mine snake venoms, helping focus experimental research on the most potentially interesting toxins. Some computational techniques predict toxin molecular targets, and the binding mode to these targets. This review gives an overview of current knowledge on the ~2200 sequences, and more than 400 three-dimensional structures of snake toxins deposited in public repositories, as well as of molecular modeling studies of the interaction between these toxins and their molecular targets. We also describe how modern bioinformatics has been used to study the snake venom protein phospholipase A2, the small basic myotoxin Crotamine, and the three-finger peptide Mambalgin. PMID:29271884
Antimicrobial resistance surveillance in the genomic age.
McArthur, Andrew G; Tsang, Kara K
2017-01-01
The loss of effective antimicrobials is reducing our ability to protect the global population from infectious disease. However, the field of antibiotic drug discovery and the public health monitoring of antimicrobial resistance (AMR) are beginning to exploit the power of genome and metagenome sequencing. The creation of novel AMR bioinformatics tools and databases and their continued development will advance our understanding of the molecular mechanisms and threat severity of antibiotic resistance, while simultaneously improving our ability to accurately predict and screen for antibiotic resistance genes within environmental, agricultural, and clinical settings. To do so, efforts must be focused toward exploiting the advancements of genome sequencing and information technology. Currently, AMR bioinformatics software and databases reflect different scopes and functions, each with its own strengths and weaknesses. A review of the available tools reveals common approaches and reference data but also reveals gaps in our curated data, models, algorithms, and data-sharing tools that must be addressed to conquer the limitations and areas of unmet need within the AMR research field before DNA sequencing can be fully exploited for AMR surveillance and improved clinical outcomes. © 2016 New York Academy of Sciences.
ExPASy: SIB bioinformatics resource portal.
Artimo, Panu; Jonnalagedda, Manohar; Arnold, Konstantin; Baratin, Delphine; Csardi, Gabor; de Castro, Edouard; Duvaud, Séverine; Flegel, Volker; Fortier, Arnaud; Gasteiger, Elisabeth; Grosdidier, Aurélien; Hernandez, Céline; Ioannidis, Vassilios; Kuznetsov, Dmitry; Liechti, Robin; Moretti, Sébastien; Mostaguir, Khaled; Redaschi, Nicole; Rossier, Grégoire; Xenarios, Ioannis; Stockinger, Heinz
2012-07-01
ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth seamlessly access a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a 'decentralized' way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across 'selected' resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy.
RImmPort: an R/Bioconductor package that enables ready-for-analysis immunology research data.
Shankar, Ravi D; Bhattacharya, Sanchita; Jujjavarapu, Chethan; Andorf, Sandra; Wiser, Jeffery A; Butte, Atul J
2017-04-01
Open access to raw clinical and molecular data related to immunological studies has created a tremendous opportunity for data-driven science. We have developed RImmPort, which prepares NIAID-funded research study datasets in ImmPort (immport.org) for analysis in R. RImmPort comprises three main components: (i) a specification of R classes that encapsulate study data, (ii) foundational methods to load data of a specific study and (iii) generic methods to slice and dice data across different dimensions in one or more studies. Furthermore, RImmPort supports open formalisms, such as CDISC standards on the open source bioinformatics platform Bioconductor, to ensure that ImmPort curated study datasets are seamlessly accessible and ready for analysis, thus enabling innovative bioinformatics research in immunology. RImmPort is available as part of Bioconductor (bioconductor.org/packages/RImmPort). Contact: rshankar@stanford.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
2015-01-01
Contextual data collected concurrently with molecular samples are critical to the use of metagenomics in the fields of marine biodiversity, bioinformatics and biotechnology. We present here Marine Microbial Biodiversity, Bioinformatics and Biotechnology (M2B3) standards for “Reporting” and “Serving” data. The M2B3 Reporting Standard (1) describes minimal mandatory and recommended contextual information for a marine microbial sample obtained in the epipelagic zone, (2) includes meaningful information for researchers in the oceanographic, biodiversity and molecular disciplines, and (3) can easily be adopted by any marine laboratory with minimum sampling resources. The M2B3 Service Standard defines a software interface through which these data can be discovered and explored in data repositories. The M2B3 Standards were developed by the European project Micro B3, funded under 7th Framework Programme “Ocean of Tomorrow”, and were first used with the Ocean Sampling Day initiative. We believe that these standards have value in broader marine science. PMID:26203332
BioPig: a Hadoop-based analytic toolkit for large-scale sequence data.
Nordberg, Henrik; Bhatia, Karan; Wang, Kai; Wang, Zhong
2013-12-01
The recent revolution in sequencing technologies has led to an exponential growth of sequence data. As a result, most current bioinformatics tools become obsolete as they fail to scale with data. To tackle this 'data deluge', here we introduce the BioPig sequence analysis toolkit as one of the solutions that scale to data and computation. We built BioPig on the Apache Hadoop MapReduce system and the Pig data flow language. Compared with traditional serial and MPI-based algorithms, BioPig has three major advantages: first, BioPig's programmability greatly reduces development time for parallel bioinformatics applications; second, testing BioPig with up to 500 Gb sequences demonstrates that it scales automatically with size of data; and finally, BioPig can be ported without modification to many Hadoop infrastructures, as tested with the Magellan system at the National Energy Research Scientific Computing Center and the Amazon Elastic Compute Cloud. In summary, BioPig represents a novel program framework with the potential to greatly accelerate data-intensive bioinformatics analysis.
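The MapReduce pattern that such toolkits build on can be sketched in plain Python. The example below is an illustration only: it does not use Hadoop, Pig or any BioPig function, k-mer counting is chosen here simply as a familiar sequence-analysis task, and the names `map_kmers`, `reduce_counts` and `count_kmers` are hypothetical.

```python
from collections import Counter
from functools import reduce
from typing import Iterable


def map_kmers(read: str, k: int) -> Counter:
    """Map step: count the k-mers of one read, independently of all other reads."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial results by summing per-k-mer totals."""
    return left + right


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Run the map step on every read, then fold the partial counts together."""
    partials = (map_kmers(read, k) for read in reads)
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    totals = count_kmers(["ACGT", "CGTA"], k=2)
    for kmer, n in sorted(totals.items()):
        print(kmer, n)
```

Because each map call touches only one read and the reduce step is associative, the reads can in principle be split across workers in any way; in this sketch everything runs serially in one process.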
Prospects and limitations of full-text index structures in genome analysis
Vyverman, Michaël; De Baets, Bernard; Fack, Veerle; Dawyndt, Peter
2012-01-01
The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article brings a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared. PMID:22584621
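As a concrete instance of a full-text index structure, the following minimal Python sketch builds a suffix array and uses binary search to locate a pattern. It is a teaching example under simplifying assumptions: the construction is the naive sort of all suffixes, which is far too slow and memory-hungry for genome-sized inputs, and the function names are chosen for this illustration only.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of text in lexicographic order."""
    # Naive construction by sorting suffixes; practical tools use specialised algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """Return every position where pattern occurs, via two binary searches on sa."""
    m = len(pattern)
    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


if __name__ == "__main__":
    genome = "GATTACATTA"
    index = build_suffix_array(genome)
    print(find_occurrences(genome, index, "TTA"))
```

The memory-time trade-off is visible even here: the index stores one integer per text position, and in exchange each query needs only a logarithmic number of string comparisons instead of a scan of the whole text.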
Gelbart, Hadas; Ben-Dor, Shifra; Yarden, Anat
2017-01-01
Despite the central place held by bioinformatics in modern life sciences and related areas, it has only recently been integrated to a limited extent into high-school teaching and learning programs. Here we describe the assessment of a learning environment entitled ‘Bioinformatics in the Service of Biotechnology’. Students’ learning outcomes and attitudes toward the bioinformatics learning environment were measured by analyzing their answers to questions embedded within the activities, questionnaires, interviews and observations. Students’ difficulties and knowledge acquisition were characterized based on four categories: the required domain-specific knowledge (declarative, procedural, strategic or situational), the scientific field that each question stems from (biology, bioinformatics or their combination), the associated cognitive-process dimension (remember, understand, apply, analyze, evaluate, create) and the type of question (open-ended or multiple choice). Analysis of students’ cognitive outcomes revealed learning gains in bioinformatics and related scientific fields, as well as appropriation of the bioinformatics approach as part of the students’ scientific ‘toolbox’. For students, questions stemming from the ‘old world’ biology field and requiring declarative or strategic knowledge were harder to deal with. This stands in contrast to their teachers’ prediction. Analysis of students’ affective outcomes revealed positive attitudes toward bioinformatics and the learning environment, as well as their perception of the teacher’s role. Insights from this analysis yielded implications and recommendations for curriculum design, classroom enactment, teacher education and research. For example, we recommend teaching bioinformatics in an integrative and comprehensive manner, through an inquiry process, and linking it to the wider science curriculum. PMID:26801769
Ladics, Gregory S; Cressman, Robert F; Herouet-Guicheney, Corinne; Herman, Rod A; Privalle, Laura; Song, Ping; Ward, Jason M; McClain, Scott
2011-06-01
Bioinformatic tools are being increasingly utilized to evaluate the degree of similarity between a novel protein and known allergens within the context of a larger allergy safety assessment process. Importantly, bioinformatics is not a predictive analysis that can determine if a novel protein will "become" an allergen, but rather a tool to assess whether the protein is a known allergen or is potentially cross-reactive with an existing allergen. Bioinformatic tools are key components of the 2009 Codex Alimentarius Commission's weight-of-evidence approach, which encompasses a variety of experimental approaches for an overall assessment of the allergenic potential of a novel protein. Bioinformatic search comparisons between novel protein sequences, as well as potential novel fusion sequences derived from the genome and transgene, and known allergens are required by all regulatory agencies that assess the safety of genetically modified (GM) products. The objective of this paper is to identify opportunities for consensus in the methods of applying bioinformatics and to outline differences that impact a consistent and reliable allergy safety assessment. The bioinformatic comparison process has some critical features, which are outlined in this paper. One of them is a curated, publicly available and well-managed database with known allergenic sequences. In this paper, the best practices, scientific value, and food safety implications of bioinformatic analyses, as they are applied to GM food crops, are discussed. Recommendations for conducting bioinformatic analysis on novel food proteins for potential cross-reactivity to known allergens are also put forth. Copyright © 2011 Elsevier Inc. All rights reserved.
Machluf, Yossy; Gelbart, Hadas; Ben-Dor, Shifra; Yarden, Anat
2017-01-01
Despite the central place held by bioinformatics in modern life sciences and related areas, it has only recently been integrated to a limited extent into high-school teaching and learning programs. Here we describe the assessment of a learning environment entitled 'Bioinformatics in the Service of Biotechnology'. Students' learning outcomes and attitudes toward the bioinformatics learning environment were measured by analyzing their answers to questions embedded within the activities, questionnaires, interviews and observations. Students' difficulties and knowledge acquisition were characterized based on four categories: the required domain-specific knowledge (declarative, procedural, strategic or situational), the scientific field that each question stems from (biology, bioinformatics or their combination), the associated cognitive-process dimension (remember, understand, apply, analyze, evaluate, create) and the type of question (open-ended or multiple choice). Analysis of students' cognitive outcomes revealed learning gains in bioinformatics and related scientific fields, as well as appropriation of the bioinformatics approach as part of the students' scientific 'toolbox'. For students, questions stemming from the 'old world' biology field and requiring declarative or strategic knowledge were harder to deal with. This stands in contrast to their teachers' prediction. Analysis of students' affective outcomes revealed positive attitudes toward bioinformatics and the learning environment, as well as their perception of the teacher's role. Insights from this analysis yielded implications and recommendations for curriculum design, classroom enactment, teacher education and research. For example, we recommend teaching bioinformatics in an integrative and comprehensive manner, through an inquiry process, and linking it to the wider science curriculum. © The Author 2016. Published by Oxford University Press.
Oulas, Anastasis; Minadakis, George; Zachariou, Margarita; Sokratous, Kleitos; Bourdakou, Marilena M; Spyrou, George M
2017-11-27
Systems Bioinformatics is a relatively new approach, which lies in the intersection of systems biology and classical bioinformatics. It focuses on integrating information across different levels using a bottom-up approach as in systems biology with a data-driven top-down approach as in bioinformatics. The advent of omics technologies has provided the stepping-stone for the emergence of Systems Bioinformatics. These technologies provide a spectrum of information ranging from genomics, transcriptomics and proteomics to epigenomics, pharmacogenomics, metagenomics and metabolomics. Systems Bioinformatics is the framework in which systems approaches are applied to such data, setting the level of resolution as well as the boundary of the system of interest and studying the emerging properties of the system as a whole rather than the sum of the properties derived from the system's individual components. A key approach in Systems Bioinformatics is the construction of multiple networks representing each level of the omics spectrum and their integration in a layered network that exchanges information within and between layers. Here, we provide evidence on how Systems Bioinformatics enhances computational therapeutics and diagnostics, hence paving the way to precision medicine. The aim of this review is to familiarize the reader with the emerging field of Systems Bioinformatics and to provide a comprehensive overview of its current state-of-the-art methods and technologies. Moreover, we provide examples of success stories and case studies that utilize such methods and tools to significantly advance research in the fields of systems biology and systems medicine. © The Author 2017. Published by Oxford University Press.
Zemojtel, Tomasz; Köhler, Sebastian; Mackenroth, Luisa; Jäger, Marten; Hecht, Jochen; Krawitz, Peter; Graul-Neumann, Luitgard; Doelken, Sandra; Ehmke, Nadja; Spielmann, Malte; Øien, Nancy Christine; Schweiger, Michal R.; Krüger, Ulrike; Frommer, Götz; Fischer, Björn; Kornak, Uwe; Flöttmann, Ricarda; Ardeshirdavani, Amin; Moreau, Yves; Lewis, Suzanna E.; Haendel, Melissa; Smedley, Damian; Horn, Denise; Mundlos, Stefan; Robinson, Peter N.
2015-01-01
Less than half of patients with suspected genetic disease receive a molecular diagnosis. We have therefore integrated next-generation sequencing (NGS), bioinformatics, and clinical data into an effective diagnostic workflow. We used variants in the 2741 established Mendelian disease genes [the disease-associated genome (DAG)] to develop a targeted enrichment DAG panel (7.1 Mb), which achieves a coverage of 20-fold or better for 98% of bases. Furthermore, we established a computational method [Phenotypic Interpretation of eXomes (PhenIX)] that evaluated and ranked variants based on pathogenicity and semantic similarity of patients’ phenotype described by Human Phenotype Ontology (HPO) terms to those of 3991 Mendelian diseases. In computer simulations, ranking genes based on the variant score put the true gene in first place less than 5% of the time; PhenIX placed the correct gene in first place more than 86% of the time. In a retrospective test of PhenIX on 52 patients with previously identified mutations and known diagnoses, the correct gene achieved a mean rank of 2.1. In a prospective study on 40 individuals without a diagnosis, PhenIX analysis enabled a diagnosis in 11 cases (28%, at a mean rank of 2.4). Thus, the NGS of the DAG followed by phenotype-driven bioinformatic analysis allows quick and effective differential diagnostics in medical genetics. PMID:25186178
Zemojtel, Tomasz; Köhler, Sebastian; Mackenroth, Luisa; Jäger, Marten; Hecht, Jochen; Krawitz, Peter; Graul-Neumann, Luitgard; Doelken, Sandra; Ehmke, Nadja; Spielmann, Malte; Oien, Nancy Christine; Schweiger, Michal R; Krüger, Ulrike; Frommer, Götz; Fischer, Björn; Kornak, Uwe; Flöttmann, Ricarda; Ardeshirdavani, Amin; Moreau, Yves; Lewis, Suzanna E; Haendel, Melissa; Smedley, Damian; Horn, Denise; Mundlos, Stefan; Robinson, Peter N
2014-09-03
Less than half of patients with suspected genetic disease receive a molecular diagnosis. We have therefore integrated next-generation sequencing (NGS), bioinformatics, and clinical data into an effective diagnostic workflow. We used variants in the 2741 established Mendelian disease genes [the disease-associated genome (DAG)] to develop a targeted enrichment DAG panel (7.1 Mb), which achieves a coverage of 20-fold or better for 98% of bases. Furthermore, we established a computational method [Phenotypic Interpretation of eXomes (PhenIX)] that evaluated and ranked variants based on pathogenicity and semantic similarity of patients' phenotype described by Human Phenotype Ontology (HPO) terms to those of 3991 Mendelian diseases. In computer simulations, ranking genes based on the variant score put the true gene in first place less than 5% of the time; PhenIX placed the correct gene in first place more than 86% of the time. In a retrospective test of PhenIX on 52 patients with previously identified mutations and known diagnoses, the correct gene achieved a mean rank of 2.1. In a prospective study on 40 individuals without a diagnosis, PhenIX analysis enabled a diagnosis in 11 cases (28%, at a mean rank of 2.4). Thus, the NGS of the DAG followed by phenotype-driven bioinformatic analysis allows quick and effective differential diagnostics in medical genetics. Copyright © 2014, American Association for the Advancement of Science.
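The abstract above says variants are ranked on pathogenicity together with phenotype similarity, but it does not say how the two scores are combined. The following is only a minimal sketch under an assumed rule (a plain average of two scores in [0, 1]); the function names `rank_genes` and `mean_rank`, the gene labels and the scores are all hypothetical and are not taken from PhenIX.

```python
def rank_genes(variant_score, phenotype_score):
    """Rank genes by an ASSUMED combined score: the mean of a variant
    pathogenicity score and a phenotype-similarity score (both in [0, 1]).
    The real PhenIX combination rule is not given in the abstract."""
    genes = variant_score.keys() & phenotype_score.keys()
    combined = {g: (variant_score[g] + phenotype_score[g]) / 2 for g in genes}
    # Highest combined score first; gene name breaks ties deterministically.
    return sorted(combined, key=lambda g: (-combined[g], g))


def mean_rank(rankings, true_genes):
    """Mean 1-based rank of the known causal gene across patients."""
    ranks = [ranking.index(g) + 1 for ranking, g in zip(rankings, true_genes)]
    return sum(ranks) / len(ranks)


# Hypothetical patient: G2 has the most damaging variant, but G1 matches
# the phenotype far better and therefore ranks first.
variant = {"G1": 0.9, "G2": 0.95, "G3": 0.2}
phenotype = {"G1": 0.8, "G2": 0.1, "G3": 0.3}
print(rank_genes(variant, phenotype))
```

A variant-only ranking would put G2 first here, which illustrates why adding phenotype information can move the true gene up the list.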
Classification of mislabelled microarrays using robust sparse logistic regression.
Bootkrajang, Jakramate; Kabán, Ata
2013-04-01
Previous studies reported that labelling errors are not uncommon in microarray datasets. In such cases, the training set may become misleading, and the ability of classifiers to make reliable inferences from the data is compromised. Yet, few methods are currently available in the bioinformatics literature to deal with this problem. The few existing methods focus on data cleansing alone, without reference to classification, and their performance crucially depends on some tuning parameters. In this article, we develop a new method to detect mislabelled arrays simultaneously with learning a sparse logistic regression classifier. Our method may be seen as a label-noise robust extension of the well-known and successful Bayesian logistic regression classifier. To account for possible mislabelling, we formulate a label-flipping process as part of the classifier. The regularization parameter is automatically set using Bayesian regularization, which not only saves the computation time that cross-validation would take, but also eliminates any unwanted effects of label noise when setting the regularization parameter. Extensive experiments with both synthetic data and real microarray datasets demonstrate that our approach counters the adverse effects of labelling errors on predictive performance, is effective at identifying marker genes, and simultaneously detects mislabelled arrays with high accuracy. The code is available from http://cs.bham.ac.uk/~jxb008. Supplementary data are available at Bioinformatics online.
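To make the idea of a label-flipping process concrete, here is a minimal sketch of how flip probabilities can enter a logistic model. It is an illustration under assumptions, not the authors' code: the names `gamma01` and `gamma10` (probability that a true 0 is recorded as 1, and that a true 1 is recorded as 0) are hypothetical, and the sparsity prior and Bayesian regularization described in the abstract are left out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def noisy_label_prob(w, x, gamma01, gamma10):
    """P(observed label = 1 | x) when true labels may be flipped.

    gamma01: assumed probability that a true 0 is recorded as 1.
    gamma10: assumed probability that a true 1 is recorded as 0.
    """
    p_true = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return (1.0 - gamma10) * p_true + gamma01 * (1.0 - p_true)


def posterior_true_label(w, x, y_obs, gamma01, gamma10):
    """P(true label = 1 | x, observed label), by Bayes' rule."""
    p_true = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    if y_obs == 1:
        num = (1.0 - gamma10) * p_true
        den = num + gamma01 * (1.0 - p_true)
    else:
        num = gamma10 * p_true
        den = num + (1.0 - gamma01) * (1.0 - p_true)
    return num / den


# An array recorded as class 0 that the classifier places firmly in class 1
# gets a high posterior of truly being class 1, flagging it as suspect.
print(posterior_true_label([4.0], [1.0], y_obs=0, gamma01=0.1, gamma10=0.2))
```

With both flip probabilities set to zero the model reduces to ordinary logistic regression, which is why it can be read as a robust extension rather than a different classifier.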
Interdisciplinary Introductory Course in Bioinformatics
ERIC Educational Resources Information Center
Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.
2010-01-01
Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…
Survey of Natural Language Processing Techniques in Bioinformatics.
Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling
2015-01-01
Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.
Atlas - a data warehouse for integrative bioinformatics.
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire M S; Ling, John; Ouellette, B F Francis
2005-02-21
We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. 
First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: http://bioinformatics.ubc.ca/atlas/
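As a rough illustration of what integrating sources through relational models and SQL can look like, the sketch below joins a gene-disease table with an interaction table. Everything in it is an assumption made for the example: the three-table schema, the table and column names, and the toy rows are hypothetical, and it uses Python's built-in `sqlite3` module rather than the C++, Java or Perl APIs that Atlas provides.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny in-memory database with a HYPOTHETICAL schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, symbol TEXT NOT NULL);
        CREATE TABLE interaction (
            gene_a INTEGER REFERENCES gene(gene_id),
            gene_b INTEGER REFERENCES gene(gene_id),
            source TEXT
        );
        CREATE TABLE disease_assoc (
            gene_id INTEGER REFERENCES gene(gene_id),
            disease TEXT
        );
        """
    )
    conn.executemany("INSERT INTO gene VALUES (?, ?)",
                     [(1, "GENE_A"), (2, "GENE_B"), (3, "GENE_C")])
    conn.executemany("INSERT INTO interaction VALUES (?, ?, ?)",
                     [(1, 2, "src1"), (2, 3, "src2")])
    conn.executemany("INSERT INTO disease_assoc VALUES (?, ?)",
                     [(1, "disease_X")])
    conn.commit()
    return conn


def partners_of_disease_genes(conn, disease):
    """Symbols of genes that interact with any gene linked to `disease`."""
    rows = conn.execute(
        """
        SELECT DISTINCT g.symbol
        FROM disease_assoc d
        JOIN interaction i
          ON i.gene_a = d.gene_id OR i.gene_b = d.gene_id
        JOIN gene g
          ON g.gene_id = CASE WHEN i.gene_a = d.gene_id
                              THEN i.gene_b ELSE i.gene_a END
        WHERE d.disease = ?
        ORDER BY g.symbol
        """,
        (disease,),
    ).fetchall()
    return [row[0] for row in rows]


print(partners_of_disease_genes(build_demo_warehouse(), "disease_X"))
```

The point of the sketch is the shape of the query: once interaction and disease data share a common gene identifier, a cross-source question becomes a single join.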
Chapter 16: text mining for translational bioinformatics.
Cohen, K Bretonnel; Hunter, Lawrence E
2013-04-01
Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.
Robust High-dimensional Bioinformatics Data Streams Mining by ODR-ioVFDT
Wang, Dantong; Fong, Simon; Wong, Raymond K.; Mohammed, Sabah; Fiaidhi, Jinan; Wong, Kelvin K. L.
2017-01-01
Outlier detection in bioinformatics data stream mining has received significant attention from research communities in recent years. The problems of how to distinguish noise from a genuine exception, and of deciding whether to discard it or to devise an extra decision path to accommodate it, pose a dilemma. In this paper, we propose a novel algorithm called ODR with incrementally Optimized Very Fast Decision Tree (ODR-ioVFDT) for taking care of outliers in the course of continuous learning from data. By using an adaptive interquartile-range based identification method, a tolerance threshold is set. It is then used to judge whether a data point with an exceptional value should be included in training. This differs from traditional outlier detection/removal approaches, which run as two separate steps in processing the data. The proposed algorithm is tested on datasets from five bioinformatics scenarios, comparing the performance of our model with that of other models without ODR. The results show that ODR-ioVFDT has better performance in classification accuracy, kappa statistics, and time consumption. Applied to bioinformatics streaming data processing for detecting and quantifying information about the life phenomena, states, characters, variables and components of an organism, ODR-ioVFDT can help to diagnose and treat disease more effectively. PMID:28230161
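The interquartile-range gate described above can be sketched in a few lines. This is a simplified illustration, not the ODR-ioVFDT algorithm: the class name `IQRGate`, the sliding-window size, the multiplier `k = 1.5` and the warm-up length are all assumptions, and the decision-tree learner that would consume the accepted values is omitted.

```python
from collections import deque
import statistics


class IQRGate:
    """Decide, value by value, whether a stream item should reach the learner.

    The tolerance band [Q1 - k*IQR, Q3 + k*IQR] is recomputed from a sliding
    window of previously accepted values, so it adapts as the stream evolves.
    Window size, k and warm-up length are ASSUMED defaults for illustration.
    """

    def __init__(self, window=50, k=1.5, warmup=8):
        self.values = deque(maxlen=window)
        self.k = k
        self.warmup = warmup

    def accept(self, x):
        ok = True
        if len(self.values) >= self.warmup:
            q1, _, q3 = statistics.quantiles(self.values, n=4)
            tol = self.k * (q3 - q1)
            ok = (q1 - tol) <= x <= (q3 + tol)
        if ok:
            # Only accepted values update the window and would be trained on.
            self.values.append(x)
        return ok


gate = IQRGate()
stream = list(range(1, 11)) + [1000, 5]
print([gate.accept(v) for v in stream])
```

Because rejected values never enter the window, a single extreme reading cannot widen the band; the trade-off is that a genuine abrupt shift in the data would also be rejected until the design is extended to handle it.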
Precision medicine needs pioneering clinical bioinformaticians.
Gómez-López, Gonzalo; Dopazo, Joaquín; Cigudosa, Juan C; Valencia, Alfonso; Al-Shahrour, Fátima
2017-10-25
Success in precision medicine depends on accessing high-quality genetic and molecular data from large, well-annotated patient cohorts that couple biological samples to comprehensive clinical data, which in conjunction can lead to effective therapies. From such a scenario emerges the need for a new professional profile, an expert bioinformatician with training in clinical areas who can make sense of multi-omics data to improve therapeutic interventions in patients, and the design of optimized basket trials. In this review, we first describe the main policies and international initiatives that focus on precision medicine. Secondly, we review the currently ongoing clinical trials in precision medicine, introducing the concept of 'precision bioinformatics', and we describe current pioneering bioinformatics efforts aimed at implementing tools and computational infrastructures for precision medicine in health institutions around the world. Thirdly, we discuss the challenges related to the clinical training of bioinformaticians, and the urgent need for computational specialists capable of assimilating medical terminologies and protocols to address real clinical questions. We also propose some skills required to carry out common tasks in clinical bioinformatics and some tips for emergent groups. Finally, we explore the future perspectives and the challenges faced by precision medicine bioinformatics. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Can all heritable biology really be reduced to a single dimension?
Babbitt, Gregory A; Coppola, Erin E; Alawad, Mohammed A; Hudson, André O
2016-03-10
A long-held presupposition in the field of bioinformatics holds that genetic, and now even epigenetic 'information' can be abstracted from the physicochemical details of the macromolecular polymers in which it resides. It is perhaps rather ironic that this basic conjecture originated upon the first observations of DNA structure itself. This static model of DNA led very quickly to the conclusion that only the nucleobase sequence itself is rich enough in molecular complexity to replicate a complex biology. This idea has been pervasive throughout genomic science, higher education and popular culture ever since; to the point that most of us would accept it unquestioningly as fact. What is more alarming is that this conjecture is driving a significant portion of the technological development in modern genomics towards methods strongly rooted in DNA sequencing, thereby reducing a dynamic multi-dimensional biology into single-dimensional forms of data. Evidence countering this central tenet of bioinformatics has been quietly mounting over many decades, prompting some to propose that the genome must be studied from the perspective of its molecular reality, rather than as a body of information to be represented symbolically. Here, we explore the epistemological boundary between bioinformatics and molecular biology, and warn against an 'overtly' bioinformatic perspective. We review a selection of new bioinformatic methods that move beyond sequence-based approaches to include consideration of databased three dimensional structures. However, we also note that these hybrid methods still ignore the most important element of gene function when attempting to improve outcomes; the fourth dimension of molecular dynamics over time. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
ballaxy: web services for structural bioinformatics.
Hildebrandt, Anna Katharina; Stöckel, Daniel; Fischer, Nina M; de la Garza, Luis; Krüger, Jens; Nickels, Stefan; Röttig, Marc; Schärfe, Charlotta; Schumann, Marcel; Thiel, Philipp; Lenhof, Hans-Peter; Kohlbacher, Oliver; Hildebrandt, Andreas
2015-01-01
Web-based workflow systems have gained considerable momentum in sequence-oriented bioinformatics. In structural bioinformatics, however, such systems are still relatively rare; while commercial stand-alone workflow applications are common in the pharmaceutical industry, academic researchers often still rely on command-line scripting to glue individual tools together. In this work, we address the problem of building a web-based system for workflows in structural bioinformatics. For the underlying molecular modelling engine, we opted for the BALL framework because of its extensive and well-tested functionality in the field of structural bioinformatics. The large number of molecular data structures and algorithms implemented in BALL allows for elegant and sophisticated development of new approaches in the field. We hence connected the versatile BALL library and its visualization and editing front end BALLView with the Galaxy workflow framework. The result, which we call ballaxy, enables the user to simply and intuitively create sophisticated pipelines for applications in structure-based computational biology, integrated into a standard tool for molecular modelling. ballaxy consists of three parts: some minor modifications to the Galaxy system, a collection of tools and an integration into the BALL framework and the BALLView application for molecular modelling. Modifications to Galaxy will be submitted to the Galaxy project, and the BALL and BALLView integrations will be integrated in the next major BALL release. After acceptance of the modifications into the Galaxy project, we will publish all ballaxy tools via the Galaxy toolshed. In the meantime, all three components are available from http://www.ball-project.org/ballaxy. Also, docker images for ballaxy are available at https://registry.hub.docker.com/u/anhi/ballaxy/dockerfile/. ballaxy is licensed under the terms of the GPL. © The Author 2014. Published by Oxford University Press. All rights reserved. 
For Permissions, please e-mail: journals.permissions@oup.com.
[Application of bioinformatics in researches of industrial biocatalysis].
Yu, Hui-Min; Luo, Hui; Shi, Yue; Sun, Xu-Dong; Shen, Zhong-Yao
2004-05-01
Industrial biocatalysis is currently attracting much attention as a way to rebuild or replace traditional production processes for chemicals and drugs. One key focus in industrial biocatalysis is the biocatalyst, which is usually a microbial enzyme. Recently, new bioinformatics technologies have played, and will continue to play, increasingly significant roles in industrial biocatalysis research in response to the waves of the genomic revolution. One key application of bioinformatics in biocatalysis is the discovery and identification of new biocatalysts through advanced DNA and protein sequence search, comparison and analysis in Internet databases using different algorithms and software. Unknown genes of microbial enzymes can also be readily obtained by primer design based on bioinformatics analyses. The other key application of bioinformatics in biocatalysis is the modification and improvement of existing industrial biocatalysts. Here, bioinformatics is of great importance in both rational design and directed evolution of microbial enzymes. When built on successful bioinformatic prediction of enzyme tertiary structures, subsequent experiments such as site-directed mutagenesis, fusion protein construction, DNA family shuffling and saturation mutagenesis are usually highly efficient. Overall, bioinformatics will be an essential tool for both biologists and biological engineers in future industrial biocatalysis research, owing to its significant role in guiding and accelerating the discovery and/or improvement of novel biocatalysts.
Thiel, William H.; Bair, Thomas; Peek, Andrew S.; Liu, Xiuying; Dassie, Justin; Stockdale, Katie R.; Behlke, Mark A.; Miller, Francis J.; Giangrande, Paloma H.
2012-01-01
Background: The broad applicability of RNA aptamers as cell-specific delivery tools for therapeutic reagents depends on the ability to identify aptamer sequences that selectively access the cytoplasm of distinct cell types. Towards this end, we have developed a novel approach that combines a cell-based selection method (cell-internalization SELEX) with high-throughput sequencing (HTS) and bioinformatics analyses to rapidly identify cell-specific, internalization-competent RNA aptamers. Methodology/Principal Findings: We demonstrate the utility of this approach by enriching for RNA aptamers capable of selective internalization into vascular smooth muscle cells (VSMCs). Several rounds of positive (VSMCs) and negative (endothelial cells; ECs) selection were performed to enrich for aptamer sequences that preferentially internalize into VSMCs. To identify candidate RNA aptamer sequences, HTS data from each round of selection were analyzed using bioinformatics methods: (1) metrics of selection enrichment; and (2) pairwise comparisons of sequence and structural similarity, termed edit and tree distance, respectively. Correlation analyses of experimentally validated aptamers or rounds revealed that the best cell-specific, internalizing aptamers are enriched as a result of the negative selection step performed against ECs. Conclusions and Significance: We describe a novel approach that combines cell-internalization SELEX with HTS and bioinformatics analysis to identify cell-specific, cell-internalizing RNA aptamers. Our data highlight the importance of performing a pre-clear step against a non-target cell in order to select for cell-specific aptamers. We expect the extended use of this approach to enable the identification of aptamers to a multitude of different cell types, thereby facilitating the broad development of targeted cell therapies. PMID:22962591
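The pairwise sequence comparison mentioned above can be illustrated with the classic Levenshtein edit distance. This is a generic sketch: the abstract does not specify which edit-distance variant or implementation the authors used, the example sequences are made up, and the tree distance used for structural similarity is not covered.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical aptamer fragments differing by one substitution.
print(edit_distance("GGAUCC", "GGACCC"))
```

Computing this distance for every pair of sequenced aptamers yields a matrix that can be clustered to group closely related sequences, which is one plausible use of the metric in this setting.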
Douville, Christopher; Masica, David L.; Stenson, Peter D.; Cooper, David N.; Gygax, Derek M.; Kim, Rick; Ryan, Michael
2015-01-01
ABSTRACT Insertion/deletion variants (indels) alter protein sequence and length, yet are highly prevalent in healthy populations, presenting a challenge to bioinformatics classifiers. Commonly used features—DNA and protein sequence conservation, indel length, and occurrence in repeat regions—are useful for inference of protein damage. However, these features can cause false positives when predicting the impact of indels on disease. Existing methods for indel classification suffer from low specificities, severely limiting clinical utility. Here, we further develop our variant effect scoring tool (VEST) to include the classification of in‐frame and frameshift indels (VEST‐indel) as pathogenic or benign. We apply 24 features, including a new “PubMed” feature, to estimate a gene's importance in human disease. When compared with four existing indel classifiers, our method achieves a drastically reduced false‐positive rate, improving specificity by as much as 90%. This approach of estimating gene importance might be generally applicable to missense and other bioinformatics pathogenicity predictors, which often fail to achieve high specificity. Finally, we tested all possible meta‐predictors that can be obtained from combining the four different indel classifiers using Boolean conjunctions and disjunctions, and derived a meta‐predictor with improved performance over any individual method. PMID:26442818
Douville, Christopher; Masica, David L; Stenson, Peter D; Cooper, David N; Gygax, Derek M; Kim, Rick; Ryan, Michael; Karchin, Rachel
2016-01-01
Insertion/deletion variants (indels) alter protein sequence and length, yet are highly prevalent in healthy populations, presenting a challenge to bioinformatics classifiers. Commonly used features--DNA and protein sequence conservation, indel length, and occurrence in repeat regions--are useful for inference of protein damage. However, these features can cause false positives when predicting the impact of indels on disease. Existing methods for indel classification suffer from low specificities, severely limiting clinical utility. Here, we further develop our variant effect scoring tool (VEST) to include the classification of in-frame and frameshift indels (VEST-indel) as pathogenic or benign. We apply 24 features, including a new "PubMed" feature, to estimate a gene's importance in human disease. When compared with four existing indel classifiers, our method achieves a drastically reduced false-positive rate, improving specificity by as much as 90%. This approach of estimating gene importance might be generally applicable to missense and other bioinformatics pathogenicity predictors, which often fail to achieve high specificity. Finally, we tested all possible meta-predictors that can be obtained from combining the four different indel classifiers using Boolean conjunctions and disjunctions, and derived a meta-predictor with improved performance over any individual method. © 2015 The Authors. Human Mutation published by Wiley Periodicals, Inc.
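The last step, combining classifiers with Boolean conjunctions and disjunctions, can be sketched as follows. This is a simplified illustration under assumptions: it enumerates only pure AND and pure OR combinations over subsets of classifiers (not arbitrarily nested expressions), and the classifier names and calls are invented for the example.

```python
from itertools import combinations


def specificity(pred, truth):
    """True-negative rate: TN / (TN + FP), with True meaning 'pathogenic'."""
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    return tn / (tn + fp) if (tn + fp) else float("nan")


def boolean_meta_predictors(calls):
    """Yield (label, combined_calls) for every AND and OR combination of
    two or more classifiers. `calls` maps classifier name -> list of bools."""
    names = sorted(calls)
    for r in range(2, len(names) + 1):
        for subset in combinations(names, r):
            per_variant = list(zip(*(calls[n] for n in subset)))
            yield " AND ".join(subset), [all(c) for c in per_variant]
            yield " OR ".join(subset), [any(c) for c in per_variant]


# Hypothetical calls from two classifiers on four variants; only the first
# variant is truly pathogenic.
calls = {"A": [True, True, False, False], "B": [True, False, True, False]}
truth = [True, False, False, False]
for label, combined in boolean_meta_predictors(calls):
    print(label, specificity(combined, truth))
```

Conjunctions can only remove positive calls, so they tend to raise specificity at the cost of sensitivity, while disjunctions do the opposite; scanning all combinations is a direct way to find the best trade-off on a labelled set.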
Moore, Jason H
2007-11-01
Bioinformatics is an interdisciplinary field that blends computer science and biostatistics with biological and biomedical sciences such as biochemistry, cell biology, developmental biology, genetics, genomics, and physiology. An important goal of bioinformatics is to facilitate the management, analysis, and interpretation of data from biological experiments and observational studies. The goal of this review is to introduce some of the important concepts in bioinformatics that must be considered when planning and executing a modern biological research study. We review database resources as well as data mining software tools.
2012-01-01
Background: Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. Results: In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Conclusions: Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org. PMID:23281941
El-Kalioby, Mohamed; Abouelhoda, Mohamed; Krüger, Jan; Giegerich, Robert; Sczyrba, Alexander; Wall, Dennis P; Tonellato, Peter
2012-01-01
Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org.
The making of the Women in Biology forum (WiB) at Bioclues.
Singhania, Reeta Rani; Madduru, Dhatri; Pappu, Pranathi; Panchangam, Sameera; Suravajhala, Renuka; Chandrasekharan, Mohanalatha
2014-01-01
The Women in Biology forum (WiB) of Bioclues (India) began in 2009 to promote and support women pursuing careers in bioinformatics and computational biology. WiB was formed in order to help women scientists deprived of basic research, boost the prominence of women scientists particularly from developing countries, and bridge the gender gap to innovation. WiB has also served as a platform to highlight the work of established female scientists in these fields. Several award-winning women researchers have shared their experiences and provided valuable suggestions to WiB. Headed by Mohanalatha Chandrasekharan and supported by Dr. Reeta Rani Singhania and Renuka Suravajhala, WiB has seen major progress in the last couple of years, particularly in two of the four avenues in Bioclues, Mentoring and Research; the four avenues are Mentoring, Outreach, Research and Entrepreneurship (MORE). In line with the Bioclues vision for bioinformatics in India, the WiB Journal Club (JoC) recognizes women scientists working on functional genomics and bioinformatics, and provides scientific mentorship and support for project design and hypothesis formulation. As a part of Bioclues, WiB members practice the group's open-desk policy and its belief that all members are free to express their own thoughts and opinions. The WiB forum appreciates suggestions and welcomes scientists from around the world to be a part of its mission to encourage women to pursue computational biology and bioinformatics.
BioQueue: a novel pipeline framework to accelerate bioinformatics analysis.
Yao, Li; Wang, Heming; Song, Yuanyuan; Sui, Guangchao
2017-10-15
With the rapid development of Next-Generation Sequencing, a large amount of data is now available for bioinformatics research. Meanwhile, the presence of many pipeline frameworks makes it possible to analyse these data. However, these tools concentrate mainly on their syntax and design paradigms, and dispatch jobs based on users' experience about the resources needed by the execution of a certain step in a protocol. As a result, it is difficult for these tools to maximize the potential of computing resources, and avoid errors caused by overload, such as memory overflow. Here, we have developed BioQueue, a web-based framework that contains a checkpoint before each step to automatically estimate the system resources (CPU, memory and disk) needed by the step and then dispatch jobs accordingly. BioQueue possesses a shell command-like syntax instead of implementing a new script language, which means most biologists without computer programming background can access the efficient queue system with ease. BioQueue is freely available at https://github.com/liyao001/BioQueue. The extensive documentation can be found at http://bioqueue.readthedocs.io. Contact: li_yao@outlook.com or gcsui@nefu.edu.cn. Supplementary data are available at Bioinformatics online.
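The checkpoint idea in this abstract (estimate what a step needs, then dispatch only if the machine can supply it) can be illustrated with a minimal sketch. The abstract does not say how BioQueue produces its estimates, so the least-squares fit on past runs used here is an assumption, and the names `Resources`, `fit_line`, `estimate_memory` and `checkpoint` are hypothetical rather than part of BioQueue.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpu: float      # number of cores
    mem_gb: float   # memory in GB
    disk_gb: float  # disk space in GB


def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # All past inputs had the same size: fall back to the mean usage.
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return a, mean_y - a * mean_x


def estimate_memory(history, input_gb):
    """Predict peak memory (GB) from (input size, peak memory) pairs.

    Assumption: memory grows roughly linearly with input size.
    """
    a, b = fit_line([h[0] for h in history], [h[1] for h in history])
    return a * input_gb + b


def checkpoint(need, free):
    """Return True only if every estimated requirement fits what is free."""
    return (need.cpu <= free.cpu
            and need.mem_gb <= free.mem_gb
            and need.disk_gb <= free.disk_gb)


# Hypothetical past runs of one step: (input size in GB, peak memory in GB).
past_runs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
need = Resources(cpu=2, mem_gb=estimate_memory(past_runs, 4.0), disk_gb=10)
free = Resources(cpu=4, mem_gb=16, disk_gb=100)
print("dispatch now" if checkpoint(need, free) else "keep in queue")
```

A job that fails the check stays queued instead of starting and overflowing memory, which is the overload error the abstract wants to avoid.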
Big data for big questions: it is time for data analysts to act
Moscato, Pablo
2015-01-01
Pablo Moscato speaks to Francesca Lake, Managing Editor. Australian Research Council Future Fellow Prof. Pablo Moscato was born in 1964 in La Plata, Argentina. He obtained his B.Sc. in Physics at the University of La Plata and defended his PhD at UNICAMP, Brazil. While at the California Institute of Technology Concurrent Computation Program he developed, in collaboration with Michael Norman, the first application of a methodology later called ‘memetic algorithms’, which is now widely used internationally. He is the founding co-director of the Priority Research Centre for Bioinformatics, Biomarker Discovery and Information-based Medicine (CIBM) (2006–present) and the founding director of the Newcastle Bioinformatics Initiative (2002–2006) of The University of Newcastle (Australia). He is also Chief Investigator of the Australian Research Council Centre in Bioinformatics. He is one of Australia's most cited computer scientists. Over the past 7 years, he has introduced a unifying hallmark of cancer progression based on the changes of information theory quantifiers, and developed a novel mathematical model and an associated solution procedure based on combinatorial optimization techniques to identify drug combinations for cancer therapeutics. In addition, he has identified proteomic signatures to predict the clinical symptoms of Alzheimer's disease, among other ‘firsts’. He is a member of the Editorial Board of Future Science OA. PMID:28031895
Two interactive Bioinformatics courses at the Bielefeld University Bioinformatics Server.
Sczyrba, Alexander; Konermann, Susanne; Giegerich, Robert
2008-05-01
Conferences in computational biology continue to provide tutorials on classical and new methods in the field. This can be taken as an indicator that education is still a bottleneck in our field's process of becoming an established scientific discipline. Bielefeld University has been one of the early providers of bioinformatics education, both locally and via the internet. The Bielefeld Bioinformatics Server (BiBiServ) offers a variety of older and new materials. Here, we report on two online courses made available recently, one introductory and one on the advanced level: (i) SADR: Sequence Analysis with Distributed Resources (http://bibiserv.techfak.uni-bielefeld.de/sadr/) and (ii) ADP: Algebraic Dynamic Programming in Bioinformatics (http://bibiserv.techfak.uni-bielefeld.de/dpcourse/).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S; Jaing, C
The goal of this project is to develop forensic genotyping assays for select agent viruses, addressing a significant capability gap for the viral bioforensics and law enforcement community. We used a multipronged approach combining bioinformatics analysis, PCR-enriched samples, microarrays and TaqMan assays to develop high resolution and cost effective genotyping methods for strain level forensic discrimination of viruses. We have leveraged substantial experience and efficiency gained through year 1 on software development, SNP discovery, TaqMan signature design and phylogenetic signature mapping to scale up the development of forensics signatures in year 2. In this report, we have summarized the TaqMan signature development for South American hemorrhagic fever viruses, tick-borne encephalitis viruses and henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus and Japanese encephalitis virus.
Alternative mRNA polyadenylation in eukaryotes: an effective regulator of gene expression
Lutz, Carol S.; Moreira, Alexandra
2010-01-01
Alternative RNA processing mechanisms, including alternative splicing and alternative polyadenylation, are increasingly recognized as important regulators of gene expression. This article will focus on what has recently been described about alternative polyadenylation in development, differentiation, and disease in higher eukaryotes. We will also describe how the evolving global methodologies for examining the cellular transcriptome, both experimental and bioinformatic, are revealing new details about the complex nature of alternative 3′ end formation, as well as interactions with other RNA-mediated and RNA processing mechanisms. PMID:21278855
Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics
ERIC Educational Resources Information Center
Zhang, Xiaorong
2009-01-01
This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…
Stocker, Gernot; Rieder, Dietmar; Trajanoski, Zlatko
2004-03-22
ClusterControl is a web interface to simplify distributing and monitoring bioinformatics applications on Linux cluster systems. We have developed a modular concept that enables integration of command line oriented programs into the application framework of ClusterControl. The system facilitates integration of different applications accessed through one interface and executed on a distributed cluster system. The package is based on freely available technologies like Apache as web server, PHP as server-side scripting language and OpenPBS as queuing system and is available free of charge for academic and non-profit institutions. http://genome.tugraz.at/Software/ClusterControl
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, C
2009-11-12
In FY09 they will (1) complete the implementation, verification, calibration, and sensitivity and scalability analysis of the in-cell virus replication model; (2) complete the design of the cell culture (cell-to-cell infection) model; (3) continue the research, design, and development of their bioinformatics tools: the Web-based structure-alignment-based sequence variability tool and the functional annotation of the genome database; (4) collaborate with the University of California at San Francisco on areas of common interest; and (5) submit journal articles that describe the in-cell model with simulations and the bioinformatics approaches to evaluation of genome variability and fitness.
Detecting circular RNAs: bioinformatic and experimental challenges
Szabo, Linda; Salzman, Julia
2017-01-01
The pervasive expression of circular RNAs (circRNAs) is a recently discovered feature of gene expression in highly diverged eukaryotes. Numerous algorithms that are used to detect genome-wide circRNA expression from RNA sequencing (RNA-seq) data have been developed in the past few years, but there is little overlap in their predictions and no clear gold-standard method to assess the accuracy of these algorithms. We review sources of experimental and bioinformatic biases that complicate the accurate discovery of circRNAs and discuss statistical approaches to address these biases. We conclude with a discussion of the current experimental progress on the topic. PMID:27739534
Wren, Jonathan D; Dozmorov, Mikhail G; Burian, Dennis; Kaundal, Rakesh; Perkins, Andy; Perkins, Ed; Kupfer, Doris M; Springer, Gordon K
2013-01-01
The tenth annual conference of the MidSouth Computational Biology and Bioinformatics Society (MCBIOS 2013), "The 10th Anniversary in a Decade of Change: Discovery in a Sea of Data", took place at the Stoney Creek Inn & Conference Center in Columbia, Missouri on April 5-6, 2013. This year's Conference Chairs were Gordon Springer and Chi-Ren Shyu from the University of Missouri and Edward Perkins from the US Army Corps of Engineers Engineering Research and Development Center, who is also the current MCBIOS President (2012-3). There were 151 registrants and a total of 111 abstracts (51 oral presentations and 60 poster session abstracts).
Gerstein, Mark; Greenbaum, Dov; Cheung, Kei; Miller, Perry L
2007-02-01
Computational biology and bioinformatics (CBB), the terms often used interchangeably, represent a rapidly evolving biological discipline. With the clear potential for discovery and innovation, and the need to deal with the deluge of biological data, many academic institutions are committing significant resources to develop CBB research and training programs. Yale formally established an interdepartmental Ph.D. program in CBB in May 2003. This paper describes Yale's program, discussing the scope of the field, the program's goals and curriculum, as well as a number of issues that arose in implementing the program. (Further updated information is available from the program's website, www.cbb.yale.edu.)
CucCAP - Developing genomic resources for the cucurbit community
USDA-ARS's Scientific Manuscript database
The U.S. cucurbit community has initiated a USDA-SCRI funded cucurbit genomics project, CucCAP: Leveraging applied genomics to increase disease resistance in cucurbit crops. Our primary objectives are: develop genomic and bioinformatic breeding tool kits for accelerated crop improvement across the...
An overview of bioinformatics tools for epitope prediction: implications on vaccine development.
Soria-Guerra, Ruth E; Nieto-Gomez, Ricardo; Govea-Alonso, Dania O; Rosales-Mendoza, Sergio
2015-02-01
Exploitation of recombinant DNA and sequencing technologies has led to a new concept in vaccination in which isolated epitopes, capable of stimulating a specific immune response, have been identified and used to achieve advanced vaccine formulations, replacing those constituted by whole-pathogen formulations. In this context, bioinformatics approaches play a critical role in analyzing multiple genomes to select the protective epitopes in silico. It is conceived that cocktails of defined epitopes or chimeric protein arrangements, including the target epitopes, may provide a rational design capable of eliciting convenient humoral or cellular immune responses. This review presents a comprehensive compilation of the most advantageous online immunological software and searchable databases, in order to facilitate the design and development of vaccines. An outlook on how these tools are supporting vaccine development is presented. HIV and influenza have been taken as examples of promising developments in vaccination against hypervariable viruses. Perspectives in this field are also envisioned.
bioNerDS: exploring bioinformatics’ database and software use through literature mining
2013-01-01
Background Biology-focused databases and software define bioinformatics and their use is central to computational biology. In such a complex and dynamic field, it is of interest to understand what resources are available, which are used, how much they are used, and for what they are used. While scholarly literature surveys can provide some insights, large-scale computer-based approaches to identify mentions of bioinformatics databases and software from primary literature would automate systematic cataloguing, facilitate the monitoring of usage, and provide the foundations for the recovery of computational methods for analysing biological data, with the long-term aim of identifying best/common practice in different areas of biology. Results We have developed bioNerDS, a named entity recogniser for the recovery of bioinformatics databases and software from primary literature. We identify such entities with an F-measure ranging from 63% to 91% at the mention level and 63-78% at the document level, depending on corpus. Not attaining a higher F-measure is mostly due to high ambiguity in resource naming, which is compounded by the on-going introduction of new resources. To demonstrate the software, we applied bioNerDS to full-text articles from BMC Bioinformatics and Genome Biology. General mention patterns reflect the remit of these journals, highlighting BMC Bioinformatics’s emphasis on new tools and Genome Biology’s greater emphasis on data analysis. The data also illustrate some shifts in resource usage: for example, the past decade has seen R and the Gene Ontology join BLAST and GenBank as the main components in bioinformatics processing. Conclusions We demonstrate the feasibility of automatically identifying resource names on a large scale from the scientific literature and show that the generated data can be used for exploration of bioinformatics database and software usage.
For example, our results help to investigate the rate of change in resource usage and corroborate the suspicion that a vast majority of resources are created, but rarely (if ever) used thereafter. bioNerDS is available at http://bionerds.sourceforge.net/. PMID:23768135
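The F-measure figures quoted in this abstract combine precision and recall into one number. As a minimal sketch of that arithmetic (the counts below are invented for illustration and are not taken from the bioNerDS evaluation; the function name `f_measure` is hypothetical):

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure from true positives, false positives and false negatives.

    precision = tp / (tp + fp), recall = tp / (tp + fn);
    with beta = 1 this is the harmonic mean of the two.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts only: 80 resource mentions found correctly,
# 20 spurious mentions, 20 mentions missed.
print(round(f_measure(tp=80, fp=20, fn=20), 2))
```

Scoring at the mention level counts every occurrence of a resource name, while scoring at the document level counts each resource once per article, which is why the abstract reports two ranges.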
Online Bioinformatics Tutorials | Office of Cancer Genomics
Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.
Treetrimmer: a method for phylogenetic dataset size reduction.
Maruyama, Shinichiro; Eveleigh, Robert J M; Archibald, John M
2013-04-12
With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual 'pruning' of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. Here we present 'TreeTrimmer', a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined 'redundant' sequences, e.g., orthologous sequences from closely related organisms and 'recently' evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion.
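The pruning idea described here can be sketched as follows. This is not the TreeTrimmer implementation: it assumes a tree written as nested tuples, leaf names of the hypothetical form `label|sequence_id`, and a single redundancy rule (a clade whose leaves all carry the same user-defined label is collapsed to one representative). The real tool's criteria may differ.

```python
def leaves(tree):
    """Return all leaf names under a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def label_of(leaf):
    """User-defined grouping label: the text before '|' (an assumption)."""
    return leaf.split("|")[0]


def trim(tree):
    """Collapse every clade whose leaves share one label to a single OTU."""
    if isinstance(tree, str):
        return tree
    tips = leaves(tree)
    if len({label_of(t) for t in tips}) == 1:
        # Redundant clade: keep one representative, chosen deterministically
        # so that repeated runs give the same result.
        return min(tips)
    return tuple(trim(child) for child in tree)


tree = ((("Homo|a", "Homo|b"), "Pan|c"), ("Mus|d", ("Mus|e", "Mus|f")))
print(trim(tree))
```

Because only whole single-label clades are collapsed, the branching order among the remaining labels is unchanged, and picking the representative deterministically keeps the pruning reproducible.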
Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep
2011-01-01
Visceral leishmaniasis or kala-azar is a potent parasitic infection causing death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction is also to be considered in patients infected with Leishmania donovani (L. donovani). For complete eradication of this disease, high-level modern research is currently being applied both at the molecular level and at the field level. Computational approaches like remote sensing, geographical information system (GIS) and bioinformatics are the key resources for the detection and distribution of vectors, patterns, ecological and environmental factors and genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visceral leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another. PMID:23554714
Comprehensive decision tree models in bioinformatics.
Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter
2012-01-01
Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm.
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics.
Comprehensive Decision Tree Models in Bioinformatics
Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter
2012-01-01
Purpose Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm.
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics. PMID:22479449
Effects-based monitoring (EBM) has been employed as a complement to chemical monitoring to help address knowledge gaps between chemical occurrence and biological effects. We have piloted several pathway-based approaches to EBM, that utilize modern bioinformatic and high throughpu...
A Portable Bioinformatics Course for Upper-Division Undergraduate Curriculum in Sciences
ERIC Educational Resources Information Center
Floraino, Wely B.
2008-01-01
This article discusses the challenges that bioinformatics education is facing and describes a bioinformatics course that is successfully taught at the California State Polytechnic University, Pomona, to the fourth year undergraduate students in biological sciences, chemistry, and computer science. Information on lecture and computer practice…
Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course
ERIC Educational Resources Information Center
Weisman, David
2010-01-01
Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…
A Mathematical Optimization Problem in Bioinformatics
ERIC Educational Resources Information Center
Heyer, Laurie J.
2008-01-01
This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
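The dynamic programming formulation mentioned in this abstract is commonly presented as the Needleman-Wunsch global alignment recurrence. The sketch below computes only the optimal score; the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions and are not taken from the article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of sequences a and b.

    score[i][j] holds the best score for aligning a[:i] with b[:j].
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATTACA"))
```

Each cell depends only on its three neighbours, so the table is filled in time proportional to the product of the two sequence lengths rather than by enumerating every possible alignment.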
Biology in 'silico': The Bioinformatics Revolution.
ERIC Educational Resources Information Center
Bloom, Mark
2001-01-01
Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the Swiss Army knife of genetics, with many different uses in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…
Virtual Bioinformatics Distance Learning Suite
ERIC Educational Resources Information Center
Tolvanen, Martti; Vihinen, Mauno
2004-01-01
Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…
Assessment of a Bioinformatics across Life Science Curricula Initiative
ERIC Educational Resources Information Center
Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.
2007-01-01
At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…
Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics
ERIC Educational Resources Information Center
Likic, Vladimir A.
2006-01-01
This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…
Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools
ERIC Educational Resources Information Center
Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson
2010-01-01
This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…
Stephan, Christian; Hamacher, Michael; Blüggel, Martin; Körting, Gerhard; Chamrad, Daniel; Scheer, Christian; Marcus, Katrin; Reidegeld, Kai A; Lohaus, Christiane; Schäfer, Heike; Martens, Lennart; Jones, Philip; Müller, Michael; Auyeung, Kevin; Taylor, Chris; Binz, Pierre-Alain; Thiele, Herbert; Parkinson, David; Meyer, Helmut E; Apweiler, Rolf
2005-09-01
The Bioinformatics Committee of the HUPO Brain Proteome Project (HUPO BPP) meets regularly to execute the post-lab analyses of the data produced in the HUPO BPP pilot studies. On July 7, 2005 the members came together for the 5th time at the European Bioinformatics Institute (EBI) in Hinxton, UK, hosted by Rolf Apweiler. As a main result, the parameter set of the semi-automated data re-analysis of MS/MS spectra has been elaborated and the subsequent work steps have been defined.
Park, Hyun-Seok
2012-12-01
Whereas a vast amount of new information on bioinformatics is made available to the public through patents, only a small set of patents are cited in academic papers. A detailed analysis of registered bioinformatics patents, using the existing patent search system, can provide valuable information links between science and technology. However, it is extremely difficult to select keywords to capture bioinformatics patents, reflecting the convergence of several underlying technologies. No single word or even several words are sufficient to identify such patents. The analysis of patent subclasses can provide valuable information. In this paper, I did a preliminary study of the current status of bioinformatics patents and their International Patent Classification (IPC) groups registered in the Korea Intellectual Property Rights Information Service (KIPRIS) database.
FCDD: A Database for Fruit Crops Diseases.
Chauhan, Rupal; Jasrai, Yogesh; Pandya, Himanshu; Chaudhari, Suman; Samota, Chand Mal
2014-01-01
The Fruit Crops Diseases Database (FCDD) requires a number of biotechnology and bioinformatics tools. The FCDD is a unique bioinformatics resource that compiles 162 details on fruit crop diseases, including disease type, causal organism, images, symptoms and control. The FCDD contains 171 phytochemicals from 25 fruits, their 2D images and their 20 possible sequences. This information has been manually extracted and manually verified from numerous sources, including other electronic databases, textbooks and scientific journals. FCDD is fully searchable and supports extensive text search. The main focus of the FCDD is on providing available information on fruit crop diseases, which will help in the discovery of potential drugs from a common bioresource: fruits. The database was developed using MySQL. The database interface was developed in PHP, HTML and Java. FCDD is freely available at http://www.fruitcropsdd.com/
Five critical elements to ensure the precision medicine.
Chen, Chengshui; He, Mingyan; Zhu, Yichun; Shi, Lin; Wang, Xiangdong
2015-06-01
Precision medicine, a newly emerging area and therapeutic strategy, has been practiced in individual patients with unexpected successes and has gained considerable attention from professionals and the public as a new path to improve the treatment and prognosis of patients. A number of new components will appear or be discovered, among which clinical bioinformatics integrates clinical phenotypes and informatics with bioinformatics, computational science, mathematics, and systems biology. In addition to those tools, precision medicine calls for more accurate and repeatable methodologies for the identification and validation of gene discovery. Precision medicine will bring new therapeutic strategies, drug discovery and development, and gene-oriented treatment. There is an urgent need to identify and validate disease-specific, mechanism-based, or epigenetics-dependent biomarkers to monitor precision medicine, and to develop "precision" regulations to guard its application.
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-08-25
Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms.
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-01-01
Background: Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. Results: We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. Conclusion: dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms. PMID:19706156
Bioinformatics in High School Biology Curricula: A Study of State Science Standards
ERIC Educational Resources Information Center
Wefer, Stephen H.; Sheppard, Keith
2008-01-01
The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics…
ERIC Educational Resources Information Center
Zhang, Xiaorong
2011-01-01
We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…
ERIC Educational Resources Information Center
Vincent, Antony T.; Bourbonnais, Yves; Brouard, Jean-Simon; Deveau, Hélène; Droit, Arnaud; Gagné, Stéphane M.; Guertin, Michel; Lemieux, Claude; Rathier, Louis; Charette, Steve J.; Lagüe, Patrick
2018-01-01
A recent scientific discipline, bioinformatics, defined as using informatics for the study of biological problems, is now a requirement for the study of biological sciences. Bioinformatics has become such a powerful and popular discipline that several academic institutions have created programs in this field, allowing students to become…
ERIC Educational Resources Information Center
Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari
2014-01-01
Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…
Vignettes: diverse library staff offering diverse bioinformatics services*
Osterbur, David L.; Alpi, Kristine; Canevari, Catharine; Corley, Pamela M.; Devare, Medha; Gaedeke, Nicola; Jacobs, Donna K.; Kirlew, Peter; Ohles, Janet A.; Vaughan, K.T.L.; Wang, Lili; Wu, Yongchun; Geer, Renata C.
2006-01-01
Objectives: The paper gives examples of the bioinformatics services provided in a variety of different libraries by librarians with a broad range of educational background and training. Methods: Two investigators sent an email inquiry to attendees of the “National Center for Biotechnology Information's (NCBI) Introduction to Molecular Biology Information Resources” or “NCBI Advanced Workshop for Bioinformatics Information Specialists (NAWBIS)” courses. The thirty-five-item questionnaire addressed areas such as educational background, library setting, types and numbers of users served, and bioinformatics training and support services provided. Answers were compiled into program vignettes. Discussion: The bioinformatics support services addressed in the paper are based in libraries with academic and clinical settings. Services have been established through different means: in collaboration with biology faculty as part of formal courses, through teaching workshops in the library, through one-on-one consultations, and by other methods. Librarians with backgrounds from art history to doctoral degrees in genetics have worked to establish these programs. Conclusion: Successful bioinformatics support programs can be established in libraries in a variety of different settings and by staff with a variety of different backgrounds and approaches. PMID:16888664
Generalized Centroid Estimators in Bioinformatics
Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi
2011-01-01
In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit commonly used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. Not only does the concept presented in this paper give a useful framework for designing MEA-based estimators, but it is also highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017
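The accuracy measures named in this abstract can be made concrete with a small sketch. The code below is not the estimator proposed in the paper; it only computes sensitivity, PPV, F-score and MCC for a prediction on a binary space, using their standard definitions. The function names and the toy vectors are illustrative assumptions.

```python
import math


def confusion_counts(predicted, reference):
    """Count TP, FP, TN, FN for two equal-length 0/1 vectors."""
    if len(predicted) != len(reference):
        raise ValueError("vectors must have the same length")
    tp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 1)
    fp = sum(1 for p, r in zip(predicted, reference) if p == 1 and r == 0)
    tn = sum(1 for p, r in zip(predicted, reference) if p == 0 and r == 0)
    fn = sum(1 for p, r in zip(predicted, reference) if p == 0 and r == 1)
    return tp, fp, tn, fn


def accuracy_measures(tp, fp, tn, fn):
    """Return sensitivity, PPV, F-score and MCC; undefined ratios become 0.0."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    pr_sum = ppv + sensitivity
    f_score = 2 * ppv * sensitivity / pr_sum if pr_sum else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "f_score": f_score, "mcc": mcc}


if __name__ == "__main__":
    # Toy example: each position could stand for, e.g., one candidate base pair.
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(accuracy_measures(*confusion_counts(predicted, reference)))
```

An estimator tuned for one of these measures need not be optimal for another, which is the discrepancy the abstract refers to.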
Chen, Yi-Bu; Chattopadhyay, Ansuman; Bergen, Phillip; Gadd, Cynthia; Tannery, Nancy
2007-01-01
To bridge the gap between the rising information needs of biological and medical researchers and the rapidly growing number of online bioinformatics resources, we have created the Online Bioinformatics Resources Collection (OBRC) at the Health Sciences Library System (HSLS) at the University of Pittsburgh. The OBRC, containing 1542 major online bioinformatics databases and software tools, was constructed using the HSLS content management system built on the Zope Web application server. To enhance the output of search results, we further implemented the Vivísimo Clustering Engine, which automatically organizes the search results into categories created dynamically based on the textual information of the retrieved records. As the largest online collection of its kind and the only one with advanced search results clustering, OBRC is aimed at becoming a one-stop guided information gateway to the major bioinformatics databases and software tools on the Web. OBRC is available at the University of Pittsburgh's HSLS Web site (http://www.hsls.pitt.edu/guides/genetics/obrc).
Scalable computing for evolutionary genomics.
Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert
2012-01-01
Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. 
Where Debian Med encourages packaging free and open source bioinformatics software through one central project, BioNode encourages creating free and open source VM images, for multiple targets, through one central project. BioNode can be deployed on Windows, OSX, Linux, and in the Cloud. Next to the downloadable BioNode images, we provide tutorials online, which empower bioinformaticians to install and run BioNode in different environments, as well as information for future initiatives, on creating and building such images.
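The "poor man's parallelization" described in this abstract, running whole programs in parallel as separate processes, can be sketched in a few lines. This is a minimal illustration on a single machine, not BioNode's own configuration scripts or a job scheduler; the helper names are assumptions, and the stand-in commands would be replaced by real tool invocations in practice.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(command):
    """Run one whole program as a separate OS process and return its stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_in_parallel(commands, max_workers=4):
    """Launch independent commands concurrently; results keep the input order.

    Threads are used only to wait on the child processes, so the actual
    work runs in parallel outside the Python interpreter.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, commands))


if __name__ == "__main__":
    # Stand-in jobs: each one starts a fresh interpreter, as a legacy
    # command-line tool would be started once per input file.
    jobs = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
    print(run_in_parallel(jobs))
```

The design choice here is that the legacy program itself is left untouched; only the launching of independent runs is parallelized.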
Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics
Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas
2014-01-01
Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage biological data, and to develop and analyze computational tools to enhance their understanding. The size of the data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of the large number of newly discovered proteins. The existing classification results are unsatisfactory due to the huge number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
An overview of bioinformatics methods for modeling biological pathways in yeast
Hou, Jie; Acharya, Lipi; Zhu, Dongxiao
2016-01-01
The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein–protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start with reviewing the research on biological pathways followed by discussing key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed. PMID:26476430
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick; Lo, Chien-Chi; Li, Po-E
EDGE bioinformatics was developed to help biologists process Next Generation Sequencing data (in the form of raw FASTQ files), even if they have little to no bioinformatics expertise. EDGE is a highly integrated and interactive web-based platform that is capable of running many of the standard analyses that biologists require for viral, bacterial/archaeal, and metagenomic samples. EDGE provides the following analytical workflows: quality trimming and host removal, assembly and annotation, comparisons against known references, taxonomy classification of reads and contigs, whole genome SNP-based phylogenetic analysis, and PCR analysis. EDGE provides an intuitive web-based interface for user input, allows users to visualize and interact with selected results (e.g. JBrowse genome browser), and generates a final detailed PDF report. Results in the form of tables, text files, graphic files, and PDFs can be downloaded. A user management system allows tracking of an individual’s EDGE runs, along with the ability to share, post publicly, delete, or archive their results.
Genome-wide screening and identification of antigens for rickettsial vaccine development
USDA-ARS's Scientific Manuscript database
The capacity to identify immunogens for vaccine development by genome-wide screening has been markedly enhanced by the availability of complete microbial genome sequences coupled to rapid proteomic and bioinformatic analysis. Critical to this genome-wide screening is in vivo testing in the context o...
Semester-Long Inquiry-Based Molecular Biology Laboratory: Transcriptional Regulation in Yeast
ERIC Educational Resources Information Center
Oelkers, Peter M.
2017-01-01
A single semester molecular biology laboratory has been developed in which students design and execute a project examining transcriptional regulation in "Saccharomyces cerevisiae." Three weeks of planning are allocated to developing a hypothesis through literature searches and use of bioinformatics. Common experimental plans address a…
BioShaDock: a community driven bioinformatics shared Docker-based tools registry
Moreews, François; Sallou, Olivier; Ménager, Hervé; Le bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier
2015-01-01
Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community. PMID:26913191
BioShaDock: a community driven bioinformatics shared Docker-based tools registry.
Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier
2015-01-01
Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.
Kotera, Masaaki; Nishimura, Yosuke; Nakagawa, Zen-ichi; Muto, Ai; Moriya, Yuki; Okamoto, Shinobu; Kawashima, Shuichi; Katayama, Toshiaki; Tokimatsu, Toshiaki; Kanehisa, Minoru; Goto, Susumu
2014-12-01
Genomics is faced with the issue of many partially annotated putative enzyme-encoding genes for which activities have not yet been verified, while metabolomics is faced with the issue of many putative enzyme reactions for which full equations have not been verified. Knowledge of enzymes has been collected by IUBMB, and has been made public as the Enzyme List. To date, however, the terminology of the Enzyme List has not been assessed comprehensively by bioinformatics studies. Instead, most of the bioinformatics studies simply use the identifiers of the enzymes, i.e. the Enzyme Commission (EC) numbers. We investigated the actual usage of terminology throughout the Enzyme List, and demonstrated that the partial characteristics of reactions cannot be retrieved by simply using EC numbers. Thus, we developed a novel ontology, named PIERO, for annotating biochemical transformations as follows. First, the terminology describing enzymatic reactions was retrieved from the Enzyme List, and was grouped into those related to overall reactions and biochemical transformations. Consequently, these terms were mapped onto the actual transformations taken from enzymatic reaction equations. This ontology was linked to Gene Ontology (GO) and EC numbers, allowing the extraction of common partial reaction characteristics from given sets of orthologous genes and the elucidation of possible enzymes from the given transformations. Further future development of the PIERO ontology should enhance the Enzyme List to promote the integration of genomics and metabolomics.
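To illustrate why EC numbers alone carry only coarse information, the sketch below treats an EC number as what it is: a four-field hierarchical identifier whose first field names the top-level enzyme class. This is not the PIERO ontology; the function names are assumptions, and the example identifiers are ordinary Enzyme List entries.

```python
from collections import defaultdict

# Top-level EC classes (first field of an EC number).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
    7: "translocases",
}


def parse_ec(ec):
    """Split 'a.b.c.d' into its four fields; '-' may mark an unspecified level."""
    parts = ec.split(".")
    if len(parts) != 4:
        raise ValueError(f"not a four-field EC number: {ec!r}")
    return tuple(parts)


def group_by_level(ec_numbers, depth):
    """Group EC numbers by their first `depth` fields (depth between 1 and 4)."""
    groups = defaultdict(list)
    for ec in ec_numbers:
        groups[".".join(parse_ec(ec)[:depth])].append(ec)
    return dict(groups)


if __name__ == "__main__":
    ecs = ["1.1.1.1", "1.1.1.2", "2.7.1.1", "3.4.21.4"]
    print(group_by_level(ecs, 1))
    print(EC_CLASSES[int(parse_ec("3.4.21.4")[0])])
```

Grouping by prefix recovers the classification hierarchy only; any shared partial transformation between enzymes in different branches is invisible at this level, which is the gap the abstract says an ontology is meant to fill.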
BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine
NASA Astrophysics Data System (ADS)
Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu
2016-02-01
Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.
BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine
Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu
2016-01-01
Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm. PMID:26879404
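The abstract does not say which statistic BATMAN-TCM uses for its pathway, Gene Ontology and disease enrichment analyses. As a hedged illustration of what an enrichment analysis computes, the sketch below uses the one-sided hypergeometric test, a common choice for this task; the function name and the numbers are assumptions, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population: number of background genes
    annotated:  background genes annotated with the term (e.g. one pathway)
    sample:     number of predicted target genes
    overlap:    predicted targets that carry the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 40 of 20000 genes belong to a pathway, and
    # 5 of 100 predicted targets fall into it.
    print(hypergeom_enrichment_p(20000, 40, 100, 5))
```

A small p-value indicates that the predicted targets hit the term more often than expected by chance; real tools typically also correct for testing many terms at once.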
Smith, Andy; Southgate, Joel; Poplawski, Radoslaw; Bull, Matthew J.; Richardson, Emily; Ismail, Matthew; Thompson, Simon Elwood-; Kitchen, Christine; Guest, Martyn; Bakke, Marius
2016-01-01
The increasing availability and decreasing cost of high-throughput sequencing has transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they do not have access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data. PMID:28785418
Connor, Thomas R; Loman, Nicholas J; Thompson, Simon; Smith, Andy; Southgate, Joel; Poplawski, Radoslaw; Bull, Matthew J; Richardson, Emily; Ismail, Matthew; Thompson, Simon Elwood-; Kitchen, Christine; Guest, Martyn; Bakke, Marius; Sheppard, Samuel K; Pallen, Mark J
2016-09-01
The increasing availability and decreasing cost of high-throughput sequencing has transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they do not have access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data.
An ontology-based framework for bioinformatics workflows.
Digiampietri, Luciano A; Perez-Alcazar, Jose de J; Medeiros, Claudia Bauzer
2007-01-01
The proliferation of bioinformatics activities brings new challenges - how to understand and organise these resources, how to exchange and reuse successful experimental procedures, and to provide interoperability among data and tools. This paper describes an effort toward these directions. It is based on combining research on ontology management, AI and scientific workflows to design, reuse and annotate bioinformatics experiments. The resulting framework supports automatic or interactive composition of tasks based on AI planning techniques and takes advantage of ontologies to support the specification and annotation of bioinformatics workflows. We validate our proposal with a prototype running on real data.
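Automatic composition of tasks can be illustrated with a deliberately naive forward-chaining sketch: each task declares the data types it consumes and the type it produces, and tasks are added until the goal type becomes available. This is not the framework described in the paper, which relies on ontologies and AI planning techniques; the task names and data types below are hypothetical.

```python
def compose_workflow(tasks, available, goal):
    """Return an ordered list of task names that yields `goal`, or None.

    tasks:     dict mapping task name -> (set of input types, output type)
    available: iterable of data types present at the start
    goal:      data type the workflow must produce

    Naive forward chaining: the plan is valid but may contain steps that
    are not strictly needed for the goal.
    """
    have = set(available)
    remaining = dict(tasks)
    plan = []
    while goal not in have:
        runnable = sorted(
            name
            for name, (inputs, output) in remaining.items()
            if inputs <= have and output not in have
        )
        if not runnable:
            return None  # no task can make progress, so the goal is unreachable
        name = runnable[0]
        plan.append(name)
        have.add(remaining.pop(name)[1])
    return plan


if __name__ == "__main__":
    tasks = {
        "assemble": ({"reads"}, "sequences"),
        "align": ({"sequences"}, "alignment"),
        "build_tree": ({"alignment"}, "tree"),
    }
    print(compose_workflow(tasks, {"reads"}, "tree"))
```

In an ontology-backed setting, the plain string comparison of types would be replaced by a check that one concept is compatible with another.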
PSMB5 plays a dual role in cancer development and immunosuppression
Wang, Chih-Yang; Li, Chung-Yen; Hsu, Hui-Ping; Cho, Chien-Yu; Yen, Meng-Chi; Weng, Tzu-Yang; Chen, Wei-Ching; Hung, Yu-Hsuan; Lee, Kuo-Ting; Hung, Jui-Hsiang; Chen, Yi-Ling; Lai, Ming-Derg
2017-01-01
Tumor progression and metastasis are dependent on the intrinsic properties of tumor cells and the influence of the microenvironment, including the immune system. It would be important to identify a target drug that can inhibit cancer cells and activate immune cells. The proteasome β subunit (PSMB) family, one component of the ubiquitin-proteasome system, has been demonstrated to play an important role in tumor cells and immune cells. Therefore, we used a bioinformatics approach to examine the potential role of the PSMB family. Analysis of the breast TCGA and METABRIC databases revealed that high expression of PSMB5 was observed in breast cancer tissue and that high expression of PSMB5 predicted worse survival. In addition, high expression of PSMB5 was observed in M2 macrophages. Based on our bioinformatics analysis, we hypothesized that PSMB5 has immunosuppressive and oncogenic characteristics. To study the effects of PSMB5 on cancer cells and macrophages in vitro, we silenced PSMB5 expression with shRNA in THP-1 monocytes and MDA-MB-231 cells, respectively. Knockdown of PSMB5 promoted human THP-1 monocyte differentiation into M1 macrophages. On the other hand, knockdown of PSMB5 gene expression inhibited MDA-MB-231 cell growth and migration, as measured by colony formation and Boyden chamber assays. Collectively, our data demonstrated that delivery of PSMB5 shRNA suppressed cell growth and activated defensive M1 macrophages in vitro. Furthermore, lentiviral delivery of PSMB5 shRNA significantly decreased tumor growth in a subcutaneous mouse model. In conclusion, our bioinformatics study and functional experiments revealed that PSMB5 is a novel cancer therapeutic target. These results also demonstrated a novel translational approach to improve cancer immunotherapy. PMID:29218236
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application.
D'Antonio, Mattia; D'Onorio De Meo, Paolo; Pallocca, Matteo; Picardi, Ernesto; D'Erchia, Anna Maria; Calogero, Raffaele A; Castrignanò, Tiziana; Pesole, Graziano
2015-01-01
The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms, which allow massive and cheap sequencing of selected RNA fractions and also provide information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulatory pathways makes RNA-Seq one of the most complex fields of NGS application, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), and detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). The pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and to call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Through a user-friendly web interface, the RAP workflow can be customized by the user and is automatically executed on our cloud computing environment. This strategy provides access to bioinformatics tools and computational resources without requiring specific bioinformatics or IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user's needs.
Karimkhanloo, Hamzeh; Mohammadi-Yeganeh, Samira; Ahsani, Zeinab; Paryan, Mahdi
2017-04-01
Hepatocellular carcinoma is the major form of primary liver cancer, which is the second and sixth leading cause of cancer-related death in men and women, respectively. Extensive research indicates that the Wnt/β-catenin signaling pathway, which plays a pivotal role in the growth, development, and differentiation of hepatocellular carcinoma, is one of the major signaling pathways dysregulated in this cancer. Cyclin D1 is a proto-oncogene and one of the major regulators of the Wnt signaling pathway, and its overexpression has been detected in various types of cancer, including hepatocellular carcinoma. Using several validated bioinformatic databases, we predicted microRNAs capable of targeting the 3'-untranslated region of Cyclin D1 messenger RNA. According to the results, miR-20a was selected as the highest-ranking microRNA targeting Cyclin D1 messenger RNA. A luciferase assay was used to confirm the bioinformatic prediction. Cyclin D1 expression was first assessed by quantitative real-time polymerase chain reaction in the HepG2 cell line. Afterward, HepG2 cells were transduced with lentiviruses containing miR-20a, and the expression of miR-20a and Cyclin D1 was evaluated. The results of the luciferase assay demonstrated targeting of the 3'-untranslated region of Cyclin D1 messenger RNA by miR-20a. Furthermore, a 238-fold decline in Cyclin D1 expression was observed after lentiviral induction of miR-20a in HepG2 cells. The results highlighted a considerable effect of miR-20a induction on the down-regulation of the Cyclin D1 gene. Our results suggest that miR-20a can be used as a novel candidate for therapeutic purposes and as a biomarker for hepatocellular carcinoma diagnosis.
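As an illustration of how a fold change such as the one reported above can be derived from quantitative real-time PCR data, the following minimal sketch uses the common 2^-ΔΔCt calculation. This is an assumption: the abstract does not state which quantification method the authors used, and the function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt calculation (assumed method)."""
    # Normalize the target gene to a reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare treated against control and convert cycles to a ratio.
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values, chosen only to make the arithmetic easy to follow.
ratio = relative_expression(30.0, 18.0, 22.0, 18.0)
fold_decline = 1 / ratio
print(f"relative expression: {ratio}, fold decline: {fold_decline}")
```

With these made-up values the target gene appears eight cycles later after treatment, which corresponds to a 2^8-fold reduction; a decline of roughly 238-fold would correspond to a ΔΔCt just under 8.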
Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B
2018-03-24
Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine them and generate new hypotheses to test experimentally. There is an urgent need to develop powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, forcing researchers to access multiple applications and intervene manually for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed not only to query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in the public domain, including model organism-specific databases, to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with a point-and-click interface containing links to support information, including a user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Contact: Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.
Costantini, S; Malerba, G; Contreas, G; Corradi, M; Marin Vargas, S P; Giorgetti, A; Maffeis, C
2015-05-01
Heterozygous loss-of-function mutations in the glucokinase (GCK) gene cause maturity-onset diabetes of the young (MODY) subtype GCK (GCK-MODY/MODY2). GCK sequencing revealed 16 distinct mutations (13 missense, 1 nonsense, 1 splice site, and 1 frameshift-deletion) co-segregating with hyperglycaemia in 23 GCK-MODY families. Four missense substitutions (c.718A>G/p.Asn240Asp, c.757G>T/p.Val253Phe, c.872A>C/p.Lys291Thr, and c.1151C>T/p.Ala384Val) were novel, and a founder effect was suspected for the nonsense mutation (c.76C>T/p.Gln26*). We tested whether an accurate bioinformatics approach could strengthen family-genetic evidence for missense variant pathogenicity in routine diagnostics, where wet-lab functional assays are generally unviable. In silico analyses of the novel missense variants, including orthologous sequence conservation, amino acid substitution (AAS)-pathogenicity predictors, structural modeling and splicing predictors, suggested that the AASs and/or the underlying nucleotide changes are likely to be pathogenic. This study shows how a careful bioinformatics analysis could provide effective suggestions to help molecular-genetic diagnosis in the absence of wet-lab validation.
High-throughput protein analysis integrating bioinformatics and experimental assays
del Val, Coral; Mehrle, Alexander; Falkenhahn, Mechthild; Seiler, Markus; Glatting, Karl-Heinz; Poustka, Annemarie; Suhai, Sandor; Wiemann, Stefan
2004-01-01
The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administering relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins. PMID:14762202
Controlling new knowledge: Genomic science, governance and the politics of bioinformatics.
Salter, Brian; Salter, Charlotte
2017-04-01
The rise of bioinformatics is a direct response to the political difficulties faced by genomics in its quest to be a new biomedical innovation, and the value of bioinformatics lies in its role as the bridge between the promise of genomics and its realization in the form of health benefits. Western scientific elites are able to use their close relationship with the state to control and facilitate the emergence of new domains compatible with the existing distribution of epistemic power - all within the embrace of public trust. The incorporation of bioinformatics as the saviour of genomics had to be integrated with the operation of two key aspects of governance in this field: the definition and ownership of the new knowledge. This was achieved mainly by the development of common standards and by the promotion of the values of communality, open access and the public ownership of data to legitimize and maintain the governance power of publicly funded genomic science. Opposition from industry advocating the private ownership of knowledge has been largely neutered through the institutions supporting the science-state concordat. However, in order for translation into health benefits to occur and public trust to be assured, genomic and clinical data have to be integrated and knowledge ownership agreed upon across the separate and distinct governance territories of scientist, clinical medicine and society. Tensions abound as science seeks ways of maintaining its control of knowledge production through the negotiation of new forms of governance with the institutions and values of clinicians and patients.
Ten quick tips for machine learning in computational biology.
Chicco, Davide
2017-01-01
Machine learning has become a pivotal tool for many projects in computational biology, bioinformatics, and health informatics. Nevertheless, beginners and biomedical researchers often do not have enough experience to run a data mining project effectively, and may therefore follow incorrect practices that lead to common mistakes or over-optimistic results. With this review, we present ten quick tips for taking advantage of machine learning in any computational biology context while avoiding some common errors that we have observed hundreds of times in multiple bioinformatics projects. We believe our ten suggestions can strongly help any machine learning practitioner to carry out a successful project in computational biology and related sciences.
Li, Hongdong; Zhang, Yang; Guan, Yuanfang; Menon, Rajasree; Omenn, Gilbert S
2017-01-01
Tens of thousands of splice isoforms of proteins have been catalogued as predicted sequences from transcripts in humans and other species. Relatively few have been characterized biochemically or structurally. With the extensive development of protein bioinformatics, the characterization and modeling of isoform features, isoform functions, and isoform-level networks have advanced notably. Here we present applications of the I-TASSER family of algorithms for folding and functional predictions and the IsoFunc, MIsoMine, and Hisonet data resources for isoform-level analyses of network and pathway-based functional predictions and protein-protein interactions. Hopefully, predictions and insights from protein bioinformatics will stimulate many experimental validation studies.
Biopython: freely available Python tools for computational molecular biology and bioinformatics
Cock, Peter J. A.; Antao, Tiago; Chang, Jeffrey T.; Chapman, Brad A.; Cox, Cymon J.; Dalke, Andrew; Friedberg, Iddo; Hamelryck, Thomas; Kauff, Frank; Wilczynski, Bartek; de Hoon, Michiel J. L.
2009-01-01
Summary: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. Availability: Biopython is freely available, with documentation and source code at www.biopython.org under the Biopython license. Contact: All queries should be directed to the Biopython mailing lists, see www.biopython.org/wiki/_Mailing_lists; peter.cock@scri.ac.uk. PMID:19304878
AphidBase: A centralized bioinformatic resource for annotation of the pea aphid genome
Legeai, Fabrice; Shigenobu, Shuji; Gauthier, Jean-Pierre; Colbourne, John; Rispe, Claude; Collin, Olivier; Richards, Stephen; Wilson, Alex C. C.; Tagu, Denis
2015-01-01
AphidBase is a centralized bioinformatic resource that was developed to facilitate community annotation of the pea aphid genome by the International Aphid Genomics Consortium (IAGC). The AphidBase Information System, designed to organize and distribute genomic data and annotations for a large international community, was constructed using open source software tools from the Generic Model Organism Database (GMOD). The system includes Apollo and GBrowse utilities as well as a wiki, BLAST search capabilities and a full text search engine. AphidBase strongly supported community cooperation and coordination in the curation of gene models during community annotation of the pea aphid genome. AphidBase can be accessed at http://www.aphidbase.com. PMID:20482635
Bioinformatics in Middle East Program Curricula--A Focus on the Arabian Gulf
ERIC Educational Resources Information Center
Loucif, Samia
2014-01-01
The purpose of this paper is to investigate the inclusion of bioinformatics in program curricula in the Middle East, focusing on educational institutions in the Arabian Gulf. Bioinformatics is a multidisciplinary field which has emerged in response to the need for efficient data storage and retrieval, and accurate and fast computational and…
ERIC Educational Resources Information Center
Sutcliffe, Iain C.; Cummings, Stephen P.
2007-01-01
Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…
The S-Star Trial Bioinformatics Course: An On-line Learning Success
ERIC Educational Resources Information Center
Lim, Yun Ping; Hoog, Jan-Olov; Gardner, Phyllis; Ranganathan, Shoba; Andersson, Siv; Subbiah, Subramanian; Tan, Tin Wee; Hide, Winston; Weiss, Anthony S.
2003-01-01
The S-Star Trial Bioinformatics on-line course (www.s-star.org) is a global experiment in bioinformatics distance education. Six universities from five continents have participated in this project. One hundred and fifty students participated in the first trial course of which 96 followed through the entire course and 70 fulfilled the overall…
78 FR 35936 - Statement of Organization, Functions, and Delegations of Authority
Federal Register 2010, 2011, 2012, 2013, 2014
2013-06-14
... to, laboratory information systems, quality management systems and bioinformatics; (3) ensures a safe working environment in NCIRD laboratories; and (4) collaborates effectively with other centers and offices...
Mirel, Barbara; Görg, Carsten
2014-04-26
A common class of biomedical analysis is to explore expression data from high throughput experiments for the purpose of uncovering functional relationships that can lead to a hypothesis about mechanisms of a disease. We call this analysis expression driven, -omics hypothesizing. In it, scientists use interactive data visualizations and read deeply in the research literature. Little is known, however, about the actual flow of reasoning and behaviors (sense making) that scientists enact in this analysis, end-to-end. Understanding this flow is important because if bioinformatics tools are to be truly useful they must support it. Sense making models of visual analytics in other domains have been developed and used to inform the design of useful and usable tools. We believe they would be helpful in bioinformatics. To characterize the sense making involved in expression-driven, -omics hypothesizing, we conducted an in-depth observational study of one scientist as she engaged in this analysis over six months. From findings, we abstracted a preliminary sense making model. Here we describe its stages and suggest guidelines for developing visualization tools that we derived from this case. A single case cannot be generalized. But we offer our findings, sense making model and case-based tool guidelines as a first step toward increasing interest and further research in the bioinformatics field on scientists' analytical workflows and their implications for tool design.
2014-01-01
A common class of biomedical analysis is to explore expression data from high throughput experiments for the purpose of uncovering functional relationships that can lead to a hypothesis about mechanisms of a disease. We call this analysis expression driven, -omics hypothesizing. In it, scientists use interactive data visualizations and read deeply in the research literature. Little is known, however, about the actual flow of reasoning and behaviors (sense making) that scientists enact in this analysis, end-to-end. Understanding this flow is important because if bioinformatics tools are to be truly useful they must support it. Sense making models of visual analytics in other domains have been developed and used to inform the design of useful and usable tools. We believe they would be helpful in bioinformatics. To characterize the sense making involved in expression-driven, -omics hypothesizing, we conducted an in-depth observational study of one scientist as she engaged in this analysis over six months. From findings, we abstracted a preliminary sense making model. Here we describe its stages and suggest guidelines for developing visualization tools that we derived from this case. A single case cannot be generalized. But we offer our findings, sense making model and case-based tool guidelines as a first step toward increasing interest and further research in the bioinformatics field on scientists’ analytical workflows and their implications for tool design. PMID:24766796
iEnhancer-EL: Identifying enhancers and their strength with ensemble learning approach.
Liu, Bin; Li, Kai; Huang, De-Shuang; Chou, Kuo-Chen
2018-06-07
Identification of enhancers and their strength is important because they play a critical role in controlling gene expression. Although some bioinformatics tools have been developed, they are limited to discriminating enhancers from non-enhancers. Recently, a two-layer predictor called "iEnhancer-2L" was developed that can also be used to predict an enhancer's strength. However, its prediction quality needs further improvement to increase its practical value. A new predictor called "iEnhancer-EL" is proposed that contains two layers of predictors: the first (for identifying enhancers) is formed by fusing an array of six key individual classifiers, and the second (for predicting their strength) is formed by fusing an array of ten key individual classifiers. All these key classifiers were selected from 171 elementary classifiers built with SVM (Support Vector Machine) on kmer, subsequence profile, and PseKNC (Pseudo K-tuple Nucleotide Composition) features, respectively. Rigorous cross-validations have indicated that the proposed predictor is remarkably superior to the existing state-of-the-art one in this area. A web server for iEnhancer-EL has been established at http://bioinformatics.hitsz.edu.cn/iEnhancer-EL/, by which users can easily get their desired results without the need to go through the mathematical details. Contact: bliu@hit.edu.cn, dshuang@tongji.edu.cn or kcchou@gordonlifescience.org. Supplementary data are available at Bioinformatics online.
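The two ingredients named in this abstract, kmer features and the fusion of several classifiers, can be illustrated with a minimal sketch. It is not the iEnhancer-EL implementation: the SVM classifiers are not reproduced, simple majority voting is assumed as the fusion rule because the abstract does not specify one, and the function names, the example sequence and the six example votes are all hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies of a DNA sequence in a fixed order."""
    seq = seq.upper()
    # Enumerate all 4**k possible k-mers so every sequence maps to the
    # same feature positions (AA, AC, AG, AT, CA, ... for k=2).
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def majority_vote(predictions):
    """Fuse binary predictions (0 or 1) by majority vote; ties resolve to 0."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


# Feature vector for a toy sequence (16 values for k=2).
features = kmer_frequencies("ACGTACGT", k=2)

# Hypothetical outputs of six individual classifiers for one sequence.
fused = majority_vote([1, 0, 1, 1, 0, 1])
print(len(features), fused)
```

In a real ensemble each vote would come from a trained classifier applied to one feature representation; the sketch only shows how fixed-length features and a fused decision fit together.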
COEUS: “semantic web in a box” for biomedical applications
2012-01-01
Background: As the “omics” revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter’s complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. Results: COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a “semantic web in a box” approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. Conclusions: The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/. PMID:23244467
COEUS: "semantic web in a box" for biomedical applications.
Lopes, Pedro; Oliveira, José Luís
2012-12-17
As the "omics" revolution unfolds, the growth in data quantity and diversity is bringing about the need for pioneering bioinformatics software, capable of significantly improving the research workflow. To cope with these computer science demands, biomedical software engineers are adopting emerging semantic web technologies that better suit the life sciences domain. The latter's complex relationships are easily mapped into semantic web graphs, enabling a superior understanding of collected knowledge. Despite increased awareness of semantic web technologies in bioinformatics, their use is still limited. COEUS is a new semantic web framework, aiming at a streamlined application development cycle and following a "semantic web in a box" approach. The framework provides a single package including advanced data integration and triplification tools, base ontologies, a web-oriented engine and a flexible exploration API. Resources can be integrated from heterogeneous sources, including CSV and XML files or SQL and SPARQL query results, and mapped directly to one or more ontologies. Advanced interoperability features include REST services, a SPARQL endpoint and LinkedData publication. These enable the creation of multiple applications for web, desktop or mobile environments, and empower a new knowledge federation layer. The platform, targeted at biomedical application developers, provides a complete skeleton ready for rapid application deployment, enhancing the creation of new semantic information systems. COEUS is available as open source at http://bioinformatics.ua.pt/coeus/.
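To make the idea of "triplification" concrete, here is a minimal sketch of turning CSV rows into subject-predicate-object triples. It does not use or reproduce the COEUS API; the namespace, the function name `triplify` and the sample rows are assumptions made purely for illustration.

```python
import csv
import io

# Hypothetical namespace; a real deployment would map to ontology terms.
BASE = "http://example.org/resource/"


def triplify(csv_text, id_column):
    """Convert CSV rows into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The identifier column names the subject of every triple in the row.
        subject = BASE + row[id_column]
        for column, value in row.items():
            # Every other non-empty cell becomes one predicate/object pair.
            if column != id_column and value:
                triples.append((subject, BASE + column, value))
    return triples


# Illustrative sample data, not taken from the source.
sample = "gene,symbol,chromosome\n672,BRCA1,17\n675,BRCA2,13\n"
triples = triplify(sample, "gene")
for triple in triples:
    print(triple)
```

Each CSV cell becomes one statement about the row's entity, which is what allows records from heterogeneous sources to be merged into a single graph and queried together.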
Borrel, Alexandre; Fourches, Denis
2017-12-01
There is growing interest in the broad use of Augmented Reality (AR) and Virtual Reality (VR) in the fields of bioinformatics and cheminformatics to visualize complex biological and chemical structures. AR and VR technologies allow for stunning and immersive experiences, offering untapped opportunities for both research and education. However, preparing 3D models ready for use in AR and VR is time-consuming and requires technical expertise, which severely limits the development of new content of potential interest to structural biologists, medicinal chemists, molecular modellers and teachers. Herein we present the RealityConvert software tool and associated website, which allow users to easily convert molecular objects to high-quality 3D models directly compatible with AR and VR applications. For chemical structures, in addition to generating the 3D model, RealityConvert also generates image trackers, useful to universally call and anchor that particular 3D model when used in AR applications. The ultimate goal of RealityConvert is to facilitate and boost the development and accessibility of AR and VR content for bioinformatics and cheminformatics applications. Availability: http://www.realityconvert.com. Contact: dfourch@ncsu.edu. Supplementary data are available at Bioinformatics online.
Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools
Diaz Acosta, B.
2011-01-01
The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative comprises two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework, initially aimed at the area of genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.
Bioinformatics in high school biology curricula: a study of state science standards.
Wefer, Stephen H; Sheppard, Keith
2008-01-01
The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students.
Cheng, Gong; Lu, Quan; Ma, Ling; Zhang, Guocai; Xu, Liang; Zhou, Zongshan
2017-01-01
Recently, Docker technology has received increasing attention throughout the bioinformatics community. However, its implementation has not yet been mastered by most biologists; accordingly, its application in biological research has been limited. In order to popularize this technology in the field of bioinformatics and to promote the use of publicly available bioinformatics tools, such as Dockerfiles and Images from communities, government sources, and private owners in the Docker Hub Registry and other Docker-based resources, we introduce here a complete and accurate bioinformatics workflow based on Docker. The present workflow enables analysis and visualization of pan-genomes and biosynthetic gene clusters of bacteria. This provides a new solution for bioinformatics mining of big data from various publicly available biological databases. The present step-by-step guide creates an integrative workflow through a Dockerfile to allow researchers to build their own Image and run Container easily.
Cheng, Gong; Zhang, Guocai; Xu, Liang
2017-01-01
Recently, Docker technology has received increasing attention throughout the bioinformatics community. However, its implementation has not yet been mastered by most biologists; accordingly, its application in biological research has been limited. In order to popularize this technology in the field of bioinformatics and to promote the use of publicly available bioinformatics tools, such as Dockerfiles and Images from communities, government sources, and private owners in the Docker Hub Registry and other Docker-based resources, we introduce here a complete and accurate bioinformatics workflow based on Docker. The present workflow enables analysis and visualization of pan-genomes and biosynthetic gene clusters of bacteria. This provides a new solution for bioinformatics mining of big data from various publicly available biological databases. The present step-by-step guide creates an integrative workflow through a Dockerfile to allow researchers to build their own Image and run Container easily. PMID:29204317
[Integration of clinical and biological data in clinical practice using bioinformatics].
Coltell, Oscar; Arregui, María; Fabregat, Antonio; Portolés, Olga
2008-05-01
The aim of our work is to describe essential aspects of Medical Informatics, Bioinformatics and Biomedical Informatics that are used in biomedical research and clinical practice. These disciplines have emerged from the need to find new scientific and technical approaches to manage, store, analyze and report data generated in clinical practice, molecular biology and other medical specialties. They can also be useful for integrating research information generated in different areas of health care. Moreover, these disciplines are interdisciplinary and integrative, two key features not shared by other areas of medical knowledge. Finally, when Bioinformatics and Biomedical Informatics approaches are applied to medical investigation and practice, a new discipline, called Clinical Bioinformatics, emerges. The latter requires a specific training program to create a new professional profile. We have not been able to find a specific training program in Clinical Bioinformatics in Spain.
Bioinformatics in High School Biology Curricula: A Study of State Science Standards
Sheppard, Keith
2008-01-01
The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students. PMID:18316818
Bioinformatics and molecular modeling in glycobiology
Schloissnig, Siegfried
2010-01-01
The field of glycobiology is concerned with the study of the structure, properties, and biological functions of the family of biomolecules called carbohydrates. Bioinformatics for glycobiology is a particularly challenging field, because carbohydrates exhibit a high structural diversity and their chains are often branched. Significant improvements in experimental analytical methods over recent years have led to a tremendous increase in the amount of carbohydrate structure data generated. Consequently, the availability of databases and tools to store, retrieve and analyze these data in an efficient way is of fundamental importance to progress in glycobiology. In this review, the various graphical representations and sequence formats of carbohydrates are introduced, and an overview of newly developed databases, the latest developments in sequence alignment and data mining, and tools to support experimental glycan analysis is presented. Finally, the field of structural glycoinformatics and the molecular modeling of carbohydrates, glycoproteins, and protein–carbohydrate interactions are reviewed. PMID:20364395
Tools for visually exploring biological networks.
Suderman, Matthew; Hallett, Michael
2007-10-15
Many tools exist for visually exploring biological networks including well-known examples such as Cytoscape, VisANT, Pathway Studio and Patika. These systems play a key role in the development of integrative biology, systems biology and integrative bioinformatics. The trend in the development of these tools is to go beyond 'static' representations of cellular state, towards a more dynamic model of cellular processes through the incorporation of gene expression data, subcellular localization information and time-dependent behavior. We provide a comprehensive review of the relative advantages and disadvantages of existing systems with two goals in mind: to aid researchers in efficiently identifying the appropriate existing tools for data visualization; to describe the necessary and realistic goals for the next generation of visualization tools. In view of the first goal, we provide in the Supplementary Material a systematic comparison of more than 35 existing tools in terms of over 25 different features. Supplementary data are available at Bioinformatics online.
Design and Development of ChemInfoCloud: An Integrated Cloud Enabled Platform for Virtual Screening.
Karthikeyan, Muthukumarasamy; Pandit, Deepak; Bhavasar, Arvind; Vyas, Renu
2015-01-01
The power of cloud computing and distributed computing has been harnessed to handle the vast and heterogeneous data that must be processed in any virtual screening protocol. A cloud computing platform, ChemInfoCloud, was built and integrated with several chemoinformatics and bioinformatics tools. The robust engine performs the core chemoinformatics tasks of lead generation, lead optimisation and property prediction in a fast and efficient manner. It has also been provided with some bioinformatics functionalities, including sequence alignment, active site pose prediction and protein ligand docking. Text mining, NMR chemical shift (1H, 13C) prediction and reaction fingerprint generation modules for efficient lead discovery are also implemented in this platform. We have developed an integrated problem solving cloud environment for virtual screening studies that also provides workflow management, better usability and interaction with end users using container based virtualization, OpenVz.
Exploring Wound-Healing Genomic Machinery with a Network-Based Approach
Vitali, Francesca; Marini, Simone; Balli, Martina; Grosemans, Hanne; Sampaolesi, Maurilio; Lussier, Yves A.; Cusella De Angelis, Maria Gabriella; Bellazzi, Riccardo
2017-01-01
The molecular mechanisms underlying tissue regeneration and wound healing are still poorly understood despite their importance. In this paper we develop a bioinformatics approach, combining biology and network theory, to drive experiments for better understanding the genetic underpinnings of wound healing mechanisms and for selecting potential drug targets. We start by selecting literature-relevant genes in murine wound healing, and inferring from them a Protein-Protein Interaction (PPI) network. Then, we analyze the network to rank wound healing-related genes according to their topological properties. Lastly, we perform a procedure for in-silico simulation of a treatment action in a biological pathway. The findings obtained by applying the developed pipeline, including gene expression analysis, confirm that a network-based bioinformatics method is able to prioritize candidate genes for in vitro analysis, thus speeding up the understanding of molecular mechanisms and supporting the discovery of potential drug targets. PMID:28635674
Pineda, Sandy S; Chaumeil, Pierre-Alain; Kunert, Anne; Kaas, Quentin; Thang, Mike W C; Le, Lien; Nuhn, Michael; Herzig, Volker; Saez, Natalie J; Cristofori-Armstrong, Ben; Anangi, Raveendra; Senff, Sebastian; Gorse, Dominique; King, Glenn F
2018-03-15
ArachnoServer is a manually curated database that consolidates information on the sequence, structure, function and pharmacology of spider-venom toxins. Although spider venoms are complex chemical arsenals, the primary constituents are small disulfide-bridged peptides that target neuronal ion channels and receptors. Due to their high potency and selectivity, these peptides have been developed as pharmacological tools, bioinsecticides and drug leads. A new version of ArachnoServer (v3.0) has been developed that includes a bioinformatics pipeline for automated detection and analysis of peptide toxin transcripts in assembled venom-gland transcriptomes. ArachnoServer v3.0 was updated with the latest sequence, structure and functional data, the search-by-mass feature has been enhanced, and toxin cards provide additional information about each mature toxin. Availability: http://arachnoserver.org. Contact: support@arachnoserver.org. Supplementary data are available at Bioinformatics online.
Public data and open source tools for multi-assay genomic investigation of disease.
Kannan, Lavanya; Ramos, Marcel; Re, Angela; El-Hachem, Nehme; Safikhani, Zhaleh; Gendoo, Deena M A; Davis, Sean; Gomez-Cabrero, David; Castelo, Robert; Hansen, Kasper D; Carey, Vincent J; Morgan, Martin; Culhane, Aedín C; Haibe-Kains, Benjamin; Waldron, Levi
2016-07-01
Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays, has the potential to provide a systems level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources; however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay genomic data resources in the areas of clinical oncology, pharmacogenomics and other perturbation experiments, population genomics and regulatory genomics and other areas, and tools for data acquisition. Finally, we review bioinformatic tools that are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for accessing publicly available data and tools to support development of needed integrative methods. © The Author 2015. Published by Oxford University Press.
Han, Rowland H.; Wang, Miao; Fang, Xiaoling; Han, Xianlin
2013-01-01
Although the synthesis pathways of intracellular triacylglycerol (TAG) species have been well elucidated, assessment of the contribution of an individual pathway to TAG pools in different mammalian organs, particularly under pathophysiological conditions, is difficult, although not impossible. Herein, we developed and validated a novel bioinformatic approach to assess the differential contributions of the known pathways to TAG pools through simulation of TAG ion profiles determined by shotgun lipidomics. This powerful approach was applied to determine such contributions in mouse heart, liver, and skeletal muscle and to examine the changes of these pathways in mouse liver induced after treatment with a high-fat diet. It was clearly demonstrated that assessment of the altered TAG biosynthesis pathways under pathophysiological conditions can be readily achieved through simulation of lipidomics data. Collectively, this new development should greatly facilitate our understanding of the biochemical mechanisms underpinning TAG accumulation at the states of obesity and lipotoxicity. PMID:23365150
Unity in defence: honeybee workers exhibit conserved molecular responses to diverse pathogens.
Doublet, Vincent; Poeschl, Yvonne; Gogol-Döring, Andreas; Alaux, Cédric; Annoscia, Desiderato; Aurori, Christian; Barribeau, Seth M; Bedoya-Reina, Oscar C; Brown, Mark J F; Bull, James C; Flenniken, Michelle L; Galbraith, David A; Genersch, Elke; Gisder, Sebastian; Grosse, Ivo; Holt, Holly L; Hultmark, Dan; Lattorff, H Michael G; Le Conte, Yves; Manfredini, Fabio; McMahon, Dino P; Moritz, Robin F A; Nazzi, Francesco; Niño, Elina L; Nowick, Katja; van Rij, Ronald P; Paxton, Robert J; Grozinger, Christina M
2017-03-02
Organisms typically face infection by diverse pathogens, and hosts are thought to have developed specific responses to each type of pathogen they encounter. The advent of transcriptomics now makes it possible to test this hypothesis and compare host gene expression responses to multiple pathogens at a genome-wide scale. Here, we performed a meta-analysis of multiple published and new transcriptomes using a newly developed bioinformatics approach that filters genes based on their expression profile across datasets. Thereby, we identified common and unique molecular responses of a model host species, the honey bee (Apis mellifera), to its major pathogens and parasites: the Microsporidia Nosema apis and Nosema ceranae, RNA viruses, and the ectoparasitic mite Varroa destructor, which transmits viruses. We identified a common suite of genes and conserved molecular pathways that respond to all investigated pathogens, a result that suggests a commonality in response mechanisms to diverse pathogens. We found that genes differentially expressed after infection exhibit a higher evolutionary rate than non-differentially expressed genes. Using our new bioinformatics approach, we unveiled additional pathogen-specific responses of honey bees; we found that apoptosis appeared to be an important response following microsporidian infection, while genes from the immune signalling pathways, Toll and Imd, were differentially expressed after Varroa/virus infection. Finally, we applied our bioinformatics approach and generated a gene co-expression network to identify highly connected (hub) genes that may represent important mediators and regulators of anti-pathogen responses. Our meta-analysis generated a comprehensive overview of the host metabolic and other biological processes that mediate interactions between insects and their pathogens. 
We identified key host genes and pathways that respond to phylogenetically diverse pathogens, representing an important source for future functional studies as well as offering new routes to identify or generate pathogen resilient honey bee stocks. The statistical and bioinformatics approaches that were developed for this study are broadly applicable to synthesize information across transcriptomic datasets. These approaches will likely have utility in addressing a variety of biological questions.
No-boundary thinking in bioinformatics research
2013-01-01
Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT). PMID:24192339
ERIC Educational Resources Information Center
Barker, Daniel; Alderson, Rosanna G.; McDonagh, James L.; Plaisier, Heleen; Comrie, Muriel M.; Duncan, Leigh; Muirhead, Gavin T. P.; Sweeney, Stuart D.
2015-01-01
Background: Bioinformatics--the use of computers in biology--is of major and increasing importance to biological sciences and medicine. We conducted a preliminary investigation of the value of bringing practical, university-level bioinformatics education to the school level. We conducted voluntary activities for pupils at two schools in Scotland…
The Air Force In Silico -- Computational Biology in 2025
2007-11-01
and chromosome) these new fields are commonly referred to as “~omics.” Proteomics, transcriptomics, metabolomics, epigenomics, physiomics... Bioinformatics, 2006, http://journal.imbio.de/ http://www-bm.ipk-gatersleben.de/stable/php/journal/articles/pdf/jib-22.pdf (accessed 30 September...Chirino, G. Tansley and I. Dryden, “The implications for Bioinformatics of integration across physical scales,” Journal of Integrative Bioinformatics
Online Tools for Bioinformatics Analyses in Nutrition Sciences
Malkaram, Sridhar A.; Hassan, Yousef I.; Zempleni, Janos
2012-01-01
Recent advances in “omics” research have resulted in the creation of large datasets that were generated by consortiums and centers, small datasets that were generated by individual investigators, and bioinformatics tools for mining these datasets. It is important for nutrition laboratories to take full advantage of the analysis tools to interrogate datasets for information relevant to genomics, epigenomics, transcriptomics, proteomics, and metabolomics. This review provides guidance regarding bioinformatics resources that are currently available in the public domain, with the intent to provide a starting point for investigators who want to take advantage of the opportunities provided by the bioinformatics field. PMID:22983844
Weisman, David
2010-01-01
Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a face-to-face lecture course with a web-based virtual laboratory presents new opportunities for collaborative learning of the conceptual material, and for fostering peer support of technical bioinformatics questions. To explore this combination, an in-person lecture-only undergraduate bioinformatics course was augmented with a remote web-based laboratory, and tested with a large class. This study hypothesized that the collaborative virtual lab would foster active learning and peer support, and tested this hypothesis by conducting a student survey near the end of the semester. Respondents broadly reported strong benefits from the online laboratory, and strong benefits from peer-provided technical support. In comparison with traditional in-person teaching labs, students preferred the virtual lab by a factor of two. Key aspects of the course architecture and design are described to encourage further experimentation in teaching collaborative online bioinformatics laboratories. Copyright © 2010 International Union of Biochemistry and Molecular Biology, Inc.
Taking Bioinformatics to Systems Medicine.
van Kampen, Antoine H C; Moerland, Perry D
2016-01-01
Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.
Walther, Stefanie; Tietze, Manfred; Czerny, Claus-Peter; König, Sven; Diesterbeck, Ulrike S
2016-01-01
We have developed a new bioinformatics framework for the analysis of rearranged bovine heavy chain immunoglobulin (Ig) variable regions by combining and refining widely used alignment algorithms. This bioinformatics framework allowed us to investigate alignments of heavy chain framework regions (FRHs) and the separate alignments of FRHs and heavy chain complementarity determining regions (CDRHs) to determine their germline origin in the four cattle breeds Aubrac, German Black Pied, German Simmental, and Holstein Friesian. Now it is also possible to specifically analyze Ig heavy chains possessing exceptionally long CDR3Hs. In order to gain more insight into breed specific differences in Ig combinatorial diversity, somatic hypermutations and putative gene conversions of IgG, we compared the dominantly transcribed variable (IGHV), diversity (IGHD), and joining (IGHJ) segments and their recombination in the four cattle breeds. The analysis revealed the use of 15 different IGHV segments, 21 IGHD segments, and two IGHJ segments with significantly different transcription levels within the breeds. Furthermore, there are preferred rearrangements within the three groups of CDR3H lengths. In the sequences of group 2 (CDR3H lengths (L) of 11-47 amino acid residues (aa)) a higher number of recombinations was observed than in sequences of group 1 (L≤10 aa) and 3 (L≥48 aa). The combinatorial diversity of germline IGHV, IGHD, and IGHJ-segments revealed 162 rearrangements that were significantly different. The few preferably rearranged gene segments within group 3 CDR3H regions may indicate specialized antibodies because this length is unique in cattle. The most important finding of this study, which was enabled by using the bioinformatics framework, is the discovery of strong evidence for gene conversion as a rare event using pseudogenes fulfilling all definitions for this particular diversification mechanism.
Martin, Guillaume; Baurens, Franc-Christophe; Droc, Gaëtan; Rouard, Mathieu; Cenci, Alberto; Kilian, Andrzej; Hastie, Alex; Doležel, Jaroslav; Aury, Jean-Marc; Alberti, Adriana; Carreel, Françoise; D'Hont, Angélique
2016-03-16
Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve reference genome assembly of banana (Musa acuminata). We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools proposed an expert mode for a user who can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80%), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5% of the assembly was anchored to the 11 Musa chromosomes compared to the previous 70%. Unknown sites (N) were reduced from 17.3 to 10.0%. The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. 
Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in other species.
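The assembly figures quoted above (an N50 of 3.0 Mb reached with 26 scaffolds) follow the usual definition of N50: the length of the scaffold at which the running total of scaffold lengths, taken from longest to shortest, first reaches half of the assembly size. The short Python sketch below illustrates that definition only; the function name and the toy scaffold lengths are assumptions made for illustration and are not taken from the Musa pipeline.

```python
def n50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50 is the length of the scaffold at which the cumulative length,
    summed from the longest scaffold downwards, first reaches half of
    the total assembly length. L50 is the number of scaffolds needed
    to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no scaffold lengths supplied")


# Toy data (hypothetical lengths, arbitrary units).
toy_scaffolds = [8, 5, 4, 3, 2, 2, 1]
print(n50(toy_scaffolds))
```

Read this way, "3.0 Mb (26 scaffolds)" means that the 26 longest scaffolds, each at least 3.0 Mb long, together cover at least half of the assembly.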
Czerny, Claus-Peter; König, Sven; Diesterbeck, Ulrike S.
2016-01-01
We have developed a new bioinformatics framework for the analysis of rearranged bovine heavy chain immunoglobulin (Ig) variable regions by combining and refining widely used alignment algorithms. This bioinformatics framework allowed us to investigate alignments of heavy chain framework regions (FRHs) and the separate alignments of FRHs and heavy chain complementarity determining regions (CDRHs) to determine their germline origin in the four cattle breeds Aubrac, German Black Pied, German Simmental, and Holstein Friesian. Now it is also possible to specifically analyze Ig heavy chains possessing exceptionally long CDR3Hs. In order to gain more insight into breed specific differences in Ig combinatorial diversity, somatic hypermutations and putative gene conversions of IgG, we compared the dominantly transcribed variable (IGHV), diversity (IGHD), and joining (IGHJ) segments and their recombination in the four cattle breeds. The analysis revealed the use of 15 different IGHV segments, 21 IGHD segments, and two IGHJ segments with significantly different transcription levels within the breeds. Furthermore, there are preferred rearrangements within the three groups of CDR3H lengths. In the sequences of group 2 (CDR3H lengths (L) of 11–47 amino acid residues (aa)) a higher number of recombinations was observed than in sequences of group 1 (L≤10 aa) and 3 (L≥48 aa). The combinatorial diversity of germline IGHV, IGHD, and IGHJ-segments revealed 162 rearrangements that were significantly different. The few preferably rearranged gene segments within group 3 CDR3H regions may indicate specialized antibodies because this length is unique in cattle. The most important finding of this study, which was enabled by using the bioinformatics framework, is the discovery of strong evidence for gene conversion as a rare event using pseudogenes fulfilling all definitions for this particular diversification mechanism. PMID:27828971
Video bioinformatics analysis of human embryonic stem cell colony growth.
Lin, Sabrina; Fonteno, Shawn; Satish, Shruthi; Bhanu, Bir; Talbot, Prue
2010-05-20
Because video data are complex and are composed of many images, mining information from video material is difficult to do without the aid of computer software. Video bioinformatics is a powerful quantitative approach for extracting spatio-temporal data from video images using computer software to perform data mining and analysis. In this article, we introduce a video bioinformatics method for quantifying the growth of human embryonic stem cells (hESC) by analyzing time-lapse videos collected in a Nikon BioStation CT incubator equipped with a camera for video imaging. In our experiments, hESC colonies that were attached to Matrigel were filmed for 48 hours in the BioStation CT. To determine the rate of growth of these colonies, recipes were developed using CL-Quant software, which enables users to extract various types of data from video images. To accurately evaluate colony growth, three recipes were created. The first segmented the image into the colony and background, the second enhanced the image to define colonies throughout the video sequence accurately, and the third measured the number of pixels in the colony over time. The three recipes were run in sequence on video data collected in a BioStation CT to analyze the rate of growth of individual hESC colonies over 48 hours. To verify the truthfulness of the CL-Quant recipes, the same data were analyzed manually using Adobe Photoshop software. When the data obtained using the CL-Quant recipes and Photoshop were compared, results were virtually identical, indicating the CL-Quant recipes were truthful. The method described here could be applied to any video data to measure growth rates of hESC or other cells that grow in colonies. In addition, other video bioinformatics recipes can be developed in the future for other cell processes such as migration, apoptosis, and cell adhesion.
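The third recipe described above, counting colony pixels in each frame and following that count over time, can be illustrated with a minimal Python sketch. It assumes segmentation has already produced one binary mask per frame (1 for colony, 0 for background). The function names and the toy 3×3 masks are hypothetical, and the sketch does not reproduce the CL-Quant recipes themselves.

```python
def colony_area(mask):
    """Count the pixels labelled as colony (truthy values) in one frame."""
    return sum(1 for row in mask for pixel in row if pixel)


def growth_per_frame(masks):
    """Return the colony area of every frame and the change between frames."""
    areas = [colony_area(mask) for mask in masks]
    deltas = [later - earlier for earlier, later in zip(areas, areas[1:])]
    return areas, deltas


# Three toy frames of a growing colony (hypothetical data).
frames = [
    [[0, 1, 0], [0, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    [[1, 1, 1], [1, 1, 1], [0, 1, 0]],
]
areas, deltas = growth_per_frame(frames)
print(areas, deltas)
```

Dividing each change in area by the time between frames would give a growth rate in pixels per unit time; converting that to a physical area would need the image scale, which the abstract does not state.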
Biosensor Recognition Elements
2008-01-01
Systematics, bioinformatics, systems biology, regulation, genetics, genomics, metabolism, ecology, development. Epstein-Barr Virus Latency and...and C, Simian immunodeficiency, Ebola, Rabies, Epstein-Barr, and Measles viruses as well as biological agents such as botulinum neurotoxin A/B...time metabolic vigilance via sensor based ligand specific biorecognition elements is immense. Virus-based nanoparticles have been developed for
The Top Five “Game Changers” in Vaccinology: Toward Rational and Directed Vaccine Development
Kennedy, Richard B.
2011-01-01
Despite the tremendous success of the classical “isolate, inactivate, and inject” approach to vaccine development, new breakthroughs in vaccine research are increasingly reliant on novel approaches that incorporate cutting edge technology and advances in innate and adaptive immunology, microbiology, virology, pathogen biology, genetics, bioinformatics, and many other disciplines in order to: (1) deepen our understanding of the key biological processes that lead to protective immunity, (2) observe vaccine responses on a global, systems level, and (3) directly apply the new knowledge gained to the development of next-generation vaccines with improved safety profiles, enhanced efficacy, and even targeted utility in select populations. Here we highlight five key components foundational to vaccinomics efforts: applied immunogenomics, next generation sequencing and other cutting-edge “omics” technologies, advanced bioinformatics and analysis techniques, and finally, systems biology applied to immune profiling and vaccine responses. We believe these “game changers” will play a critical role in moving us toward the rational and directed development of new vaccines in the 21st century. PMID:21815811
A new paradigm for transcription factor TFIIB functionality
Gelev, Vladimir; Zabolotny, Janice M.; Lange, Martin; Hiromura, Makoto; Yoo, Sang Wook; Orlando, Joseph S.; Kushnir, Anna; Horikoshi, Nobuo; Paquet, Eric; Bachvarov, Dimcho; Schaffer, Priscilla A.; Usheva, Anny
2014-01-01
Experimental and bioinformatic studies of transcription initiation by RNA polymerase II (RNAP2) have revealed a mechanism of RNAP2 transcription initiation less uniform across gene promoters than initially thought. However, the general transcription factor TFIIB is presumed to be universally required for RNAP2 transcription initiation. Based on bioinformatic analysis of data and effects of TFIIB knockdown in primary and transformed cell lines on cellular functionality and global gene expression, we report that TFIIB is dispensable for transcription of many human promoters, but is essential for herpes simplex virus-1 (HSV-1) gene transcription and replication. We report a novel cell cycle TFIIB regulation and localization of the acetylated TFIIB variant on the transcriptionally silent mitotic chromatids. Taken together, these results establish a new paradigm for TFIIB functionality in human gene expression, which when downregulated has potent anti-viral effects. PMID:24441171
De Oliveira, T; Miller, R; Tarin, M; Cassol, S
2003-01-01
Sequence databases encode a wealth of information needed to develop improved vaccination and treatment strategies for the control of HIV and other important pathogens. To facilitate effective utilization of these datasets, we developed a user-friendly GDE-based LINUX interface that reduces input/output file formatting. GDE was adapted to the Linux operating system, bioinformatics tools were integrated with microbe-specific databases, and up-to-date GDE menus were developed for several clinically important viral, bacterial and parasitic genomes. Each microbial interface was designed for local access and contains Genbank, BLAST-formatted and phylogenetic databases. GDE-Linux is available for research purposes by direct application to the corresponding author. Application-specific menus and support files can be downloaded from (http://www.bioafrica.net).
Regnström, Karin J
2008-01-01
The development of vaccines, both conventional protein-based and nucleic acid-based, and of their delivery systems has been largely empirical and ineffective. This is partly due to a lack of methodology, since traditionally only a few markers are studied. By introducing gene expression analysis and bioinformatics into the design of vaccines and their delivery systems, vaccine development can be improved and accelerated considerably. Each vaccine antigen and delivery system combination is characterized by a unique genomic profile, a "fingerprint" that gives information not only on immunological and toxicological responses but also on other related cellular responses, e.g. cell cycle, apoptosis and carcinogenic effects. The resulting unique genomic fingerprint facilitates the establishment of molecular structure-pharmacological activity relationships and therefore leads to optimization of vaccine development.
Wei, Jiangyong; Hu, Xiaohua; Zou, Xiufen; Tian, Tianhai
2017-12-28
Recent advances in omics technologies have raised great opportunities to study large-scale regulatory networks inside the cell. In addition, single-cell experiments have measured the gene and protein activities in a large number of cells under the same experimental conditions. However, a significant challenge in computational biology and bioinformatics is how to derive quantitative information from the single-cell observations and how to develop sophisticated mathematical models to describe the dynamic properties of regulatory networks using the derived quantitative information. This work designs an integrated approach to reverse-engineer gene networks for regulating early blood development based on single-cell experimental observations. The wanderlust algorithm is initially used to develop the pseudo-trajectory for the activities of a number of genes. Since the gene expression data in the developed pseudo-trajectory show large fluctuations, we then use Gaussian process regression methods to smooth the gene expression data in order to obtain pseudo-trajectories with much smaller fluctuations. The proposed integrated framework consists of both bioinformatics algorithms to reconstruct the regulatory network and mathematical models using differential equations to describe the dynamics of gene expression. The developed approach is applied to study the network regulating early blood cell development. A graphic model is constructed for a regulatory network with forty genes and a dynamic model using differential equations is developed for a network of nine genes. Numerical results suggest that the proposed model is able to match experimental data very well. We also examine the networks with more regulatory relations and numerical results show that more regulations may exist. We test the possibility of auto-regulation but numerical simulations do not support positive auto-regulation.
In addition, robustness is used as an important additional criterion to select candidate networks. The research results in this work show that the developed approach is an efficient and effective method to reverse-engineer gene networks using single-cell experimental observations.