Science.gov

Sample records for personal genomics bioinformatics

  1. Bioinformatics for personal genome interpretation.

    PubMed

    Capriotti, Emidio; Nehrt, Nathan L; Kann, Maricel G; Bromberg, Yana

    2012-07-01

    An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of this data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field--the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.

  2. Genome Exploitation and Bioinformatics Tools

    NASA Astrophysics Data System (ADS)

    de Jong, Anne; van Heel, Auke J.; Kuipers, Oscar P.

    Bioinformatic tools can greatly improve the efficiency of bacteriocin screening efforts by limiting the number of strains that need to be screened. Different classes of bacteriocins can be detected in genomes by looking at different features. Finding small bacteriocins can be especially challenging because of low homology and because small open reading frames (ORFs) are often omitted from annotations. In this chapter, several bioinformatic tools and strategies for identifying bacteriocins in genomes are discussed.
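
    The point about small ORFs being omitted from annotations can be illustrated with a minimal sketch. It is not one of the tools discussed in the chapter; the function names and the codon-length thresholds are hypothetical choices made only for illustration. It scans all six reading frames of a DNA string and reports ORFs whose length falls inside a small window, which is the kind of candidate a length-filtered annotation pipeline might drop.

```python
# Illustrative sketch only: names and thresholds are assumptions, not a published tool.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_small_orfs(seq, min_codons=10, max_codons=100):
    """Scan six frames for ATG...stop ORFs within a codon-length window.

    Returns (strand, start, end, orf) tuples. Coordinates are 0-based,
    end-exclusive, and relative to the strand that was scanned.
    The codon count excludes the stop codon.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            i = frame
            while i + 3 <= len(s):
                if s[i:i + 3] == START:
                    # Extend codon by codon until a stop codon or the end.
                    j = i + 3
                    while j + 3 <= len(s) and s[j:j + 3] not in STOPS:
                        j += 3
                    if j + 3 <= len(s):  # an in-frame stop codon was found
                        n_codons = (j - i) // 3
                        if min_codons <= n_codons <= max_codons:
                            hits.append((strand, i, j + 3, s[i:j + 3]))
                        i = j + 3  # resume scanning after the stop codon
                        continue
                i += 3
    return hits


if __name__ == "__main__":
    demo = "CC" + "ATG" + "GCT" * 12 + "TAA" + "GG"
    for hit in find_small_orfs(demo):
        print(hit)
```

    Real bacteriocin mining also relies on genomic context and homology searches, which this sketch does not attempt; it only shows why a minimum-length cutoff determines which small ORFs are ever considered.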

  3. Genomics, molecular imaging, bioinformatics, and bio-nano-info integration are synergistic components of translational medicine and personalized healthcare research

    PubMed Central

    2008-01-01

    Supported by National Science Foundation (NSF), International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design and International Journal of Functional Informatics and Personalized Medicine, IEEE 7th Bioinformatics and Bioengineering attracted more than 600 papers and 500 researchers and medical doctors. It was the only synergistic inter/multidisciplinary IEEE conference with 24 Keynote Lectures, 7 Tutorials, 5 Cutting-Edge Research Workshops and 32 Scientific Sessions including 11 Special Research Interest Sessions that were designed dynamically at Harvard in response to the current research trends and advances. The committee was very grateful for the IEEE Plenary Keynote Lectures given by: Dr. A. Keith Dunker (Indiana), Dr. Jun Liu (Harvard), Dr. Brian Athey (Michigan), Dr. Mark Borodovsky (Georgia Tech and President of ISIBM), Dr. Hamid Arabnia (Georgia and Vice-President of ISIBM), Dr. Ruzena Bajcsy (Berkeley and Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Chih-Ming Ho (UCLA and Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Andy Baxevanis (United States National Institutes of Health), Dr. Arif Ghafoor (Purdue), Dr. John Quackenbush (Harvard), Dr. Eric Jakobsson (UIUC), Dr. Vladimir Uversky (Indiana), Dr. Laura Elnitski (United States National Institutes of Health) and other world-class scientific leaders. The Harvard meeting was a large academic event fully sponsored by IEEE, both financially and academically. After a rigorous peer-review process, the committee selected 27 high-quality research papers from 600 submissions. The committee is grateful for contributions from keynote speakers Dr. Russ Altman (IEEE BIBM conference keynote lecturer on combining simulation and machine

  4. Genomics, molecular imaging, bioinformatics, and bio-nano-info integration are synergistic components of translational medicine and personalized healthcare research.

    PubMed

    Yang, Jack Y; Yang, Mary Qu; Arabnia, Hamid R; Deng, Youping

    2008-09-16

    Supported by National Science Foundation (NSF), International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design and International Journal of Functional Informatics and Personalized Medicine, IEEE 7th Bioinformatics and Bioengineering attracted more than 600 papers and 500 researchers and medical doctors. It was the only synergistic inter/multidisciplinary IEEE conference with 24 Keynote Lectures, 7 Tutorials, 5 Cutting-Edge Research Workshops and 32 Scientific Sessions including 11 Special Research Interest Sessions that were designed dynamically at Harvard in response to the current research trends and advances. The committee was very grateful for the IEEE Plenary Keynote Lectures given by: Dr. A. Keith Dunker (Indiana), Dr. Jun Liu (Harvard), Dr. Brian Athey (Michigan), Dr. Mark Borodovsky (Georgia Tech and President of ISIBM), Dr. Hamid Arabnia (Georgia and Vice-President of ISIBM), Dr. Ruzena Bajcsy (Berkeley and Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Chih-Ming Ho (UCLA and Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Andy Baxevanis (United States National Institutes of Health), Dr. Arif Ghafoor (Purdue), Dr. John Quackenbush (Harvard), Dr. Eric Jakobsson (UIUC), Dr. Vladimir Uversky (Indiana), Dr. Laura Elnitski (United States National Institutes of Health) and other world-class scientific leaders. The Harvard meeting was a large academic event fully sponsored by IEEE, both financially and academically. After a rigorous peer-review process, the committee selected 27 high-quality research papers from 600 submissions. The committee is grateful for contributions from keynote speakers Dr. Russ Altman (IEEE BIBM conference keynote lecturer on combining simulation and machine

  5. Bioinformatics Approach in Plant Genomic Research

    PubMed Central

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-01-01

    Advances in genomics technology have dramatically changed plant biology research. Plant biologists now have easy access to enormous amounts of genomic data with which to study high-density genetic variation in plants at the molecular level. Therefore, a thorough understanding of bioinformatics tools, and skill in using them to manage and analyze these data, is essential in current plant genome research. Many plant genome databases have been established and continue to expand. Meanwhile, bioinformatics-based analytical methods are well developed in many aspects of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant upgrading of computational infrastructure, such as high-capacity data storage and high-performance analysis software, is the real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the critical future need for effective tools to facilitate the translation of knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  6. Genomics and Bioinformatics of Parkinson's Disease

    PubMed Central

    Scholz, Sonja W.; Mhyre, Tim; Ressom, Habtom; Shah, Salim; Federoff, Howard J.

    2012-01-01

    Within the last two decades, genomics and bioinformatics have profoundly impacted our understanding of the molecular mechanisms of Parkinson's disease (PD). From the description of the first PD gene in 1997 until today, we have witnessed the emergence of new technologies that have revolutionized our concepts to identify genetic mechanisms implicated in human health and disease. Driven by the publication of the human genome sequence and followed by the description of detailed maps for common genetic variability, novel applications to rapidly scrutinize the entire genome in a systematic, cost-effective manner have become a reality. As a consequence, about 30 genetic loci have been unequivocally linked to the pathogenesis of PD highlighting essential molecular pathways underlying this common disorder. Herein we discuss how neurogenomics and bioinformatics are applied to dissect the nature of this complex disease with the overall aim of developing rational therapeutic interventions. PMID:22762024

  7. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.
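
    The first step named above, quality control of raw reads, can be sketched in a few lines. This is an illustrative example and not a tool from the review: it assumes four-line FASTQ records with Phred+33 quality encoding, and the function names and the mean-quality threshold of 20 are hypothetical choices.

```python
# Illustrative sketch only: assumes 4-line FASTQ records and Phred+33 encoding.
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 by default)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def filter_fastq(lines, min_mean_q=20):
    """Keep reads whose mean base quality is at least min_mean_q.

    `lines` is an iterable of FASTQ lines; returns (header, sequence,
    quality) tuples for the reads that pass. Malformed records, where
    sequence and quality lengths differ or the read is empty, are skipped.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    kept = []
    for k in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[k:k + 4]
        if seq and len(seq) == len(qual) and mean_phred(qual) >= min_mean_q:
            kept.append((header, seq, qual))
    return kept


if __name__ == "__main__":
    reads = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGT", "+", "!!!!"]
    for header, seq, qual in filter_fastq(reads):
        print(header, seq, round(mean_phred(qual), 1))
```

    Production pipelines usually also trim adapters and low-quality read ends before assembly or variant calling; a whole-read mean filter is only the simplest form of the idea.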

  8. Translational bioinformatics applications in genome medicine

    PubMed Central

    2009-01-01

    Although investigators using methodologies in bioinformatics have always been useful in genomic experimentation in analytic, engineering, and infrastructure support roles, only recently have bioinformaticians been able to have a primary scientific role in asking and answering questions on human health and disease. Here, I argue that this shift in role towards asking questions in medicine is now the next step needed for the field of bioinformatics. I outline four reasons why bioinformaticians are newly enabled to drive the questions in primary medical discovery: public availability of data, intersection of data across experiments, commoditization of methods, and streamlined validation. I also list four recommendations for bioinformaticians wishing to get more involved in translational research. PMID:19566916

  9. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  10. Advances in translational bioinformatics and population genomics in the Asia-Pacific.

    PubMed

    Ranganathan, Shoba; Tongsima, Sissades; Chan, Jonathan; Tan, Tin Wee; Schönbach, Christian

    2012-01-01

    The theme of the 2012 International Conference on Bioinformatics (InCoB) in Bangkok, Thailand was "From Biological Data to Knowledge to Technological Breakthroughs." Besides providing a forum for life scientists and bioinformatics researchers in the Asia-Pacific region to meet and interact, the conference also hosted thematic sessions on the Pan-Asian Pacific Genome Initiative and immunoinformatics. Over the seven years of conference papers published in BMC Bioinformatics and four years in BMC Genomics, we note that there is increasing interest in the applications of -omics technologies to the understanding of diseases, as a forerunner to personalized genomic medicine.

  11. Promoting synergistic research and education in genomics and bioinformatics

    PubMed Central

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demand for developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine through today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, United States of America, on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in genomics, biomedicine and bioinformatics. Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  12. 2K09 and thereafter : the coming era of integrative bioinformatics, systems biology and intelligent computing for functional genomics and personalized medicine research.

    PubMed

    Yang, Jack Y; Niemierko, Andrzej; Bajcsy, Ruzena; Xu, Dong; Athey, Brian D; Zhang, Aidong; Ersoy, Okan K; Li, Guo-Zheng; Borodovsky, Mark; Zhang, Joe C; Arabnia, Hamid R; Deng, Youping; Dunker, A Keith; Liu, Yunlong; Ghafoor, Arif

    2010-12-01

    Significant interest exists in establishing synergistic research in bioinformatics, systems biology and intelligent computing. Supported by the United States National Science Foundation (NSF), International Society of Intelligent Biological Medicine (http://www.ISIBM.org), International Journal of Computational Biology and Drug Design (IJCBDD) and International Journal of Functional Informatics and Personalized Medicine, the ISIBM International Joint Conferences on Bioinformatics, Systems Biology and Intelligent Computing (ISIBM IJCBS 2009) attracted more than 300 papers and 400 researchers and medical doctors world-wide. It was the only inter/multidisciplinary conference aimed to promote synergistic research and education in bioinformatics, systems biology and intelligent computing. The conference committee was very grateful for the valuable advice and suggestions from honorary chairs, steering committee members and scientific leaders including Dr. Michael S. Waterman (USC, Member of United States National Academy of Sciences), Dr. Chih-Ming Ho (UCLA, Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Wing H. Wong (Stanford, Member of United States National Academy of Sciences), Dr. Ruzena Bajcsy (UC Berkeley, Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Qu Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Andrzej Niemierko (Harvard), Dr. A. Keith Dunker (Indiana), Dr. Brian D. Athey (Michigan), Dr. Weida Tong (FDA, United States Department of Health and Human Services), Dr. Cathy H. Wu (Georgetown), Dr. Dong Xu (Missouri), Drs. Arif Ghafoor and Okan K Ersoy (Purdue), Dr. Mark Borodovsky (Georgia Tech, President of ISIBM), Dr. Hamid R. Arabnia (UGA, Vice-President of ISIBM), and other scientific leaders. The committee presented the 2009 ISIBM Outstanding Achievement Awards to Dr. Joydeep Ghosh (UT

  13. 2K09 and thereafter : the coming era of integrative bioinformatics, systems biology and intelligent computing for functional genomics and personalized medicine research

    PubMed Central

    2010-01-01

    Significant interest exists in establishing synergistic research in bioinformatics, systems biology and intelligent computing. Supported by the United States National Science Foundation (NSF), International Society of Intelligent Biological Medicine (http://www.ISIBM.org), International Journal of Computational Biology and Drug Design (IJCBDD) and International Journal of Functional Informatics and Personalized Medicine, the ISIBM International Joint Conferences on Bioinformatics, Systems Biology and Intelligent Computing (ISIBM IJCBS 2009) attracted more than 300 papers and 400 researchers and medical doctors world-wide. It was the only inter/multidisciplinary conference aimed to promote synergistic research and education in bioinformatics, systems biology and intelligent computing. The conference committee was very grateful for the valuable advice and suggestions from honorary chairs, steering committee members and scientific leaders including Dr. Michael S. Waterman (USC, Member of United States National Academy of Sciences), Dr. Chih-Ming Ho (UCLA, Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Wing H. Wong (Stanford, Member of United States National Academy of Sciences), Dr. Ruzena Bajcsy (UC Berkeley, Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Qu Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Andrzej Niemierko (Harvard), Dr. A. Keith Dunker (Indiana), Dr. Brian D. Athey (Michigan), Dr. Weida Tong (FDA, United States Department of Health and Human Services), Dr. Cathy H. Wu (Georgetown), Dr. Dong Xu (Missouri), Drs. Arif Ghafoor and Okan K Ersoy (Purdue), Dr. Mark Borodovsky (Georgia Tech, President of ISIBM), Dr. Hamid R. Arabnia (UGA, Vice-President of ISIBM), and other scientific leaders. The committee presented the 2009 ISIBM Outstanding Achievement Awards to Dr. Joydeep Ghosh (UT

  14. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  15. Bioinformatics for cancer management in the post-genome era.

    PubMed

    Katoh, Masuko; Katoh, Masaru

    2006-04-01

    Human cancer is caused by multiple factors, such as genetic predisposition, chronic persistent inflammation, environmental factors, life style, and aging. Dysregulated proliferation, dysregulated adhesion, resistance to apoptosis, resistance to senescence, and resistance to anti-cancer drugs are features of cancer cells. Accumulation of multiple epigenetic changes and genetic alterations of cancer-associated genes during multi-stage carcinogenesis results in more malignant phenotypes. Post-genome science is characterized by omics data related to genome, transcriptome, proteome, metabolome, interactome, and epigenome as well as by high-throughput technology, such as whole-genome tiling oligonucleotide array, array CGH with 32,433 overlapping BAC clones, transcriptome microarray, mass spectrometry, tissue-based expression array, and cell-based transfection array. Benchtop oncology supplies Desktop oncology with large amounts of omics data produced by high-throughput technology. Desktop oncology establishes knowledge on cancer-related biomarkers, such as predisposition markers, diagnostic markers, prognostic markers, and therapeutic markers, by using bioinformatics and human intelligence of experts for data mining and text mining. Bedside oncology applies the knowledge established by Desktop oncology to determine therapeutics for cancer patients. Antibody drugs (Trastuzumab/Herceptin, Cetuximab/Erbitux, Bevacizumab/Avastin, et cetera), small molecule inhibitors for tyrosine kinases (Gefitinib/Iressa, Erlotinib/Tarceva, Imatinib/Gleevec, et cetera), conventional cytotoxic drugs, and anti-hormonal drugs are used for cancer chemotherapy. Biomarker monitoring contributes to the choice of therapeutic options and the determination of drug dosage for cancer patients. Knowledge on biomarkers is fed forward from desktop to bedside in translational research, and biomarker monitoring is then fed back from bedside to desktop in reverse translational research. Desktop oncology is

  16. The bioinformatics of psychosocial genomics in alternative and complementary medicine.

    PubMed

    Rossi, E

    2003-06-01

    The bioinformatics of alternative and complementary medicine is outlined in 3 hypotheses that extend the molecular-genomic revolution initiated by Watson and Crick 50 years ago to include psychology in the new discipline of psychosocial and cultural genomics. Stress-induced changes in the alternative splicing of genes demonstrate how psychosomatic stress in humans modulates activity-dependent gene expression, protein formation, physiological function, and psychological experience. The molecular messengers generated by stress, injury, and disease can activate immediate early genes within stem cells so that they then signal the target genes required to synthesize the proteins that will transform (differentiate) stem cells into mature well-functioning tissues. Such activity-dependent gene expression and its consequent activity-dependent neurogenesis and stem cell healing is proposed as the molecular-genomic-cellular basis of rehabilitative medicine, physical, and occupational therapy as well as the many alternative and complementary approaches to mind-body healing. The therapeutic replaying of enriching life experiences that evoke the novelty-numinosum-neurogenesis effect during creative moments of art, music, dance, drama, humor, literature, poetry, and spirituality, as well as cultural rituals of life transitions (birth, puberty, marriage, illness, healing, and death) can optimize consciousness, personal relationships, and healing in a manner that has much in common with the psychogenomic foundations of naturalistic and complementary medicine. The entire history of alternative and complementary approaches to healing is consistent with this new neuroscience world view about the role of psychological arousal and fascination in modulating gene expression, neurogenesis, and healing via the psychosocial and cultural rites of human societies.

  17. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    SciTech Connect

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics into

  18. Personal genomics services: whose genomes?

    PubMed Central

    Gurwitz, David; Bregman-Eschet, Yael

    2009-01-01

    New companies offering personal whole-genome information services over the internet are dynamic and highly visible players in the personal genomics field. For fees currently ranging from US$399 to US$2500 and a vial of saliva, individuals can now purchase online access to their individual genetic information regarding susceptibility to a range of chronic diseases and phenotypic traits based on a genome-wide SNP scan. Most of the companies offering such services are based in the United States, but their clients may come from nearly anywhere in the world. Although the scientific validity, clinical utility and potential future implications of such services are being hotly debated, several ethical and regulatory questions related to direct-to-consumer (DTC) marketing strategies of genetic tests have not yet received sufficient attention. For example, how can we minimize the risk of unauthorized third parties submitting other people's DNA for testing? Another pressing question concerns the ownership of (genotypic and phenotypic) information, as well as the unclear legal status of customers regarding their own personal information. Current legislation in the US and Europe falls short of providing clear answers to these questions. Until the regulation of personal genomics services catches up with the technology, we call upon commercial providers to self-regulate and coordinate their activities to minimize potential risks to individual privacy. We also point out some specific steps, along the trustee model, that providers of DTC personal genomics services as well as regulators and policy makers could consider for addressing some of the concerns raised below. PMID:19259127

  19. Recent developments in genomics, bioinformatics and drug discovery to combat emerging drug-resistant tuberculosis.

    PubMed

    Swaminathan, Soumya; Sundaramurthi, Jagadish Chandrabose; Palaniappan, Alangudi Natarajan; Narayanan, Sujatha

    2016-12-01

    The emergence of drug-resistant tuberculosis (DR-TB) is a major challenge in TB control. Delay in the diagnosis of DR-TB leads to its increased transmission, and therefore prevalence. Recent developments in genomics have enabled whole genome sequencing (WGS) of Mycobacterium tuberculosis (M. tuberculosis) from 3-day-old liquid culture and directly from uncultured sputa, while new bioinformatics tools facilitate the rapid determination of DR mutations from the resulting sequences. The present drug discovery and development pipeline is filled with candidate drugs which have shown efficacy against DR-TB. Furthermore, some of the FDA-approved drugs are being evaluated for repurposing, and this approach appears promising as several drugs are reported to enhance the efficacy of the standard TB drugs, reduce drug tolerance, or modulate the host immune response to control the growth of intracellular M. tuberculosis. Recent developments in genomics and bioinformatics, along with new drug discovery, collectively have the potential for a synergistic impact leading to the development of a rapid protocol to determine the drug resistance profile of the infecting strain so as to provide personalized medicine. Hence, in this review, we discuss recent developments in WGS, bioinformatics and drug discovery to assess how they could transform the management of tuberculosis in a timely manner.

  20. A Required Course in Human Genomics, Pharmacogenomics, and Bioinformatics

    PubMed Central

    Brazeau, Daniel A.; Brazeau, Gayle A.

    2006-01-01

    Objectives: To provide students with an understanding of the principles and applications of human genetics and genomics in drug therapy optimization, patient care, and counseling. Design: A 2-credit hour course entitled Principles of the Human Genome, Pharmacogenomics, and Bioinformatics was offered to third-professional year PharmD students. Written examinations, in-class exercises, and a written paper evaluating the current literature were used to evaluate student learning. Assessment: Student course ratings on the pedagogical format of the course and the relevance of course material to professional practice have improved significantly since first implementation in 2002. Conclusion: This course provided pharmacy students with an understanding of pharmacogenetics ranging from genetic principles and the inheritance of complex traits to specific examples of pharmacogenomics in drug therapy. PMID:17332851

  1. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    PubMed Central

    2011-01-01

    Background: Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results: MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions: We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys. PMID:21276275

  2. PGWD: Integrating Personal Genome for Warfarin Dosing.

    PubMed

    Pan, Yidan; Cheng, Ronghai; Li, Zhoufang; Zhao, Yujun; He, Jiankui

    2016-03-01

    Warfarin is a drug normally used in the prevention of thrombosis and the formation of blood clots. The dosage of warfarin is strongly affected by genetic variants of CYP2C9 and VKORC1 genes. Current technologies for detecting the variants of these genes are mainly based on real-time PCR. In recent years, due to the rapidly dropping cost of whole genome sequencing and genotyping, more and more people get their whole genome sequenced or genotyped. However, current software for warfarin dosing prediction is based on low-throughput genetic information from either real-time PCR or melting curve methods. There is no bioinformatics tool available that can take the high-throughput genome sequencing data as input and determine the accurate dosage of warfarin. Here, we present PGWD, a web tool that analyzes personal genome sequencing data and integrates with clinical information for warfarin dosing.

  3. Recent progress on bioinformatics, functional genomics, and metabolomics research of cytochrome P450 and its impact on drug discovery.

    PubMed

    Zhang, Tao; Zhao, Mingzhu; Pang, Yushu; Zhang, Wen; Angela Liu, Limin; Wei, Dong-Qing

    2012-01-01

    The cytochrome P450 superfamily is responsible primarily for human drug metabolism, which is of critical importance for drug discovery and development. Bioinformatics, functional genomics and metabolomics have advanced rapidly over the last decade. These disciplines are essential in target identification, lead discovery and optimization. In this review, we summarize the recent progress on cytochrome P450 and its role in drug metabolism in the context of bioinformatics, functional genomics and metabolomics. Data are integrated into various databases and web-based platforms on cytochrome P450. These research tools and resources are playing an increasingly important role in drug discovery, and are helping to achieve the ultimate goal of personalized medicine, that is, to prescribe personalized drugs according to each person's genetic makeup, metabolic level, and drug disposition.

  4. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud

    PubMed Central

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Background: Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. Results: We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. Conclusions: This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and

  5. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  6. Genomics, proteomics and bioinformatics of human heart failure

    PubMed Central

    DOS REMEDIOS, C.G.; LIEW, C.C.; ALLEN, P.D.; WINSLOW, R.L.; VAN EYK, J.E.; DUNN, M.J.

    2005-01-01

    Unraveling the molecular complexities of human heart failure, particularly end-stage failure, can be achieved by combining multiple investigative approaches. There are several parts to the problem. Each patient is the product of a complex set of genetic variations, different degrees of influence of diets and lifestyles, and usually heart transplantation patients are treated with multiple drugs. The genomic status of the myocardium of any one transplant patient can be analysed using gene arrays (cDNA- or oligonucleotide-based) each with its own strengths and weaknesses. The proteins expressed by these failing hearts (myocardial proteomics) were first investigated over a decade ago using two-dimensional polyacrylamide gel electrophoresis (2DGE) which promised to resolve several thousand proteins in a single sample of failing heart. However, while 2DGE is very successful for the abundant and moderately expressed proteins, it struggles to identify proteins expressed at low levels. Highly focused first dimension separations combined with recent advances in mass spectrometry now provide new hope for solving this difficulty. Protein arrays are a more recent form of proteomics that hold great promise but, like the above methods, they have their own drawbacks. Our approach to solving the problems inherent in the genomics and proteomics of heart failure is to provide experts in each analytical method with a sample from the same human failing heart. This requires a sufficiently large number of samples from a sufficiently large pool of heart transplant patients as well as a large pool of non-diseased, non-failing human hearts. We have collected more than 200 hearts from patients undergoing heart transplantations and a further 50 non-failing hearts. By combining our expertise we expect to reduce and possibly eliminate the inherent difficulties of each analytical approach. Finally, we recognise the need for bioinformatics to make sense of the large quantities of data that will

  7. Computational Challenges of Personal Genomics

    PubMed Central

    Bolouri, Hamid

    2008-01-01

    It is widely predicted that cost and efficiency gains in sequencing will usher in an era of personal genomics and personalized, predictive, preventive, and participatory medicine within a decade. I review the computational challenges ahead and propose general and specific directions for research and development. There is an urgent need to develop semantic ontologies that span genomics, molecular systems biology, and medical data. Although the development of such ontologies would be costly and difficult, the benefits will far outweigh the costs. I argue that availability of such ontologies would allow a revolution in web-services for personal genomics and medicine. PMID:19440448

  8. Evolutionary genomics of animal personality.

    PubMed

    van Oers, Kees; Mueller, Jakob C

    2010-12-27

    Research on animal personality can be approached from both a phenotypic and a genetic perspective. While using a phenotypic approach one can measure present selection on personality traits and their combinations. However, this approach cannot reconstruct the historical trajectory that was taken by evolution. Therefore, it is essential for our understanding of the causes and consequences of personality diversity to link phenotypic variation in personality traits with polymorphisms in genomic regions that code for this trait variation. Identifying genes or genome regions that underlie personality traits will open exciting possibilities to study natural selection at the molecular level, gene-gene and gene-environment interactions, pleiotropic effects and how gene expression shapes personality phenotypes. In this paper, we will discuss how genome information revealed by already established approaches and some more recent techniques such as high-throughput sequencing of genomic regions in a large number of individuals can be used to infer micro-evolutionary processes, historical selection and finally the maintenance of personality trait variation. We will do this by reviewing recent advances in molecular genetics of animal personality, but will also use advanced human personality studies as case studies of how molecular information may be used in animal personality research in the near future.

  9. FDA Bioinformatics Tool for Microbial Genomics Research on Molecular Characterization of Bacterial Foodborne Pathogens Using Microarrays

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed the genomics tool ArrayTrack™, which provides extensive functionalities to man...

  10. Analysing the performance of personal computers based on Intel microprocessors for sequence aligning bioinformatics applications.

    PubMed

    Nair, Pradeep S; John, Eugene B

    2007-01-01

    Aligning specific sequences against a very large number of other sequences is a central aspect of bioinformatics. With the widespread availability of personal computers in biology laboratories, sequence alignment is now often performed locally. This makes it necessary to analyse the performance of personal computers for sequence aligning bioinformatics benchmarks. In this paper, we analyse the performance of a personal computer for the popular BLAST and FASTA sequence alignment suites. Results indicate that these benchmarks have a large number of recurring operations and use memory operations extensively. It seems that the performance can be improved with a bigger L1-cache.

  11. Genomics Politics through Space and Time: The Case of Bioinformatics in Brazil.

    PubMed

    Bicudo, Edison

    2016-01-01

    The emergence of scientific disciplines, as well as the policies aimed at steering them, has geographical implications. This becomes visible in areas such as genomics and related fields. In this paper, the relation between scientific evolution, political decisions and geographical configuration is studied, with a focus on the recent formation of bioinformatics in Brazil. The study involves an analysis of data collected on the website of CNPq, a funding agency attached to the Ministry of Science and Technology. Furthermore, I conducted fieldwork in four cities, interviewing 15 bioinformaticians. In the history of Brazilian bioinformatics, three periods can be identified. In the first period (1900-1996), bioinformatics was actually absent, but biology research groups were formed which would subsequently explore bioinformatics. The second period (1997-2006) was marked by the emergence of the discipline and geographical concentration of major research groups in the southern part of Brazil. In the third period (2007-2014), political choices have turned geographical diffusion and institutional equality into a national target. As a consequence of the recent shifts, genomics and bioinformatics researchers have been involved in a debate, with some defending the existence of a few specialized research and sequencing platforms and others welcoming a scientific scenario based on decentralized platforms. I defend an intermediate solution, whereby some places would be selected to be genomics hubs. This would fit the regional diversity of this vast country, in addition to tackling the scientific weaknesses of the northern area.

  12. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    PubMed

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  13. Bioinformatics tools and databases for whole genome sequence analysis of Mycobacterium tuberculosis.

    PubMed

    Faksri, Kiatichai; Tan, Jun Hao; Chaiprasert, Angkana; Teo, Yik-Ying; Ong, Rick Twee-Hee

    2016-11-01

    Tuberculosis (TB) is an infectious disease of global public health importance caused by Mycobacterium tuberculosis complex (MTC) in which M. tuberculosis (Mtb) is the major causative agent. Recent advancements in genomic technologies such as next generation sequencing have enabled high throughput cost-effective generation of whole genome sequence information from Mtb clinical isolates, providing new insights into the evolution, genomic diversity and transmission of the Mtb bacteria, including molecular mechanisms of antibiotic resistance. The large volume of sequencing data generated however necessitated effective and efficient management, storage, analysis and visualization of the data and results through development of novel and customized bioinformatics software tools and databases. In this review, we aim to provide a comprehensive survey of the current freely available bioinformatics software tools and publicly accessible databases for genomic analysis of Mtb for identifying disease transmission in molecular epidemiology and in rapid determination of the antibiotic profiles of clinical isolates for prompt and optimal patient treatment.

  14. IFPA meeting 2016 workshop report I: Genomic communication, bioinformatics, trophoblast biology and transport systems.

    PubMed

    Albrecht, Christiane; Baker, Julie C; Blundell, Cassidy; Chavez, Shawn L; Carbone, Lucia; Chamley, Larry; Hannibal, Roberta L; Illsley, Nick; Kurre, Peter; Laurent, Louise C; McKenzie, Charles; Morales-Prieto, Diana; Pantham, Priyadarshini; Paquette, Alison; Powell, Katie; Price, Nathan; Rao, Balaji M; Sadovsky, Yoel; Salomon, Carlos; Tuteja, Geetu; Wilson, Samantha; O'Tierney-Ginn, P F

    2017-01-11

    Workshops are an important part of the IFPA annual meeting as they allow for discussion of specialized topics. At IFPA meeting 2016 there were twelve themed workshops, four of which are summarized in this report. These workshops covered innovative technologies applied to new and traditional areas of placental research: 1) genomic communication; 2) bioinformatics; 3) trophoblast biology and pathology; 4) placental transport systems.

  15. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    ERIC Educational Resources Information Center

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  16. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation to the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to a reference annotation.

  17. FLAGdb(++): A Bioinformatic Environment to Study and Compare Plant Genomes.

    PubMed

    Tamby, Jean Philippe; Brunaud, Véronique

    2017-01-01

    Today, the growing knowledge and data accumulation on plant genomes do not solve in a simple way the task of gene function inference. Because data of different types are coming from various sources, we need to integrate and analyze them to help biologists in this task. We created FLAGdb(++) ( http://tools.ips2.u-psud.fr/FLAGdb ) to take up this challenge for a selection of plant genomes. In order to enrich gene function predictions, structural and functional annotations of the genomes are explored to generate meta-data and to compare them. Since data are numerous and complex, we focused on accessibility and visualization with an original and user-friendly interface. In this chapter we present the main tools of FLAGdb(++) and a use-case to explore a gene family: structural and functional properties of this family and research of orthologous genes in the other plant genomes.

  18. Exploring laccase genes from plant pathogen genomes: a bioinformatic approach.

    PubMed

    Feng, B Z; Li, P Q; Fu, L; Yu, X M

    2015-10-30

    To date, research on laccases has mostly been focused on plant and fungal laccases and their current use in biotechnological applications. In contrast, little is known about laccases from plant pathogens, although recent rapid progress in whole genome sequencing of an increasing number of organisms has facilitated their identification and ascertainment of their origins. In this study, a comparative analysis was performed to elucidate the distribution of laccases among bacteria, fungi, and oomycetes, and, through comparison of their amino acids, to determine the relationships between them. We retrieved the laccase genes for the 20 publicly available plant pathogen genomes. From these, 125 laccase genes were identified in total, including seven in bacterial genomes, 101 in fungal genomes, and 17 in oomycete genomes. Most of the predicted protein models of these genes shared typical fungal laccase characteristics, possessing four conserved domains with one cysteine and ten histidine residues at these domains. Phylogenetic analysis illustrated that laccases from bacteria and oomycetes were grouped into two distinct clades, whereas fungal laccases clustered in three main clades. These results provide the theoretical groundwork regarding the role of laccases in plant pathogens and might be used to guide future research into these enzymes.

  19. Silicon Era of Carbon-Based Life: Application of Genomics and Bioinformatics in Crop Stress Research

    PubMed Central

    Li, Man-Wah; Qi, Xinpeng; Ni, Meng; Lam, Hon-Ming

    2013-01-01

    Abiotic and biotic stresses lead to massive reprogramming of different life processes and are the major limiting factors hampering crop productivity. Omics-based research platforms allow for a holistic and comprehensive survey on crop stress responses and hence may bring forth better crop improvement strategies. Since high-throughput approaches generate considerable amounts of data, bioinformatics tools will play an essential role in storing, retrieving, sharing, processing, and analyzing them. Genomic and functional genomic studies in crops still lag far behind similar studies in humans and other animals. In this review, we summarize some useful genomics and bioinformatics resources available to crop scientists. In addition, we also discuss the major challenges and advancements in the “-omics” studies, with an emphasis on their possible impacts on crop stress research and crop improvement. PMID:23759993

  20. Controlling new knowledge: Genomic science, governance and the politics of bioinformatics.

    PubMed

    Salter, Brian; Salter, Charlotte

    2017-04-01

    The rise of bioinformatics is a direct response to the political difficulties faced by genomics in its quest to be a new biomedical innovation, and the value of bioinformatics lies in its role as the bridge between the promise of genomics and its realization in the form of health benefits. Western scientific elites are able to use their close relationship with the state to control and facilitate the emergence of new domains compatible with the existing distribution of epistemic power - all within the embrace of public trust. The incorporation of bioinformatics as the saviour of genomics had to be integrated with the operation of two key aspects of governance in this field: the definition and ownership of the new knowledge. This was achieved mainly by the development of common standards and by the promotion of the values of communality, open access and the public ownership of data to legitimize and maintain the governance power of publicly funded genomic science. Opposition from industry advocating the private ownership of knowledge has been largely neutered through the institutions supporting the science-state concordat. However, in order for translation into health benefits to occur and public trust to be assured, genomic and clinical data have to be integrated and knowledge ownership agreed upon across the separate and distinct governance territories of scientist, clinical medicine and society. Tensions abound as science seeks ways of maintaining its control of knowledge production through the negotiation of new forms of governance with the institutions and values of clinicians and patients.

  1. Public Access for Teaching Genomics, Proteomics, and Bioinformatics

    ERIC Educational Resources Information Center

    Campbell, A. Malcolm

    2003-01-01

    When the human genome project was conceived, its leaders wanted all researchers to have equal access to the data and associated research tools. Their vision of equal access provides an unprecedented teaching opportunity. Teachers and students have free access to the same databases that researchers are using. Furthermore, the recent movement to…

  2. VectorBase: improvements to a bioinformatics resource for invertebrate vector genomics.

    PubMed

    Megy, Karine; Emrich, Scott J; Lawson, Daniel; Campbell, David; Dialynas, Emmanuel; Hughes, Daniel S T; Koscielny, Gautier; Louis, Christos; Maccallum, Robert M; Redmond, Seth N; Sheehan, Andrew; Topalis, Pantelis; Wilson, Derek

    2012-01-01

    VectorBase (http://www.vectorbase.org) is a NIAID-supported bioinformatics resource for invertebrate vectors of human pathogens. It hosts data for nine genomes: mosquitoes (three Anopheles gambiae genomes, Aedes aegypti and Culex quinquefasciatus), tick (Ixodes scapularis), body louse (Pediculus humanus), kissing bug (Rhodnius prolixus) and tsetse fly (Glossina morsitans). Hosted data range from genomic features and expression data to population genetics and ontologies. We describe improvements and integration of new data that expand our taxonomic coverage. Releases are bi-monthly and include the delivery of preliminary data for emerging genomes. Frequent updates of the genome browser provide VectorBase users with increasing options for visualizing their own high-throughput data. One major development is a new population biology resource for storing genomic variations, insecticide resistance data and their associated metadata. It takes advantage of improved ontologies and controlled vocabularies. Combined, these new features ensure timely release of multiple types of data in the public domain while helping overcome the bottlenecks of bioinformatics and annotation by engaging with our user community.

  3. Tissue Banking, Bioinformatics, and Electronic Medical Records: The Front-End Requirements for Personalized Medicine

    PubMed Central

    Suh, K. Stephen; Sarojini, Sreeja; Youssif, Maher; Nalley, Kip; Milinovikj, Natasha; Elloumi, Fathi; Russell, Steven; Pecora, Andrew; Schecter, Elyssa; Goy, Andre

    2013-01-01

    Personalized medicine promises patient-tailored treatments that enhance patient care and decrease overall treatment costs by focusing on genetics and “-omics” data obtained from patient biospecimens and records to guide therapy choices that generate good clinical outcomes. The approach relies on diagnostic and prognostic use of novel biomarkers discovered through combinations of tissue banking, bioinformatics, and electronic medical records (EMRs). The analytical power of bioinformatic platforms combined with patient clinical data from EMRs can reveal potential biomarkers and clinical phenotypes that allow researchers to develop experimental strategies using selected patient biospecimens stored in tissue banks. For cancer, high-quality biospecimens collected at diagnosis, first relapse, and various treatment stages provide crucial resources for study designs. To enlarge biospecimen collections, patient education regarding the value of specimen donation is vital. One approach for increasing consent is to offer publicly available illustrations and game-like engagements demonstrating how wider sample availability facilitates development of novel therapies. The critical value of tissue bank samples, bioinformatics, and EMR in the early stages of the biomarker discovery process for personalized medicine is often overlooked. The data obtained also require cross-disciplinary collaborations to translate experimental results into clinical practice and diagnostic and prognostic use in personalized medicine. PMID:23818899

  4. Genome-wide variant analysis of simplex autism families with an integrative clinical-bioinformatics pipeline

    PubMed Central

    Jiménez-Barrón, Laura T.; O'Rawe, Jason A.; Wu, Yiyang; Yoon, Margaret; Fang, Han; Iossifov, Ivan; Lyon, Gholson J.

    2015-01-01

    Autism spectrum disorders (ASDs) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASDs, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and widely phenotyped. Genotyping arrays and whole-genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that were found to be segregating according to de novo, autosomal recessive, X-linked, mitochondrial, and compound heterozygote transmission models. The resulting candidate variants included three small heterozygous copy-number variations (CNVs), a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole-genome sequencing data can generate reliable results for use in downstream investigations. PMID:27148569
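    The transmission-model search described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotype encoding (alternate-allele counts 0, 1, 2), the function name `classify_autosomal`, the label strings and the example variants are all assumptions made for illustration, and only two of the listed models (de novo and autosomal recessive) are covered.

```python
from typing import Optional


def classify_autosomal(child: int, mother: int, father: int) -> Optional[str]:
    """Label one autosomal variant in a trio.

    Genotypes are alternate-allele counts (0, 1 or 2); this encoding and the
    label names are assumptions made for illustration.
    """
    if child == 1 and mother == 0 and father == 0:
        return "de_novo"  # heterozygous in the child, absent from both parents
    if child == 2 and mother == 1 and father == 1:
        return "autosomal_recessive"  # one alternate allele from each carrier parent
    return None  # fits neither of the two models sketched here


# Hypothetical variants, not data from the study.
variants = [
    {"id": "var_a", "child": 1, "mother": 0, "father": 0},
    {"id": "var_b", "child": 2, "mother": 1, "father": 1},
    {"id": "var_c", "child": 1, "mother": 1, "father": 0},
]

# Keep only variants that match one of the sketched transmission models.
candidates = {}
for v in variants:
    label = classify_autosomal(v["child"], v["mother"], v["father"])
    if label is not None:
        candidates[v["id"]] = label

print(candidates)
```

    A real analysis would also filter on population allele frequency and predicted functional impact, and would handle X-linked, mitochondrial and compound heterozygote models separately.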

  5. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  6. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

    PubMed

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

    2015-04-01

    This study aimed to explore the features of clustered regularly interspaced short palindromic repeat (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and that the flanking sequences upstream of the CRISPRs could be classified into the same groups as those downstream. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and that the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.

  7. Integrating genomics, proteomics and bioinformatics in translational studies of molecular medicine.

    PubMed

    Ostrowski, Jerzy; Wyrwicz, Lucjan S

    2009-09-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, which are usually applied on the scale of single genes. Medicine in the postgenomic era will utilize thousands of disease-associated molecular markers provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical and bioinformatic analyses to model biological systems. Collecting, cataloging and comparing data from molecular studies, and the subsequent development of conclusions, creates the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm known as integrative genomics.

  8. Widening participation would be key in enhancing bioinformatics and genomics research in Africa

    PubMed Central

    Karikari, Thomas K.; Quansah, Emmanuel; Mohamed, Wael M.Y.

    2015-01-01

    Bioinformatics and genome science (BGS) are gradually gaining roots in Africa, contributing to studies that are leading to improved understanding of health, disease, agriculture and food security. While a few African countries have established foundations for research and training in these areas, BGS appear to be limited to only a few institutions in specific African countries. However, improving the disciplines in Africa will require pragmatic efforts to expand training and research partnerships to scientists in yet-unreached institutions. Here, we discuss the need to expand BGS programmes in Africa, and propose mechanisms to do so. PMID:26767163

  9. Personal genome sequencing: current approaches and challenges

    PubMed Central

    Snyder, Michael; Du, Jiang; Gerstein, Mark

    2010-01-01

    The revolution in DNA sequencing technologies has now made it feasible to determine the genome sequences of many individuals; i.e., “personal genomes.” Genome sequences of cells and tissues from both normal and disease states have been determined. Using current approaches, whole human genome sequences are not typically assembled and determined de novo, but, instead, variations relative to a reference sequence are identified. We discuss the current state of personal genome sequencing, the main steps involved in determining a genome sequence (i.e., identifying single-nucleotide polymorphisms [SNPs] and structural variations [SVs], assembling new sequences, and phasing haplotypes), and the challenges and performance metrics for evaluating the accuracy of the reconstruction. Finally, we consider the possible individual and societal benefits of personal genome sequences. PMID:20194435
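
    The idea of reporting variations relative to a reference, rather than assembling de novo, can be shown with a toy sketch. It assumes the sample sequence is already aligned to the reference base for base (no insertions or deletions), which real variant callers do not assume; the function name and the 1-based coordinate convention are illustrative choices.

```python
def call_snps(reference, sample):
    """List single-base differences between two pre-aligned sequences.

    Returns (position, reference_base, sample_base) tuples with 1-based
    positions. Positions where either sequence has 'N' are skipped.
    Assumes equal length and no indels; this is a toy illustration.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    snps = []
    for index, (ref_base, alt_base) in enumerate(zip(reference.upper(), sample.upper())):
        if "N" in (ref_base, alt_base):
            continue  # ambiguous base: no call
        if ref_base != alt_base:
            snps.append((index + 1, ref_base, alt_base))
    return snps
```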

  10. Bioinformatic Genome Comparisons for Taxonomic and Phylogenetic Assignments Using Aeromonas as a Test Case

    PubMed Central

    Colston, Sophie M.; Fullmer, Matthew S.; Beka, Lidia; Lamy, Brigitte

    2014-01-01

    ABSTRACT Prokaryotic taxonomy is the underpinning of microbiology, as it provides a framework for the proper identification and naming of organisms. The “gold standard” of bacterial species delineation is the overall genome similarity determined by DNA-DNA hybridization (DDH), a technically rigorous yet sometimes variable method that may produce inconsistent results. Improvements in next-generation sequencing have resulted in an upsurge of bacterial genome sequences and bioinformatic tools that compare genomic data, such as average nucleotide identity (ANI), correlation of tetranucleotide frequencies, and the genome-to-genome distance calculator, or in silico DDH (isDDH). Here, we evaluate ANI and isDDH in combination with phylogenetic studies using Aeromonas, a taxonomically challenging genus with many described species and several strains that were reassigned to different species as a test case. We generated improved, high-quality draft genome sequences for 33 Aeromonas strains and combined them with 23 publicly available genomes. ANI and isDDH distances were determined and compared to phylogenies from multilocus sequence analysis of housekeeping genes, ribosomal proteins, and expanded core genes. The expanded core phylogenetic analysis suggested relationships between distant Aeromonas clades that were inconsistent with studies using fewer genes. ANI values of ≥96% and isDDH values of ≥70% consistently grouped genomes originating from strains of the same species together. Our study confirmed known misidentifications, validated the recent revisions in the nomenclature, and revealed that a number of genomes deposited in GenBank are misnamed. In addition, two strains were identified that may represent novel Aeromonas species. PMID:25406383
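
    The ANI cutoff reported above (≥96%) can be turned into a small grouping sketch. This is not the authors' method: the single-linkage grouping via union-find, the function name, and the input layout (a dictionary of pairwise ANI percentages keyed by genome-name pairs) are assumptions for illustration only.

```python
from itertools import combinations

ANI_THRESHOLD = 96.0  # percent, the cutoff quoted in the abstract


def group_by_ani(genomes, pairwise_ani, threshold=ANI_THRESHOLD):
    """Group genomes whose pairwise ANI meets the threshold (single linkage).

    `pairwise_ani` maps (genome_a, genome_b) tuples to ANI percentages;
    either key order is accepted. Missing pairs are treated as unlinked.
    """
    parent = {genome: genome for genome in genomes}

    def find(genome):
        # Walk to the root of the set, compressing the path on the way.
        while parent[genome] != genome:
            parent[genome] = parent[parent[genome]]
            genome = parent[genome]
        return genome

    for a, b in combinations(genomes, 2):
        ani = pairwise_ani.get((a, b), pairwise_ani.get((b, a)))
        if ani is not None and ani >= threshold:
            parent[find(a)] = find(b)  # merge the two sets

    groups = {}
    for genome in genomes:
        groups.setdefault(find(genome), []).append(genome)
    return sorted(sorted(members) for members in groups.values())
```

    Single linkage is a deliberate simplification: two genomes end up in one group if a chain of above-threshold pairs connects them, even when their own pairwise ANI falls below the cutoff.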

  11. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    NASA Astrophysics Data System (ADS)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Recently, bioinformatics has become a new field of science, indispensable in the analysis of millions of nucleic acid sequences, which are currently deposited in international databases (public or private); these databases contain information on genes, RNA, ORFs, proteins, and intergenic regions, including entire genomes from some species. The analysis of this information requires computer programs, which have been renewed through the use of new mathematical methods and the introduction of artificial intelligence, in addition to the constant creation of supercomputing units designed to withstand the heavy workload of sequence analysis. However, innovation is still needed in platforms that allow genomic analyses to be carried out faster and more effectively, with a technological understanding of all biological processes.

  12. A critical analysis of assessment quality in genomics and bioinformatics education research.

    PubMed

    Campbell, Chad E; Nehm, Ross H

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (<10%) of studies provided any reliability or validity evidence, and only one study out of the 95 sampled mentioned both validity and reliability. Our findings raise concerns about the quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE.

  13. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    PubMed Central

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (<10%) of studies provided any reliability or validity evidence, and only one study out of the 95 sampled mentioned both validity and reliability. Our findings raise concerns about the quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE. PMID:24006400

  14. Neurogenomics: An opportunity to integrate neuroscience, genomics and bioinformatics research in Africa

    PubMed Central

    Karikari, Thomas K.; Aleksic, Jelena

    2015-01-01

    Modern genomic approaches have made enormous contributions to improving our understanding of the function, development and evolution of the nervous system, and the diversity within and between species. However, most of these research advances have been recorded in countries with advanced scientific resources and funding support systems. On the contrary, little is known about, for example, the possible interplay between different genes, non-coding elements and environmental factors in modulating neurological diseases among populations in low-income countries, including many African countries. The unique ancestry of African populations suggests that improved inclusion of these populations in neuroscience-related genomic studies would significantly help to identify novel factors that might shape the future of neuroscience research and neurological healthcare. This perspective is strongly supported by the recent identification that diseased individuals and their kindred from specific sub-Saharan African populations lack common neurological disease-associated genetic mutations. This indicates that there may be population-specific causes of neurological diseases, necessitating further investigations into the contribution of additional, presently-unknown genomic factors. Here, we discuss how the development of neurogenomics research in Africa would help to elucidate disease-related genomic variants, and also provide a good basis to develop more effective therapies. Furthermore, neurogenomics would harness African scientists' expertise in neuroscience, genomics and bioinformatics to extend our understanding of the neural basis of behaviour, development and evolution. PMID:26937352

  15. Edge Bioinformatics

    SciTech Connect

    Lo, Chien-Chi

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution in forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  16. Basics of Genome Sequence Analysis in Bioinformatics -- its Fundamental Ideas and Problems

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2009-02-01

    Genome sequences are among the most fundamental data in the various omics analyses. So far, basic bioinformatics tools have been developed to handle genome sequences. The first step of genome sequence analysis is to predict or assign "genes" on genome sequences. In eukaryotes, we can identify genes by using full-length cDNA sequences with local alignment tools such as search, blast and fasta. However, it is difficult to capture mRNAs (transcripts) in prokaryotes. Therefore, computational gene prediction is the first choice for starting genome sequence analysis. In this review, we first cover methods for computational gene prediction. Once genes are predicted, the next step is to assign functions to the proteins or RNAs encoded by each gene. How we define the distance between gene sequences is then very important for further analysis, so we describe the basic mathematical concepts for gene comparison. We also introduce our novel concept for biological sequence comparison from the viewpoint of information theory. In the post-genome era, many researchers are interested not only in gene functions but also in gene regulation, the information for which is also carried on genome sequences. Cis-regulatory elements, however, are too short for mathematical rules to be found. Therefore, computationally predicted cis-elements tend to include many false positives. To reduce the ratio of false positives, we need a reliable database of sets of cis-regulatory elements, called cis-regulatory modules, for each gene. We are therefore trying to develop the Cis-Regulatory Elements Module Reference Database. In the third section, we introduce the procedure for constructing the Cis-Regulatory Elements Module Reference Database and its user interfaces.
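
    As a rough illustration of computational gene prediction in prokaryotes, the sketch below scans for open reading frames (ORFs). It is not the method reviewed in the article: it only checks the forward strand, assumes ATG as the sole start codon and the standard stop codons, and uses a hypothetical minimum-length parameter; real gene finders rely on statistical models.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=2):
    """Find forward-strand ORFs in all three reading frames.

    Returns sorted (start, end) half-open, 0-based intervals; `end` is
    just past the stop codon. `min_codons` counts codons before the stop,
    including the start codon. All of these conventions are assumptions.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the candidate and keep scanning
    return sorted(orfs)
```

    Because small ORFs are easy to miss, the choice of min_codons strongly affects what is reported; the default here is set low purely so that short test sequences produce output.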

  17. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform

    DOE PAGES

    Li, Po-E; Lo, Chien-Chi; Anderson, Joseph J.; ...

    2016-11-24

    Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. As a result, this bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research.

  18. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform

    SciTech Connect

    Li, Po-E; Lo, Chien-Chi; Anderson, Joseph J.; Davenport, Karen W.; Bishop-Lilly, Kimberly A.; Xu, Yan; Ahmed, Sanaa; Feng, Shihai; Mokashi, Vishwesh P.; Chain, Patrick S. G.

    2016-11-24

    Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. As a result, this bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research.

  19. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform

    PubMed Central

    Li, Po-E; Lo, Chien-Chi; Anderson, Joseph J.; Davenport, Karen W.; Bishop-Lilly, Kimberly A.; Xu, Yan; Ahmed, Sanaa; Feng, Shihai; Mokashi, Vishwesh P.; Chain, Patrick S.G.

    2017-01-01

    Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. This bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research. PMID:27899609

  20. Computational biology of genome expression and regulation--a review of microarray bioinformatics.

    PubMed

    Wang, Junbai

    2008-01-01

    Microarray technology is widely used in various biomedical research areas; the corresponding microarray data analysis is an essential step toward making the best use of array technologies. Here we review two components of microarray data analysis: low-level analysis, which emphasizes the design, quality control, and preprocessing of microarray experiments, and high-level analysis, which focuses on domain-specific microarray applications such as tumor classification, biomarker prediction, analysis of array CGH experiments, and reverse engineering of gene expression networks. Additionally, we review recent developments in building predictive models in genome expression and regulation studies. This review may help biologists grasp a basic knowledge of microarray bioinformatics as well as its potential impact on the future evolution of biomedical research fields.

  1. [Ethical issues in personal genome research].

    PubMed

    Kato, Kazuto; Minari, Jusaku

    2013-03-01

    The rapid expansion of techniques for studying human genomics has remarkably changed research and practice. It is expected that more progress will be made in the field of medical and biological research owing to these technological advances. Genomics researchers collect human genetic material, including DNA and cells, from a large number of individuals and carry out "personal genome analysis"; as a result, new types of ethical, legal, and social issues (ELSI) have arisen, including informed consent procedures, data sharing, protection of genetic information, and return of research results. To address these issues, many large research projects have established specialist groups that are devoted to managing the ELSI of their research. The guidelines for genomics research set by the government are also expected to be revised accordingly. In this paper, we present an overview of the ELSI of personal genome research and discuss necessary measures to tackle these issues.

  2. Challenges, Solutions, and Quality Metrics of Personal Genome Assembly in Advancing Precision Medicine

    PubMed Central

    Xiao, Wenming; Wu, Leihong; Yavas, Gokhan; Simonyan, Vahan; Ning, Baitang; Hong, Huixiao

    2016-01-01

    -response, tailoring drug therapy and detecting tumors. We believe the precision medicine would largely benefit from bioinformatics solutions, particularly for personal genome assembly. PMID:27110816

  3. The Translational Genomics Core at Partners Personalized Medicine: Facilitating the Transition of Research towards Personalized Medicine.

    PubMed

    Blau, Ashley; Brown, Alison; Mahanta, Lisa; Amr, Sami S

    2016-02-26

    The Translational Genomics Core (TGC) at Partners Personalized Medicine (PPM) serves as a fee-for-service core laboratory for Partners Healthcare researchers, providing access to technology platforms and analysis pipelines for genomic, transcriptomic, and epigenomic research projects. The interaction of the TGC with various components of PPM provides it with a unique infrastructure that allows for greater IT and bioinformatics opportunities, such as sample tracking and data analysis. The following article describes some of the unique opportunities available to an academic research core operating within PPM, such as the ability to develop analysis pipelines with a dedicated bioinformatics team and maintain a flexible Laboratory Information Management System (LIMS) with the support of an internal IT team, as well as the operational challenges encountered in responding to emerging technologies, diverse investigator needs, and high staff turnover. In addition, the implementation and operational role of the TGC in the Partners Biobank genotyping project of over 25,000 samples is presented as an example of core activities working with other components of PPM.

  4. The Translational Genomics Core at Partners Personalized Medicine: Facilitating the Transition of Research towards Personalized Medicine

    PubMed Central

    Blau, Ashley; Brown, Alison; Mahanta, Lisa; Amr, Sami S.

    2016-01-01

    The Translational Genomics Core (TGC) at Partners Personalized Medicine (PPM) serves as a fee-for-service core laboratory for Partners Healthcare researchers, providing access to technology platforms and analysis pipelines for genomic, transcriptomic, and epigenomic research projects. The interaction of the TGC with various components of PPM provides it with a unique infrastructure that allows for greater IT and bioinformatics opportunities, such as sample tracking and data analysis. The following article describes some of the unique opportunities available to an academic research core operating within PPM, such as the ability to develop analysis pipelines with a dedicated bioinformatics team and maintain a flexible Laboratory Information Management System (LIMS) with the support of an internal IT team, as well as the operational challenges encountered in responding to emerging technologies, diverse investigator needs, and high staff turnover. In addition, the implementation and operational role of the TGC in the Partners Biobank genotyping project of over 25,000 samples is presented as an example of core activities working with other components of PPM. PMID:26927185

  5. Empowered genome community: leveraging a bioinformatics platform as a citizen-scientist collaboration tool.

    PubMed

    Wendelsdorf, Katherine; Shah, Sohela

    2015-09-01

    There is an ongoing effort in the biomedical research community to leverage Next Generation Sequencing (NGS) technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals, either sick or healthy, to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype, as well as to annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC) is a new program in which QIAGEN is making this online tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as to facilitate citizen-scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public.

  6. ARG-ANNOT, a New Bioinformatic Tool To Discover Antibiotic Resistance Genes in Bacterial Genomes

    PubMed Central

    Gupta, Sushim Kumar; Padmanabhan, Babu Roshan; Diene, Seydina M.; Lopez-Rojas, Rafael; Kempf, Marie; Landraud, Luce

    2014-01-01

    ARG-ANNOT (Antibiotic Resistance Gene-ANNOTation) is a new bioinformatic tool that was created to detect existing and putative new antibiotic resistance (AR) genes in bacterial genomes. ARG-ANNOT uses a local BLAST program in Bio-Edit software that allows the user to analyze sequences without a Web interface. All AR genetic determinants were collected from published works and online resources; nucleotide and protein sequences were retrieved from the NCBI GenBank database. After building a database that includes 1,689 antibiotic resistance genes, the software was tested in a blind manner using 100 random sequences selected from the database to verify that the sensitivity and specificity were at 100% even when partial sequences were queried. Notably, BLAST analysis results obtained using the rmtF gene sequence (a new aminoglycoside-modifying enzyme gene sequence that is not included in the database) as a query revealed that the tool was able to link this sequence to short sequences (17 to 40 bp) found in other genes of the rmt family with significant E values. Finally, the analysis of 178 Acinetobacter baumannii and 20 Staphylococcus aureus genomes allowed the detection of a significantly higher number of AR genes than the Resfinder gene analyzer and 11 point mutations in target genes known to be associated with AR. The average time for the analysis of a genome was 3.35 ± 0.13 min. We have created a concise database for BLAST using a Bio-Edit interface that can detect AR genetic determinants in bacterial genomes and can rapidly and easily discover putative new AR genetic determinants. PMID:24145532
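
    The sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions, sketched here for reference. The function names and the example counts are illustrative assumptions and are not taken from the ARG-ANNOT evaluation.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real resistance genes that were detected: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return true_positives / total


def specificity(true_negatives, false_positives):
    """Fraction of non-resistance sequences correctly rejected: TN / (TN + FP)."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no negative cases to evaluate")
    return true_negatives / total
```

    With made-up counts, sensitivity(100, 0) gives 1.0, which corresponds to every queried sequence being recovered.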

  7. Using Informatics-, Bioinformatics- and Genomics-Based Approaches for the Molecular Surveillance and Detection of Biothreat Agents

    NASA Astrophysics Data System (ADS)

    Seto, Donald

    The convergence and wealth of informatics, bioinformatics and genomics methods and associated resources allow a comprehensive and rapid approach for the surveillance and detection of bacterial and viral organisms. Coupled with the continuing race for the fastest, most cost-efficient and highest-quality DNA sequencing technology, that is, "next generation sequencing", the detection of biological threat agents by 'cheaper and faster' means is possible. With the application of improved bioinformatic tools for the understanding of these genomes and for parsing unique pathogen genome signatures, along with 'state-of-the-art' informatics which include faster computational methods, equipment and databases, it is feasible to apply new algorithms to biothreat agent detection. Two such methods are high-throughput DNA sequencing-based and resequencing microarray-based identification. These are illustrated and validated by two examples involving human adenoviruses, both from real-world test beds.

  8. Genomes, Populations and Diseases: Ethnic Genomics and Personalized Medicine

    PubMed Central

    Stepanov, V.A.

    2010-01-01

    This review discusses the progress of ethnic genetics, the genetics of common diseases, and the concepts of personalized medicine. We show the relationship between the structure of genetic diversity in human populations and the varying frequencies of Mendelian and multifactor diseases. We also examine the population basis of pharmacogenetics and evaluate the effectiveness of pharmacotherapy, along with a review of new achievements and prospects in personalized genomics. PMID:22649660

  9. The Human Genome Project, and recent advances in personalized genomics.

    PubMed

    Wilson, Brenda J; Nicholls, Stuart G

    2015-01-01

    The language of "personalized medicine" and "personal genomics" has now entered the common lexicon. The idea of personalized medicine is the integration of genomic risk assessment alongside other clinical investigations. Consistent with this approach, testing is delivered by health care professionals who are not medical geneticists, and where results represent risks, as opposed to clinical diagnosis of disease, to be interpreted alongside the entirety of a patient's health and medical data. In this review we consider the evidence concerning the application of such personalized genomics within the context of population screening, and potential implications that arise from this. We highlight two general approaches which illustrate potential uses of genomic information in screening. The first is a narrowly targeted approach in which genetic profiling is linked with standard population-based screening for diseases; the second is a broader targeting of variants associated with multiple single gene disorders, performed opportunistically on patients being investigated for unrelated conditions. In doing so we consider the organization and evaluation of tests and services, the challenge of interpretation with less targeted testing, professional confidence, barriers in practice, and education needs. We conclude by discussing several issues pertinent to health policy, namely: avoiding the conflation of genetics with biological determinism, resisting the "technological imperative", due consideration of the organization of screening services, the need for professional education, as well as informed decision making and public understanding.

  10. Personal genomes in progress: from the human genome project to the personal genome project.

    PubMed

    Lunshof, Jeantine E; Bobe, Jason; Aach, John; Angrist, Misha; Thakuria, Joseph V; Vorhaus, Daniel B; Hoehe, Margret R; Church, George M

    2010-01-01

    The cost of a diploid human genome sequence has dropped from about $70M to $2000 since 2007--even as the standards for redundancy have increased from 7x to 40x in order to improve call rates. Coupled with the low return on investment for common single-nucleotide polymorphisms, this has caused a significant rise in interest in correlating genome sequences with comprehensive environmental and trait data (GET). The costs of electronic health records, imaging, and microbial, immunological, and behavioral data are also dropping quickly. Sharing such integrated GET datasets and their interpretations with a diversity of researchers and research subjects highlights the need for informed-consent models capable of addressing novel privacy and other issues, as well as for flexible data-sharing resources that make materials and data available with minimum restrictions on use. This article examines the Personal Genome Project's effort to develop a GET database as a public genomics resource broadly accessible to both researchers and research participants, while pursuing the highest standards in research ethics.

  11. The Human Genome Project, and recent advances in personalized genomics

    PubMed Central

    Wilson, Brenda J; Nicholls, Stuart G

    2015-01-01

    The language of “personalized medicine” and “personal genomics” has now entered the common lexicon. The idea of personalized medicine is the integration of genomic risk assessment alongside other clinical investigations. Consistent with this approach, testing is delivered by health care professionals who are not medical geneticists, and where results represent risks, as opposed to clinical diagnosis of disease, to be interpreted alongside the entirety of a patient’s health and medical data. In this review we consider the evidence concerning the application of such personalized genomics within the context of population screening, and potential implications that arise from this. We highlight two general approaches which illustrate potential uses of genomic information in screening. The first is a narrowly targeted approach in which genetic profiling is linked with standard population-based screening for diseases; the second is a broader targeting of variants associated with multiple single gene disorders, performed opportunistically on patients being investigated for unrelated conditions. In doing so we consider the organization and evaluation of tests and services, the challenge of interpretation with less targeted testing, professional confidence, barriers in practice, and education needs. We conclude by discussing several issues pertinent to health policy, namely: avoiding the conflation of genetics with biological determinism, resisting the “technological imperative”, due consideration of the organization of screening services, the need for professional education, as well as informed decision making and public understanding. PMID:25733939

  12. Bioinformatics challenges in de novo transcriptome assembly using short read sequences in the absence of a reference genome sequence.

    PubMed

    Góngora-Castillo, Elsa; Buell, C Robin

    2013-04-01

    Plant natural product research can be facilitated through genome and transcriptome sequencing approaches that generate informative sequence and expression datasets that enable characterization of biochemical pathways of interest. As the overwhelming majority of plant-derived natural products are derived from species with little, if any, sequence and/or genomic resources, the ability to perform whole genome shotgun sequencing and assembly has been and will continue to be transformative as access to a genome sequence provides molecular resources and a context for discovery and characterization of biosynthetic pathways. Due to the reduced size and complexity of the transcriptome relative to the genome, transcriptome sequencing provides a rapid, inexpensive approach to access gene sequences, gene expression abundances, and gene expression patterns in any species, including those that lack a reference genome sequence. To date, successful applications of RNA sequencing in conjunction with de novo transcriptome assembly have enabled identification of new genes in an array of biochemical pathways in plants. While sequencing technologies are well developed, challenges remain in the handling and analysis of transcriptome sequences. In this Highlight article, we provide an overview of the bioinformatics challenges associated with transcriptome analyses using short read sequences and how to address these issues in plant species that lack a reference genome.

  13. Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes

    PubMed Central

    Rashid, Mamunur; Robles-Espinoza, Carla Daniela; Rust, Alistair G.; Adams, David J.

    2013-01-01

    Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application. Availability: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23803469
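
    One simple way to combine several variant callers is a voting rule, sketched below. The abstract does not say how Cake merges its four callers, so the function name `consensus_calls`, the `min_callers` threshold, and the toy call sets are all illustrative assumptions, not Cake's actual method.

```python
from collections import Counter


def consensus_calls(call_sets, min_callers=2):
    """Return variants reported by at least `min_callers` of the call sets.

    Each variant is a (chrom, pos, ref, alt) tuple. The voting threshold
    is a hypothetical choice made for this sketch.
    """
    votes = Counter()
    for calls in call_sets:
        # Deduplicate so each caller votes at most once per variant.
        votes.update(set(calls))
    return sorted(v for v, n in votes.items() if n >= min_callers)


# Toy call sets standing in for the output of four callers.
caller_a = [("chr1", 100, "A", "T"), ("chr2", 50, "G", "C")]
caller_b = [("chr1", 100, "A", "T")]
caller_c = [("chr2", 50, "G", "C"), ("chr3", 7, "C", "A")]
caller_d = [("chr1", 100, "A", "T")]

shared = consensus_calls([caller_a, caller_b, caller_c, caller_d])
```

    Raising `min_callers` trades sensitivity for precision: a variant seen by a single caller is dropped, while one seen by most callers is kept.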

  14. An ethnically relevant consensus Korean reference genome is a step towards personal reference genomes

    PubMed Central

    Cho, Yun Sung; Kim, Hyunho; Kim, Hak-Min; Jho, Sungwoong; Jun, JeHoon; Lee, Yong Joo; Chae, Kyun Shik; Kim, Chang Geun; Kim, Sangsoo; Eriksson, Anders; Edwards, Jeremy S.; Lee, Semin; Kim, Byung Chul; Manica, Andrea; Oh, Tae-Kwang; Church, George M.; Bhak, Jong

    2016-01-01

    Human genomes are routinely compared against a universal reference. However, this strategy could miss population-specific and personal genomic variations, which may be detected more efficiently using an ethnically relevant or personal reference. Here we report a hybrid assembly of a Korean reference genome (KOREF) for constructing personal and ethnic references by combining sequencing and mapping methods. We also build its consensus variome reference, providing information on millions of variants from 40 additional ethnically homogeneous genomes from the Korean Personal Genome Project. We find that the ethnically relevant consensus reference can be beneficial for efficient variant detection. Systematic comparison of human assemblies shows the importance of assembly quality, suggesting the necessity of new technologies to comprehensively map ethnic and personal genomic structure variations. In the era of large-scale population genome projects, the leveraging of ethnicity-specific genome assemblies as well as the human reference genome will accelerate mapping all human genome diversity. PMID:27882922

  15. An ethnically relevant consensus Korean reference genome is a step towards personal reference genomes.

    PubMed

    Cho, Yun Sung; Kim, Hyunho; Kim, Hak-Min; Jho, Sungwoong; Jun, JeHoon; Lee, Yong Joo; Chae, Kyun Shik; Kim, Chang Geun; Kim, Sangsoo; Eriksson, Anders; Edwards, Jeremy S; Lee, Semin; Kim, Byung Chul; Manica, Andrea; Oh, Tae-Kwang; Church, George M; Bhak, Jong

    2016-11-24

    Human genomes are routinely compared against a universal reference. However, this strategy could miss population-specific and personal genomic variations, which may be detected more efficiently using an ethnically relevant or personal reference. Here we report a hybrid assembly of a Korean reference genome (KOREF) for constructing personal and ethnic references by combining sequencing and mapping methods. We also build its consensus variome reference, providing information on millions of variants from 40 additional ethnically homogeneous genomes from the Korean Personal Genome Project. We find that the ethnically relevant consensus reference can be beneficial for efficient variant detection. Systematic comparison of human assemblies shows the importance of assembly quality, suggesting the necessity of new technologies to comprehensively map ethnic and personal genomic structure variations. In the era of large-scale population genome projects, the leveraging of ethnicity-specific genome assemblies as well as the human reference genome will accelerate mapping all human genome diversity.

  16. Bioinformatic progress and applications in metaproteogenomics for bridging the gap between genomic sequences and metabolic functions in microbial communities.

    PubMed

    Seifert, Jana; Herbst, Florian-Alexander; Halkjaer Nielsen, Per; Planes, Francisco J; Jehmlich, Nico; Ferrer, Manuel; von Bergen, Martin

    2013-10-01

    Metaproteomics of microbial communities promises to add functional information to the blueprint of genes derived from metagenomics. Right from its beginning, the achievements and developments in metaproteomics were closely interlinked with metagenomics. In addition, the evaluation, visualization, and interpretation of metaproteome data demanded developments in bioinformatics. This review will give an overview of recent strategies to use genomic data either from public databases or organismal specific genomes/metagenomes to increase the number of identified proteins obtained by mass spectrometric measurements. We will review different published metaproteogenomic approaches with respect to the MS pipeline and the protein identification workflow used. Furthermore, different approaches of data visualization and strategies for phylogenetic interpretation of metaproteome data are discussed as well as approaches for functional mapping of the results to the investigated biological systems. This information will in the end allow a comprehensive analysis of interactions and interdependencies within microbial communities.

  17. Personalized genomic disease risk of volunteers

    PubMed Central

    Gonzalez-Garay, Manuel L.; McGuire, Amy L.; Pereira, Stacey; Caskey, C. Thomas

    2013-01-01

    Next-generation sequencing (NGS) is commonly used for researching the causes of genetic disorders. However, its usefulness in clinical practice for medical diagnosis is in early development. In this report, we demonstrate the value of NGS for genetic risk assessment and evaluate the limitations and barriers for the adoption of this technology into medical practice. We performed whole exome sequencing (WES) on 81 volunteers, and for each volunteer, we requested personal medical histories, constructed a three-generation pedigree, and required their participation in a comprehensive educational program. We limited our clinical reporting to disease risks based on only rare damaging mutations and known pathogenic variations in genes previously reported to be associated with human disorders. We identified 271 recessive risk alleles (214 genes), 126 dominant risk alleles (101 genes), and 3 X-recessive risk alleles (3 genes). We linked personal disease histories with causative disease genes in 18 volunteers. Furthermore, by incorporating family histories into our genetic analyses, we identified an additional five heritable diseases. Traditional genetic counseling and disease education were provided in verbal and written reports to all volunteers. Our report demonstrates that when genome results are carefully interpreted and integrated with an individual’s medical records and pedigree data, NGS is a valuable diagnostic tool for genetic disease risk. PMID:24082139

  18. Integrated Bioinformatics, Environmental Epidemiologic and Genomic Approaches to Identify Environmental and Molecular Links between Endometriosis and Breast Cancer

    PubMed Central

    Roy, Deodutta; Morgan, Marisa; Yoo, Changwon; Deoraj, Alok; Roy, Sandhya; Yadav, Vijay Kumar; Garoub, Mohannad; Assaggaf, Hamza; Doke, Mayur

    2015-01-01

    We present a combined environmental epidemiologic, genomic, and bioinformatics approach to identify: exposure of environmental chemicals with estrogenic activity; epidemiologic association between endocrine disrupting chemical (EDC) and health effects, such as breast cancer or endometriosis; and gene-EDC interactions and disease associations. Human exposure measurement and modeling confirmed estrogenic activity of three selected classes of environmental chemicals: polychlorinated biphenyls (PCBs), bisphenols (BPs), and phthalates. Meta-analysis showed that exposure to PCBs, but not to Bisphenol A (BPA) or phthalates, increased the summary odds ratio for breast cancer and endometriosis. Bioinformatics analysis of gene-EDC interactions and disease associations identified several hundred genes that were altered by exposure to PCBs, phthalate or BPA. EDC-modified genes in breast neoplasms and endometriosis are part of steroid hormone signaling and inflammation pathways. All three EDCs (PCB 153, phthalates, and BPA) influenced five common genes (CYP19A1, EGFR, ESR2, FOS, and IGF1) in breast cancer as well as in endometriosis. These genes are environmentally and estrogen responsive, altered in human breast and uterine tumors and endometriosis lesions, and part of Mitogen Activated Protein Kinase (MAPK) signaling pathways in cancer. Our findings suggest that breast cancer and endometriosis share some common environmental and molecular risk factors. PMID:26512648

  19. ImageJS: Personalized, participated, pervasive, and reproducible image bioinformatics in the web browser

    PubMed Central

    Almeida, Jonas S.; Iriabho, Egiebade E.; Gorrepati, Vijaya L.; Wilkinson, Sean R.; Grüneberg, Alexander; Robbins, David E.; Hackney, James R.

    2012-01-01

    Background: Image bioinformatics infrastructure typically relies on a combination of server-side high-performance computing and client desktop applications tailored for graphic rendering. On the server side, matrix manipulation environments are often used as the back-end where deployment of specialized analytical workflows takes place. However, neither the server-side nor the client-side desktop solution, by themselves or combined, is conducive to the emergence of open, collaborative, computational ecosystems for image analysis that are both self-sustained and user driven. Materials and Methods: ImageJS was developed as a browser-based webApp, untethered from a server-side backend, by making use of recent advances in the modern web browser such as a very efficient compiler, high-end graphical rendering capabilities, and I/O tailored for code migration. Results: Multiple versioned code hosting services were used to develop distinct ImageJS modules to illustrate its amenability to collaborative deployment without compromise of reproducibility or provenance. The illustrative examples include modules for image segmentation, feature extraction, and filtering. The deployment of image analysis by code migration is in sharp contrast with the more conventional, heavier, and less safe reliance on data transfer. Accordingly, code and data are loaded into the browser by exactly the same script tag loading mechanism, which offers a number of interesting applications that would be hard to attain with more conventional platforms, such as NIH's popular ImageJ application. Conclusions: The modern web browser was found to be advantageous for image bioinformatics in both the research and clinical environments. This conclusion reflects advantages in deployment scalability and analysis reproducibility, as well as the critical ability to deliver advanced computational statistical procedures to machines where access to sensitive data is controlled, that is, without local “download and

  20. Genomic Discoveries and Personalized Medicine in Neurological Diseases.

    PubMed

    Zhang, Li; Hong, Huixiao

    2015-12-07

    In the past decades, we have witnessed dramatic changes in clinical diagnoses and treatments due to the revolutions of genomics and personalized medicine. Undoubtedly, we have also met many challenges in using those advanced technologies in drug discovery and development. In this review, we describe when genomic information is applied in personal healthcare in general. We illustrate some case examples of genomic discoveries and promising personalized medicine applications in the area of neurological diseases in particular. Available data suggest that individual genomics can be applied to better treat patients in the near future.

  1. Personalized medicine, genomics, and pharmacogenomics: a primer for nurses.

    PubMed

    Blix, Andrew

    2014-08-01

    Personalized medicine is the study of patients' unique environmental influences as well as the totality of their genetic code (their genome) to tailor personalized risk assessments, diagnoses, prognoses, and treatments. The study of how patients' genomes affect responses to medications, or pharmacogenomics, is a related field. Personalized medicine and genomics are particularly relevant in oncology because of the genetic basis of cancer. Nurses need to understand related issues such as the role of genetic and genomic counseling, the ethical and legal questions surrounding genomics, and the growing direct-to-consumer genomics industry. As genomics research is incorporated into health care, nurses need to understand the technology to provide advocacy and education for patients and their families.

  2. A Novel Bioinformatics Method for Efficient Knowledge Discovery by BLSOM from Big Genomic Sequence Data

    PubMed Central

    Iwasaki, Yuki; Kanaya, Shigehiko; Zhao, Yue; Ikemura, Toshimichi

    2014-01-01

    With the remarkable increase in genomic sequence data from a wide range of species, novel tools are needed for comprehensive analyses of the big sequence data. Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data such as oligonucleotide composition on one map. By modifying the conventional SOM, we have previously developed Batch-Learning SOM (BLSOM), which allows classification of sequence fragments according to species, solely depending on the oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM used for characterization of vertebrate genome sequences. We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes and then the compositions in the human and mouse genomes in order to investigate an efficient method for detecting differences between the closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, which is called a “genome signature,” and the regions specifically enriched in transcription-factor-binding sequences. Because the classification and visualization power is very high, BLSOM is an efficient and powerful tool for extracting a wide range of information from massive amounts of genomic sequences (i.e., big sequence data). PMID:24804244
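
    The input to a map of this kind is an oligonucleotide composition vector per sequence window. The sketch below computes such vectors only; it does not implement BLSOM training. The function names, the handling of ambiguous bases, and the non-overlapping window scheme are assumptions for illustration, with `k=5` and a 100 kb window chosen to mirror the abstract.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=5):
    """Return the relative frequency of every A/C/G/T k-mer, in lexical order.

    k-mers containing other characters (e.g. N) are ignored, which is an
    assumption of this sketch.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def windows(seq, size=100_000):
    """Yield consecutive non-overlapping windows of `size` bases."""
    for start in range(0, len(seq) - size + 1, size):
        yield seq[start:start + size]


# Toy usage with dinucleotides on a short sequence.
vec = kmer_frequencies("ACGTACGTAC", k=2)
```

    With `k=5` each window yields a 1,024-dimensional vector (4^5 pentanucleotides), and these vectors are what a clustering method would then place on the map.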

  3. New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome.

    PubMed

    Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton

    2015-01-01

    Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of the human genome. Deep sequencing technologies provide clues to understanding the functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed the genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LRs database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of "domestication" of ERV/LR TFBS by the genome milieu including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci.

  4. A bioinformatics approach to reanalyze the genome annotation of kinetoplastid protozoan parasite Leishmania donovani.

    PubMed

    Pawar, Harsh; Kulkarni, Aditi; Dixit, Tanwi; Chaphekar, Deepa; Patole, Milind S

    2014-12-01

    Leishmania donovani is a kinetoplastid protozoan parasite which causes the fatal disease visceral leishmaniasis in humans. Genome sequencing of L. donovani revealed information about the arrangement of genes and genome architecture. After curation of the genome sequence, many genes in L. donovani were assigned as truncated or "partial" genes by the genome sequencing group. In the present study, we have carried out an extensive analysis and attempted to improve the gene models of these partial genes. Our analysis resulted in the identification of 308 partial genes in L. donovani, which were further categorized as C-terminal extensions, joining of genes, tandemly repeated paralogs and wrong chromosomal assignments. We have analyzed each of these genes from these categories and have improved the annotation of existing gene models in L. donovani. Some of these corrections have been confirmed by mass spectrometry derived peptide data from our previous comparative proteogenomics study in L. donovani.

  5. Genome re-sequencing and bioinformatics analysis of a nutraceutical rice.

    PubMed

    Lin, Juncheng; Cheng, Zuxin; Xu, Ming; Huang, Zhiwei; Yang, Zhijian; Huang, Xinying; Zheng, Jingui; Lin, Tongxiang

    2015-06-01

    The genomes of two rice cultivars, Nipponbare and 93-11, have been well studied. However, there is little available genetic information about nutraceutical rice cultivars. To remedy this situation, the present study aimed to provide a basic genetic landscape of nutraceutical rice. The genome of Black-1, a black pericarp rice with higher levels of anthocyanins and flavonoids and a more potent antioxidant capacity, was sequenced at ≥30× coverage using Solexa sequencing technology. The complete sequences of Black-1 genome shared more consensus sequences with indica cultivar 93-11 than with Nipponbare. With reference to the 93-11 genome, Black-1 contained 675,207 single-nucleotide polymorphisms, 43,130 insertions and deletions (1-5 bp), 1,770 copy number variations, and 10,911 presence/absence variations. These variations were observed to reside preferentially in Myb domains, NB-ARC domains and kinase domains, providing clues to the diversity of biological functions or secondary metabolisms in this cultivar. Intriguingly, 496 unique genes were identified by comparing it with the genomes of these two rice varieties; among the genes, 119 genes participate in the biosynthesis of secondary metabolites. Furthermore, several unique genes were predicted to be involved in the anthocyanin synthesis pathway. The genome-wide landscape of Black-1 uncovered by this study represents a valuable resource for further studies and for breeding nutraceutical rice varieties.

  6. Personal utility in genomic testing: is there such a thing?

    PubMed

    Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N

    2015-04-01

    In ethical and regulatory discussions on new applications of genomic testing technologies, the notion of 'personal utility' has been mentioned repeatedly. It has been used to justify direct access to commercially offered genomic testing or feedback of individual research results to research or biobank participants. Sometimes research participants or consumers claim a right to genomic information with an appeal to personal utility. As of yet, no systematic account of the umbrella notion of personal utility has been given. This paper offers a definition of personal utility that places it in the middle of the spectrum between clinical utility and personal perceptions of utility, and that acknowledges its normative charge. The paper discusses two perspectives on personal utility, the healthcare perspective and the consumer perspective, and argues that these are too narrow and too wide, respectively. Instead, it proposes a normative definition of personal utility that postulates information and potential use as necessary conditions of utility. This definition entails that perceived utility does not equal personal utility, and that expert judgment may be necessary to help determine whether a genomic test can have personal utility for someone. Two examples of genomic tests are presented to illustrate the discrepancies between perceived utility and our proposed definition of personal utility. The paper concludes that while there is room for the notion of personal utility in the ethical evaluation and regulation of genomic tests, the justificatory role of personal utility is not unlimited. For in the absence of clinical validity and reasonable potential use of information, there is no personal utility.

  7. Genomic Analysis of a Marine Bacterium: Bioinformatics for Comparison, Evaluation, and Interpretation of DNA Sequences

    PubMed Central

    Khobragade, Chandrahasya N.

    2016-01-01

    A total of five highly related strains of an unidentified marine bacterium were analyzed through their short genome sequences (AM260709–AM260713). The Genome-to-Genome Distance Calculator (GGDC) showed high similarity to Pseudoalteromonas haloplanktis (X67024). The generated unique Quick Response (QR) codes indicated no identity to other microbial species or gene sequences. Chaos Game Representation (CGR) showed the number of bases concentrated in the area. Guanine residues were highest in number followed by cytosine. Frequency of Chaos Game Representation (FCGR) indicated that CC and GG blocks have higher frequency in the sequence from the evaluated marine bacterium strains. Maximum GC content for the marine bacterium strains ranged from 53 to 54%. The use of QR codes, CGR, FCGR, and GC dataset helped in identifying and interpreting short genome sequences from specific isolates. A phylogenetic tree was constructed with the bootstrap test (1000 replicates) using MEGA6 software. Principal Component Analysis (PCA) was carried out using the EMBL-EBI MUSCLE program. Thus, the generated genomic data are of great assistance for hierarchical classification in Bacterial Systematics, which, combined with phenotypic features, represents a basic procedure for a polyphasic approach on unambiguous bacterial isolate taxonomic classification. PMID:27882328
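
    For readers unfamiliar with the measures named above, the following is a minimal sketch of a Chaos Game Representation, a dinucleotide count (the k=2 case of a frequency CGR), and GC content. The corner layout (A bottom-left, C top-left, G top-right, T bottom-right) is one common convention and is assumed here; the abstract does not state which layout or software the authors used, and all function names are hypothetical.

```python
from collections import Counter

# Assumed corner assignment on the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return CGR coordinates: each base moves the point halfway to its corner."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous bases such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides such as CC and GG."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def gc_content(seq):
    """Return the fraction of A/C/G/T bases that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


points = cgr_points("AG")
gc = gc_content("GGCCAT")
```

    Plotting the points for a whole sequence gives the CGR image; regions of the square that fill densely correspond to over-represented oligonucleotides.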

  8. Personalized Genomic Medicine with a Patchwork, Partially Owned Genome

    PubMed Central

    Mason, Christopher E.; Seringhaus, Michael R.; Sattler de Sousa e Brito, Clara

    2008-01-01

    “His book was known as the Book of Sand, because neither the book nor the sand have any beginning or end.” — Jorge Luis Borges The human genome is a three billion-letter recipe for the genesis of a human being, directing development from a single-celled embryo to the trillions of adult cells. Since the sequencing of the human genome was announced in 2001, researchers have an increased ability to discern the genetic basis for diseases. This reference genome has opened the door to genomic medicine, aimed at detecting and understanding all genetic variations of the human genome that contribute to the manifestation and progression of disease. The overarching vision of genomic (or “personalized”) medicine is to custom-tailor each treatment for maximum effectiveness in an individual patient. Detecting the variation in a patient’s deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and protein structures is no longer an insurmountable hurdle. Today, the challenge for genomic medicine lies in contextualizing those myriad genetic variations in terms of their functional consequences for a person’s health and development throughout life and in terms of that patient’s susceptibility to disease and differential clinical responses to medication. Additionally, several recent developments have complicated our understanding of the nominal human genome and, thereby, altered the progression of genomic medicine. In this brief review, we shall focus on these developments and examine how they are changing our understanding of our genome. PMID:18449389

  9. [Personal genomics: are we debating the right issues?].

    PubMed

    Vayena, E; Mauch, F

    2012-07-25

    The debate about personal genomics and their role in personalized medicine has been, to some extent, hijacked by the controversy about commercially available genomic tests sold directly to consumers. The clinical validity and utility of such tests are currently limited and most medical associations recommend that consumers refrain from testing. Conversely, DTC genomics proponents and particularly the DTC industry argue that there is personal utility in acquiring genomic information. While it is necessary to debate risks and benefits of DTC genomics, we should not lose sight of the increasingly important role that genomics will play in medical practice and public health. Therefore, and in anticipation of this shift, we also need to focus on important implications from the use of genomics information such as genetic discrimination, privacy protection and equitable access to health care. Undoubtedly, personal genomics will challenge our social norms maybe more than our medicine. Sticking to the polarization of «to have or not to have DTC genomics» risks taking us away from the critical issues we need to be debating.

  10. "A system biology" approach to bioinformatics and functional genomics in complex human diseases: arthritis.

    PubMed

    Attur, M G; Dave, M N; Tsunoyama, K; Akamatsu, M; Kobori, M; Miki, J; Abramson, S B; Katoh, M; Amin, A R

    2002-10-01

    Human and other annotated genome sequences have facilitated the generation of vast amounts of correlative data from human/animal genetics and from normal and disease-affected tissues in complex diseases such as arthritis, using gene/protein chips and SNP analysis. These data sets include genes/proteins whose functions are partially known at the cellular level or may be completely unknown (e.g. ESTs). Thus, genomic research has transformed molecular biology from a "data poor" to a "data rich" science, allowing further division into subpopulations of subcellular fractions, which are often given an "-omic" suffix. These disciplines have to converge at a systemic level to examine the structure and dynamics of cellular and organismal function. The challenge of characterizing ESTs linked to complex diseases is like interpreting sharp images on a blurred background and therefore requires a multidimensional screen for functional genomics ("functionomics") in tissues, mice and zebrafish models, which intertwines various approaches and readouts to study development and homeostasis of a system. In summary, the post-genomic era of functionomics will help to narrow the gap between correlative data and causative data through hypothesis-driven research using a system approach integrating "intercoms" of interacting and interdependent disciplines forming a unified whole, as described in this review for arthritis.

  11. Bioinformatic genome comparisons for taxonomic and phylogenic assignments using Aeromonas as a test case

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Prokaryotic taxonomy is the underpinning of microbiology, providing a framework for the proper identification and naming of organisms. The 'gold standard' of bacterial species delineation is the overall genome similarity as determined by DNA-DNA hybridization (DDH), a technically rigorous yet someti...

  12. A bioinformatic approach to understanding antibiotic resistance in intracellular bacteria through whole genome analysis.

    PubMed

    Biswas, Silpak; Raoult, Didier; Rolain, Jean-Marc

    2008-09-01

    Intracellular bacteria survive within eukaryotic host cells and are difficult to kill with certain antibiotics. As a result, antibiotic resistance in intracellular bacteria is becoming commonplace in healthcare institutions. Owing to the lack of methods available for transforming these bacteria, we evaluated the mechanisms of resistance using molecular methods and in silico genome analysis. The objective of this review was to understand the molecular mechanisms of antibiotic resistance through in silico comparisons of the genomes of obligate and facultative intracellular bacteria. The available data on in vitro mutants reported for intracellular bacteria were also reviewed. These genomic data were analysed to find natural mutations in known target genes involved in antibiotic resistance and to look for the presence or absence of different resistance determinants. Our analysis revealed the presence of tetracycline resistance protein (Tet) in Bartonella quintana, Francisella tularensis and Brucella ovis; moreover, most of the Francisella strains possessed the blaA gene, AmpG protein and metallo-beta-lactamase family protein. The presence or absence of folP (dihydropteroate synthase) and folA (dihydrofolate reductase) genes in the genome could explain natural resistance to co-trimoxazole. Finally, multiple genes encoding different efflux pumps were studied. This in silico approach was an effective method for understanding the mechanisms of antibiotic resistance in intracellular bacteria. The whole genome sequence analysis will help to predict several important phenotypic characteristics, in particular resistance to different antibiotics. In the future, stable mutants should be obtained through transformation methods in order to demonstrate experimentally the determinants of resistance in intracellular bacteria.

  13. Personal Genomic Information Management and Personalized Medicine: Challenges, Current Solutions, and Roles of HIM Professionals

    PubMed Central

    Alzu'bi, Amal; Zhou, Leming; Watzlaf, Valerie

    2014-01-01

    In recent years, the term personalized medicine has received more and more attention in the field of healthcare. The increasing use of this term is closely related to the astonishing advancement in DNA sequencing technologies and other high-throughput biotechnologies. A large amount of personal genomic data can be generated by these technologies in a short time. Consequently, the need to manage, analyze, and interpret these personal genomic data to facilitate personalized care has escalated. In this article, we discuss the challenges for implementing genomics-based personalized medicine in healthcare, current solutions to these challenges, and the roles of health information management (HIM) professionals in genomics-based personalized medicine. PMID:24808804

  14. Teaching Synthetic Biology, Bioinformatics and Engineering to Undergraduates: The Interdisciplinary Build-a-Genome Course

    PubMed Central

    Dymond, Jessica S.; Scheifele, Lisa Z.; Richardson, Sarah; Lee, Pablo; Chandrasegaran, Srinivasan; Bader, Joel S.; Boeke, Jef D.

    2009-01-01

    A major challenge in undergraduate life science curricula is the continual evaluation and development of courses that reflect the constantly shifting face of contemporary biological research. Synthetic biology offers an excellent framework within which students may participate in cutting-edge interdisciplinary research and is therefore an attractive addition to the undergraduate biology curriculum. This new discipline offers the promise of a deeper understanding of gene function, gene order, and chromosome structure through the de novo synthesis of genetic information, much as synthetic approaches informed organic chemistry. While considerable progress has been achieved in the synthesis of entire viral and prokaryotic genomes, fabrication of eukaryotic genomes requires synthesis on a scale that is orders of magnitude higher. These high-throughput but labor-intensive projects serve as an ideal way to introduce undergraduates to hands-on synthetic biology research. We are pursuing synthesis of Saccharomyces cerevisiae chromosomes in an undergraduate laboratory setting, the Build-a-Genome course, thereby exposing students to the engineering of biology on a genomewide scale while focusing on a limited region of the genome. A synthetic chromosome III sequence was designed, ordered from commercial suppliers in the form of oligonucleotides, and subsequently assembled by students into ∼750-bp fragments. Once trained in assembly of such DNA “building blocks” by PCR, the students accomplish high-yield gene synthesis, becoming not only technically proficient but also constructively critical and capable of adapting their protocols as independent researchers. Regular “lab meeting” sessions help prepare them for future roles in laboratory science. PMID:19015540

  15. Air Force Genomics, Proteomics, Bioinformatics System, DataCap-Data Collection Module. Phase 1: Development

    DTIC Science & Technology

    2004-07-01

    exist as a series of isolated computational silos, providing a depth of data in a narrow field of research. The Acero Genomics Knowledge Platform (GKP...on top of the Acero Platform. The purpose of the DataCap is to provide the individual researcher with the ability to collect experimental data in a...integrated format compatible with the Acero GKP. This technical report covers the architecture, the design and the operation of the DataCap in its

  16. Teaching synthetic biology, bioinformatics and engineering to undergraduates: the interdisciplinary Build-a-Genome course.

    PubMed

    Dymond, Jessica S; Scheifele, Lisa Z; Richardson, Sarah; Lee, Pablo; Chandrasegaran, Srinivasan; Bader, Joel S; Boeke, Jef D

    2009-01-01

    A major challenge in undergraduate life science curricula is the continual evaluation and development of courses that reflect the constantly shifting face of contemporary biological research. Synthetic biology offers an excellent framework within which students may participate in cutting-edge interdisciplinary research and is therefore an attractive addition to the undergraduate biology curriculum. This new discipline offers the promise of a deeper understanding of gene function, gene order, and chromosome structure through the de novo synthesis of genetic information, much as synthetic approaches informed organic chemistry. While considerable progress has been achieved in the synthesis of entire viral and prokaryotic genomes, fabrication of eukaryotic genomes requires synthesis on a scale that is orders of magnitude higher. These high-throughput but labor-intensive projects serve as an ideal way to introduce undergraduates to hands-on synthetic biology research. We are pursuing synthesis of Saccharomyces cerevisiae chromosomes in an undergraduate laboratory setting, the Build-a-Genome course, thereby exposing students to the engineering of biology on a genomewide scale while focusing on a limited region of the genome. A synthetic chromosome III sequence was designed, ordered from commercial suppliers in the form of oligonucleotides, and subsequently assembled by students into approximately 750-bp fragments. Once trained in assembly of such DNA "building blocks" by PCR, the students accomplish high-yield gene synthesis, becoming not only technically proficient but also constructively critical and capable of adapting their protocols as independent researchers. Regular "lab meeting" sessions help prepare them for future roles in laboratory science.

  17. Personalized- and one- medicine: bioinformatics foundation in health and its economic feasibility.

    PubMed

    Stefano, George B; Kream, Richard M

    2015-01-15

    Personalized medicine's foundation rests on the use of molecular technologies, which are being used to identify genetic mutations, polymorphisms, and variants that can be associated with an individual's genetic makeup, revealing risk factors and predictive data. Needless to say, this same analysis can be performed on various types of cancers, including samples stored for many years under the right conditions. For the most part, these technologies employ microarray and RNA-Seq methodologies, which examine the expression of large numbers of genes at a time, providing clustering and patterns of this expression. The methodologies and their evaluative outcomes further demonstrate that more than a single gene is involved with various phenomena. However, given the mass of data emerging from this analysis, and the commonalities they reveal between various phenomena/disorders, achieving 100% certainty may not be that easy. Another outcome from this massive store of molecular data is the concept of one medicine. This field has been developed by researchers in a variety of disciplines (e.g., medical and veterinary science) who advocate for greater integration of animal and human health. One medicine takes advantage of the fact that molecular commonalities in major biochemical pathways occur because of evolutionary conservation, which is dependent on stereospecificity. In this regard, the foci of personalized medicine and one medicine are quite broad and require trained professionals, as well as a lowering of cost in order to be better integrated into mainstream medical practice.

  18. A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes

    PubMed Central

    2013-01-01

    Background: Ribonucleotide reductase (RNR), the enzyme responsible for the formation of deoxyribonucleotides from ribonucleotides, is found in all domains of life and many viral genomes. RNRs are also amongst the most abundant genes identified in environmental metagenomes. This study focused on understanding the distribution, diversity, and evolution of RNRs in phages (viruses that infect bacteria). Hidden Markov Model profiles were used to analyze the proteins encoded by 685 completely sequenced double-stranded DNA phages and 22 environmental viral metagenomes to identify RNR homologs in cultured phages and uncultured viral communities, respectively. Results: RNRs were identified in 128 phage genomes, nearly tripling the number of phages known to encode RNRs. Class I RNR was the most common RNR class observed in phages (70%), followed by class II (29%) and class III (28%). Twenty-eight percent of the phages contained genes belonging to multiple RNR classes. RNR class distribution varied according to phage type, isolation environment, and the host’s ability to utilize oxygen. The majority of the phages containing RNRs are Myoviridae (65%), followed by Siphoviridae (30%) and Podoviridae (3%). The phylogeny and genomic organization of phage and host RNRs reveal several distinct evolutionary scenarios involving horizontal gene transfer, co-evolution, and differential selection pressure. Several putative split RNR genes interrupted by self-splicing introns or inteins were identified, providing further evidence for the role of frequent genetic exchange. Finally, viral metagenomic data indicate that RNRs are prevalent and highly dynamic in uncultured viral communities, necessitating future research to determine the environmental conditions under which RNRs provide a selective advantage. Conclusions: This comprehensive study describes the distribution, diversity, and evolution of RNRs in phage genomes and environmental viral metagenomes. The distinct distributions of

  19. Bioinformatic evidence and characterization of novel putative large conjugative transposons residing in genomes of genera Bacteroides and Prevotella.

    PubMed

    Gorenc, Katja; Accetto, Tomaž; Avguštin, Gorazd

    2012-07-01

    Bioinformatic evidence of the presence of a large conjugative transposon in ruminal bacterium Prevotella bryantii B(1)4(T) is presented. The described transposon appears to be related to another large conjugative transposon CTnBST, described in Bacteroides uniformis WH207 and to the conjugative transposon CTn3-Bf, which was observed in the genome of Bacteroides fragilis strain YCH46. All three transposons share tra gene regions with high amino acid identity and clearly conserved gene order. Additionally, a second conserved region consisting of hypothetical genes was discovered in all three transposons and named the GG region. This region served as a specific sequence signature and made possible the discovery of several other apparently related hypothetical conjugative transposons in bacteria from the genus Bacteroides. A cluster of genes involved in sugar utilization and metabolism was discovered within the hypothetical CTnB(1)4, to a certain extent resembling the polysaccharide utilization loci which were described recently in some Bacteroides strains. This is the first firm report on the presence of a large mobile genetic element in any strain from the genus Prevotella.

  20. Genome mining of mycosporine-like amino acid (MAA) synthesizing and non-synthesizing cyanobacteria: A bioinformatics study.

    PubMed

    Singh, Shailendra P; Klisch, Manfred; Sinha, Rajeshwar P; Häder, Donat-P

    2010-02-01

    Mycosporine-like amino acids (MAAs) are a family of more than 20 compounds having absorption maxima between 310 and 362 nm. These compounds are well known for their UV-absorbing/screening role in various organisms and seem to have evolutionary significance. In the present investigation we tested four cyanobacteria, namely Anabaena variabilis PCC 7937, Anabaena sp. PCC 7120, Synechocystis sp. PCC 6803 and Synechococcus sp. PCC 6301, for their ability to synthesize MAA and conducted genomic and phylogenetic analysis to identify the possible set of genes that might be involved in the biosynthesis of these compounds. Out of the four investigated species, only A. variabilis PCC 7937 was able to synthesize MAA. Genome mining identified a combination of genes, YP_324358 (predicted DHQ synthase) and YP_324357 (O-methyltransferase), which were present only in A. variabilis PCC 7937 and missing in the other studied cyanobacteria. Phylogenetic analysis revealed that these two genes were transferred from a cyanobacterial donor to dinoflagellates and finally to metazoa by a lateral gene transfer event. All other cyanobacteria that have these two genes also had another copy of the DHQ synthase gene. The predicted protein structure for YP_324358 also suggested that this product is different from the chemically characterized DHQ synthase of Aspergillus nidulans, in contrast to YP_324879, which was predicted to be similar to the DHQ synthase. The present study provides a first insight into the genes of cyanobacteria involved in MAA biosynthesis and thus widens the field of research for molecular, bioinformatics and phylogenetic analysis of these evolutionarily and industrially important compounds. Based on the results we propose that the YP_324358 and YP_324357 gene products are involved in the biosynthesis of the common core (deoxygadusol) of all MAAs.

  1. AnnoTALE: bioinformatics tools for identification, annotation, and nomenclature of TALEs from Xanthomonas genomic sequences

    PubMed Central

    Grau, Jan; Reschke, Maik; Erkes, Annett; Streubel, Jana; Morgan, Richard D.; Wilson, Geoffrey G.; Koebnik, Ralf; Boch, Jens

    2016-01-01

    Transcription activator-like effectors (TALEs) are virulence factors, produced by the bacterial plant-pathogen Xanthomonas, that function as gene activators inside plant cells. Although the contribution of individual TALEs to infectivity has been shown, the specific roles of most TALEs and the overall TALE diversity in Xanthomonas spp. are not known. TALEs possess a highly repetitive DNA-binding domain, which is notoriously difficult to sequence. Here, we describe an improved method for characterizing TALE genes by the use of PacBio sequencing. We present ‘AnnoTALE’, a suite of applications for the analysis and annotation of TALE genes from Xanthomonas genomes, and for grouping similar TALEs into classes. Based on these classes, we propose a unified nomenclature for Xanthomonas TALEs that reveals similarities pointing to related functionalities. This new classification enables us to compare related TALEs and to identify base substitutions responsible for the evolution of TALE specificities. PMID:26876161

  2. [Genome-wide identification and bioinformatic analysis of PPR gene family in tomato].

    PubMed

    Ding, Anming; Li, Ling; Qu, Xu; Sun, Tingting; Chen, Yaqiong; Zong, Peng; Li, Zunqiang; Gong, Daping; Sun, Yuhe

    2014-01-01

    Pentatricopeptide repeat (PPR) genes constitute one of the largest gene families in plants and play a broad and essential role in plant growth and development. In this study, the protein sequences annotated by the tomato (S. lycopersicum L.) genome project were screened with the Pfam PPR sequences. A total of 471 putative PPR-encoding genes were identified. Based on the motifs defined in A. thaliana L., protein structure and conserved sequences for each tomato motif were analyzed. We also analyzed the phylogenetic relationships, subcellular localization, expression and GO annotation of the identified gene sequences. Our results demonstrate that the tomato PPR gene family contains two subfamilies, P and PLS, each accounting for half of the family. The PLS subfamily can be divided into four subclasses, i.e., PLS, E, E+ and DYW. Each subclass of sequences forms a clade in the phylogenetic tree. The PPR motifs were found to be highly conserved among plants. The tomato PPR genes were distributed over 12 chromosomes and most of them lack introns. The majority of PPR proteins harbor mitochondrial or chloroplast localization sequences, whereas GO analysis showed that most PPR proteins participate in RNA-related biological processes.

  3. lobSTR: A short tandem repeat profiler for personal genomes

    PubMed Central

    Gymrek, Melissa; Golan, David; Rosset, Saharon; Erlich, Yaniv

    2012-01-01

    Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat STR mapping as gapped alignment, which results in cumbersome processing times and a biased sampling of STR alleles. Here, we present lobSTR, a novel method for profiling STRs in personal genomes. lobSTR harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address the specific noise patterns in STR calling. The speed and reliability of lobSTR exceed the performance of current mainstream algorithms for STR profiling. We validated lobSTR's accuracy by measuring its consistency in calling STRs from whole-genome sequencing of two biological replicates from the same individual, by tracing Mendelian inheritance patterns in STR alleles in whole-genome sequencing of a HapMap trio, and by comparing lobSTR results to traditional molecular techniques. Encouraged by the speed and accuracy of lobSTR, we used the algorithm to conduct a comprehensive survey of STR variations in a deeply sequenced personal genome. We traced the mutation dynamics of close to 100,000 STR loci and observed more than 50,000 STR variations in a single genome. lobSTR's implementation is an end-to-end solution. The package accepts raw sequencing reads and provides the user with the genotyping results. It is written in C/C++, includes multi-threading capabilities, and is compatible with the BAM format. PMID:22522390
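
    To make concrete what an STR call records (a position, a repeat unit, and a copy number), here is a minimal Python sketch. It is a naive scan for perfect tandem repeats using regular expressions, not lobSTR's method, which the abstract describes only as drawing on signal processing and statistical learning. The function name `find_strs`, its thresholds, and the example read are illustrative assumptions.

```python
import re


def find_strs(read, min_unit=2, max_unit=6, min_copies=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Naive illustration only: real reads contain sequencing errors and
    interrupted repeats, which this scan ignores.
    """
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-base unit followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(read):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif
            # (e.g. "ACAC" is already reported as "AC").
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical read: five copies of AC, then four copies of GATA.
example_read = "TTG" + "AC" * 5 + "GGC" + "GATA" * 4 + "CC"
calls = find_strs(example_read)
```

    A profiler for real sequencing data would additionally need to anchor each repeat to its flanking sequence and model stutter noise, which is the part of the problem the abstract says lobSTR addresses.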

  4. Personal Genomic Testing for Cancer Risk: Results From the Impact of Personal Genomics Study.

    PubMed

    Gray, Stacy W; Gollust, Sarah E; Carere, Deanna Alexis; Chen, Clara A; Cronin, Angel; Kalia, Sarah S; Rana, Huma Q; Ruffin, Mack T; Wang, Catharine; Roberts, J Scott; Green, Robert C

    2017-02-20

    Purpose: Significant concerns exist regarding the potential for unwarranted behavior changes and the overuse of health care resources in response to direct-to-consumer personal genomic testing (PGT). However, little is known about customers' behaviors after PGT. Methods: Longitudinal surveys were given to new customers of 23andMe (Mountain View, CA) and Pathway Genomics (San Diego, CA). Survey data were linked to individual-level PGT results through a secure data transfer process. Results: Of the 1,042 customers who completed baseline and 6-month surveys (response rate, 71.2%), 762 had complete cancer-related data and were analyzed. Most customers reported that learning about their genetic risk of cancers was a motivation for testing (colorectal, 88%; prostate, 95%; breast, 94%). No customers tested positive for pathogenic mutations in highly penetrant cancer susceptibility genes. A minority of individuals received elevated single nucleotide polymorphism-based PGT cancer risk estimates (colorectal, 24%; prostate, 24%; breast, 12%). At 6 months, customers who received elevated PGT cancer risk estimates were not significantly more likely to change their diet, exercise, or advanced planning behaviors or engage in cancer screening, compared with individuals at average or reduced risk. Men who received elevated PGT prostate cancer risk estimates changed their vitamin and supplement use more than those at average or reduced risk (22% v 7.6%, respectively; adjusted odds ratio, 3.41; 95% CI, 1.44 to 8.18). Predictors of 6-month behavior include baseline behavior (exercise, vitamin or supplement use, and screening), worse health status (diet and vitamin or supplement use), and older age (advanced planning, screening). Conclusion: Most adults receiving elevated direct-to-consumer PGT single nucleotide polymorphism-based cancer risk estimates did not significantly change their diet, exercise, advanced care planning, or cancer screening behaviors.
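
    As a rough sanity check on the reported effect size, the two proportions quoted above (22% v 7.6%) can be turned into a crude, unadjusted odds ratio. The Python sketch below does only that arithmetic; the function name `odds_ratio` is an illustrative assumption, and the result is not the study's adjusted estimate, which comes from a model with covariates the abstract does not list.

```python
def odds_ratio(p_exposed, p_reference):
    """Crude odds ratio from two proportions (no covariate adjustment)."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_exposed / odds_reference


# Proportions changing vitamin/supplement use, as quoted in the abstract:
# 22% among men with elevated estimates, 7.6% among the comparison group.
crude_or = odds_ratio(0.22, 0.076)
print(round(crude_or, 2))
```

    The crude value comes out at roughly 3.4, in the same range as the adjusted odds ratio of 3.41 reported in the abstract. The confidence interval cannot be reproduced this way because the abstract does not give the group sizes.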

  5. Pharmacogenetics--genomics and personalized psychiatry.

    PubMed

    Möller, H J; Rujescu, D

    2010-06-01

    Pharmacogenetic influences on therapeutic response to, e.g., antidepressant or neuroleptic treatment are poorly understood, and the lack of efficacy in many patients, together with side effects, can limit both therapy and compliance. Thus the aim of pharmacogenetics and pharmacogenomics is to provide predictive tools for the response to psychopharmacologic agents in the therapy of psychiatric disorders and in that way to provide a truly personalized psychiatry. The following review will summarize the current stage of pharmacogenetics and pharmacogenomics and will critically discuss the possibilities of a personalized medicine.

  6. A security system for personal genome information at DNA level.

    PubMed

    Kawazoe, Yumi; Shiba, Toshikazu; Yamamoto, Masahito; Ohuchi, Azuma

    2002-01-01

    The personal information encoded in genomic DNA should not be made available to the public. With the increasing discoveries of new genes, it has become necessary to establish a security system for personal genome information. Although many security systems that are applied to electronic information in computers have been developed and established, there is no security system for information at the DNA level. In this paper, we describe a new security system for information encoded within DNA. The original genomic DNA was mixed with many kinds of dummy DNAs (mixtures of natural and/or artificial DNAs), resulting in the masking of the original information. Using these dummy molecules, we succeeded in completely 'locking' the original genome information. If this information must be 'unlocked', it can be extracted and analyzed by removal of the dummy DNAs using molecular tagging techniques or by selective amplification using key primers.
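
    The lock/unlock idea can be illustrated with a small in-silico analogy in Python. This is a toy model under stated assumptions, not the authors' laboratory protocol: fragments are plain strings, the "key primer" is modeled as a secret prefix attached to the real fragments, and "selective amplification" is modeled as keeping only sequences that start with that prefix. All names (`lock`, `unlock`, `random_dna`) and the example sequences are hypothetical.

```python
import random

BASES = "ACGT"


def random_dna(rng, length):
    """Generate a random DNA string of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def lock(fragments, key_primer, n_dummies, rng):
    """Tag real fragments with the key primer and hide them among dummies."""
    tagged = [key_primer + fragment for fragment in fragments]
    dummy_length = len(key_primer) + len(fragments[0])
    dummies = []
    while len(dummies) < n_dummies:
        candidate = random_dna(rng, dummy_length)
        # A dummy must not carry the key primer, or it would be "amplified" too.
        if not candidate.startswith(key_primer):
            dummies.append(candidate)
    pool = tagged + dummies
    rng.shuffle(pool)
    return pool


def unlock(pool, key_primer):
    """Recover the fragments that carry the key primer, with the primer removed."""
    return [seq[len(key_primer):] for seq in pool if seq.startswith(key_primer)]


rng = random.Random(0)                       # fixed seed for a repeatable demo
secret_fragments = ["ACGTACGTAC", "TTGACCAGTA"]  # made-up example fragments
key = random_dna(rng, 12)                    # the secret "key primer"
mixture = lock(secret_fragments, key, n_dummies=50, rng=rng)
recovered = unlock(mixture, key)
```

    Without the key, every sequence in `mixture` looks equally plausible; with it, only the two tagged fragments are recovered. The wet-lab system relies on primer annealing and molecular tags rather than string prefixes, so this shows only the logic of masking and selective recovery.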

  7. Personalized genomic medicine: lessons from the exome.

    PubMed

    Solomon, Benjamin D; Pineda-Alvarez, Daniel E; Hadley, Donald W; NISC Comparative Sequencing Program; Teer, Jamie K; Cherukuri, Praveen F; Hansen, Nancy F; Cruz, Pedro; Young, Alice C; Blakesley, Robert W; Lanpher, Brendan; Mayfield Gibson, Stephanie; Sincan, Murat; Chandrasekharappa, Settara C; Mullikin, James C

    2011-01-01

    While genomic sequencing methods are powerful tools in the discovery of the genetic underpinnings of human disease, incidentally revealed novel genomic risk factors may be equally important, both scientifically and as they relate to direct patient care. We performed whole-exome sequencing on a child with VACTERL association who suffered severe post-surgical neonatal pulmonary hypertension, and identified a potential novel genetic risk factor for this complication: a heterozygous mutation in CPS1. Newborn screening results from this patient's monozygotic twin provided evidence that this mutation, in combination with an environmental trigger (in this case, surgery), may have resulted in pulmonary artery hypertension due to inadequate nitric oxide production. Identification of this genetic risk factor allows for targeted medical preventative measures in this patient as well as relatives with the same mutation, and illustrates the power of incidental medical information unearthed by whole-exome sequencing.

  8. Genomic-Bioinformatic Analysis of Transcripts Enriched in the Third-Stage Larva of the Parasitic Nematode Ascaris suum

    PubMed Central

    Huang, Cui-Qin; Gasser, Robin B.; Cantacessi, Cinzia; Nisbet, Alasdair J.; Zhong, Weiwei; Sternberg, Paul W.; Loukas, Alex; Mulvenna, Jason; Lin, Rui-Qing; Chen, Ning; Zhu, Xing-Quan

    2008-01-01

    Differential transcription in Ascaris suum was investigated using a genomic-bioinformatic approach. A cDNA archive enriched for molecules in the infective third-stage larva (L3) of A. suum was constructed by suppressive-subtractive hybridization (SSH), and a subset of cDNAs from 3075 clones subjected to microarray analysis using cDNA probes derived from RNA from different developmental stages of A. suum. The cDNAs (n = 498) shown by microarray analysis to be enriched in the L3 were sequenced and subjected to bioinformatic analyses using a semi-automated pipeline (ESTExplorer). Using gene ontology (GO), 235 of these molecules were assigned to ‘biological process’ (n = 68), ‘cellular component’ (n = 50), or ‘molecular function’ (n = 117). Of the 91 clusters assembled, 56 molecules (61.5%) had homologues/orthologues in the free-living nematodes Caenorhabditis elegans and C. briggsae and/or other organisms, whereas 35 (38.5%) had no significant similarity to any sequences available in current gene databases. Transcripts encoding protein kinases, protein phosphatases (and their precursors), and enolases were abundantly represented in the L3 of A. suum, as were molecules involved in cellular processes, such as ubiquitination and proteasome function, gene transcription, protein–protein interactions, and function. In silico analyses inferred the C. elegans orthologues/homologues (n = 50) to be involved in apoptosis and insulin signaling (2%), ATP synthesis (2%), carbon metabolism (6%), fatty acid biosynthesis (2%), gap junction (2%), glucose metabolism (6%), or porphyrin metabolism (2%), although 34 (68%) of them could not be mapped to a specific metabolic pathway. Small numbers of these 50 molecules were predicted to be secreted (10%), anchored (2%), and/or transmembrane (12%) proteins. Functionally, 17 (34%) of them were predicted to be associated with (non-wild-type) RNAi phenotypes in C. elegans, the majority being embryonic lethality

  9. Genome Science and Personalized Cancer Treatment

    ScienceCinema

    Gray, Joe

    2016-07-12

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  10. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-07

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  11. Genome-wide bioinformatics analysis of steroid metabolism-associated genes in Nocardioides simplex VKM Ac-2033D.

    PubMed

    Shtratnikova, Victoria Y; Schelkunov, Mikhail I; Fokina, Victoria V; Pekov, Yury A; Ivashina, Tanya; Donova, Marina V

    2016-08-01

    Actinobacteria comprise diverse groups of bacteria capable of full degradation or modification of different steroid compounds. Steroid catabolism has been characterized best for the representatives of suborder Corynebacterineae, such as Mycobacteria, Rhodococcus and Gordonia, with high content of mycolic acids in the cell envelope, while it is poorly understood for other steroid-transforming actinobacteria, such as representatives of the Nocardioides genus belonging to suborder Propionibacterineae. Nocardioides simplex VKM Ac-2033D is an important biotechnological strain which is known for its ability to introduce a ∆(1)-double bond in various 1(2)-saturated 3-ketosteroids, and to perform conversion of 3β-hydroxy-5-ene steroids to 3-oxo-4-ene steroids, hydrolysis of acetylated steroids, and reduction of carbonyl groups at C-17 and C-20 of androstanes and pregnanes, respectively. The strain is also capable of utilizing cholesterol and phytosterol as carbon and energy sources. In this study, a comprehensive bioinformatics genome-wide screening was carried out to predict genes related to steroid metabolism in this organism, their clustering and possible regulation. The predicted operon structure and the number of candidate gene copies (paralogs) have been estimated. Binding sites of the steroid catabolism regulators KstR and KstR2 specific to N. simplex VKM Ac-2033D have been calculated de novo. Most of the candidate genes grouped within three main clusters, one of the predicted clusters having no analogs in other actinobacteria studied so far. The results offer a base for further functional studies, expand the understanding of steroid catabolism by actinobacteria, and will contribute to the modification of metabolic pathways in order to generate effective biocatalysts capable of producing valuable bioactive steroids.

  12. Chemogenomics: a discipline at the crossroad of high throughput technologies, biomarker research, combinatorial chemistry, genomics, cheminformatics, bioinformatics and artificial intelligence.

    PubMed

    Maréchal, Eric

    2008-09-01

    Chemogenomics is the study of the interaction of functional biological systems with exogenous small molecules, or in a broader sense the study of the intersection of biological and chemical spaces. Chemogenomics requires expertise in biology, chemistry and computational sciences (bioinformatics, cheminformatics, large scale statistics and machine learning methods) but it is more than the simple apposition of each of these disciplines. Biological entities interacting with small molecules can be isolated proteins or more elaborate systems, from single cells to complete organisms. The biological space is therefore analyzed at various postgenomic levels (genomic, transcriptomic, proteomic or any phenotypic level). The space of small molecules is partially real, corresponding to commercial and academic collections of compounds, and partially virtual, corresponding to the chemical space possibly synthesizable. Synthetic chemistry has developed novel strategies allowing a physical exploration of this universe of possibilities. A major challenge of cheminformatics is to chart the virtual space of small molecules using realistic biological constraints (bioavailability, druggability, structural biological information). Chemogenomics is a descendant of conventional pharmaceutical approaches, since it involves the screening of chemolibraries for their effect on biological targets, and benefits from the advances in the corresponding enabling technologies and the introduction of new biological markers. Screening was originally motivated by the rigorous discovery of new drugs, neglecting and throwing away any molecule that would fail to meet the standards required for a therapeutic treatment. It is now the basis for the discovery of small molecules that might or might not be directly used as drugs, but which have an immense potential for basic research, as probes to explore an increasing number of biological phenomena. Concerns about the environmental impact of chemical industry…

  13. Implementing personalized cancer genomics in clinical trials.

    PubMed

    Simon, Richard; Roychowdhury, Sameek

    2013-05-01

    The recent surge in high-throughput sequencing of cancer genomes has supported an expanding molecular classification of cancer. These studies have identified putative predictive biomarkers signifying aberrant oncogene pathway activation and may provide a rationale for matching patients with molecularly targeted therapies in clinical trials. Here, we discuss some of the challenges of adapting these data for rare cancers or molecular subsets of certain cancers, which will require aligning the availability of investigational agents, rapid turnaround of clinical grade sequencing, molecular eligibility and reconsidering clinical trial design and end points.

  14. Anticipation of Personal Genomics Data Enhances Interest and Learning Environment in Genomics and Molecular Biology Undergraduate Courses.

    PubMed

    Weber, K Scott; Jensen, Jamie L; Johnson, Steven M

    2015-01-01

    An important discussion at colleges is centered on determining more effective models for teaching undergraduates. As personalized genomics has become more common, we hypothesized it could be a valuable tool to make science education more hands-on, personal, and engaging for college undergraduates. We hypothesized that providing students with personal genome testing kits would enhance the learning experience of students in two undergraduate courses at Brigham Young University: Advanced Molecular Biology and Genomics. These courses have an emphasis on personal genomics during the last two weeks of the semester. Students taking these courses were given the option to receive personal genomics kits in 2014, whereas in 2015 they were not. Students sent their personal genomics samples in on their own and received the data after the course ended. We surveyed students in these courses before and after the two-week emphasis on personal genomics to collect data on whether anticipation of obtaining their own personal genomic data impacted undergraduate student learning. We also tested to see if specific personal genomic assignments improved the learning experience by analyzing the data from the undergraduate students who completed both the pre- and post-course surveys. Anticipation of personal genomic data significantly enhanced student interest and the learning environment based on the time students spent researching personal genomic material and their self-reported attitudes compared to those who did not anticipate getting their own data. Personal genomics homework assignments significantly enhanced the undergraduate student interest and learning based on the same criteria and a personal genomics quiz. We found that for the undergraduate students in both molecular biology and genomics courses, incorporation of personal genomic testing can be an effective educational tool in undergraduate science education.

  15. The clinical potential and challenges of sequencing cancer genomes for personalized medical genomics.

    PubMed

    Cloonan, Nicole; Waddell, Nic; Grimmond, Sean M

    2010-11-01

    Next-generation sequencing is revolutionizing the way in which genomic-scale biological research is performed, and its effects are beginning to be translated medically. Large-scale international collaborations for the comprehensive sequencing of the genome, epigenome, and transcriptomes of cancers and corresponding 'normal' (germ-line) DNA are heralding the start of personalized medical genomics. The promise of eliminating conjecture when determining treatment approaches is certainly appealing for both patients and clinicians; however, several major issues must be resolved before next-generation sequencing will be adopted as a routine clinical tool for patients. This feature review explores the clinical potential and challenges of studying cancer genomes for personalized medical genomics.

  16. Human genome and the african personality: implications for social work.

    PubMed

    Mickel, Elijah; Miller, Sheila D

    2011-01-01

    The integration of the human genome with the African personality should be viewed as an interdependent whole. The African personality, for purposes of this article, comprises Black experiences, Negritude, and an Africa-centered axiology and epistemology. The outcome results in a spiritually focused collective consciousness. Anthropologically, historically (and with the Human Genome Project), genetically, Africa has proven to be the source of all human life. Using the African personality, humankind, wherever it exists on the planet, must be viewed as interconnected. Although racism and its progeny, discrimination, predate the Human Genome Project (HGP), the human genome provides an evidence-based rationale for the end to all policy and subsequent practice based on race and racism. Policy must be based on evidence to be competent practice. It would be remiss if not irresponsible of social work and other behavioral scientists concerned with intervention and prevention behaviors to not infuse the findings of the HGP. The African personality is a concept that provides a holistic way to evaluate human behavior from an African worldview.

  17. Re-Examining the Gene in Personalized Genomics

    ERIC Educational Resources Information Center

    Bartol, Jordan

    2013-01-01

    Personalized genomics companies (PG; also called "direct-to-consumer genetics") are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept…

  18. Bioinformatics for Genome Analysis

    SciTech Connect

    Gary J. Olsen

    2005-06-30

    Nesbo, Boucher and Doolittle (2001) used phylogenetic trees of four taxa to assess whether euryarchaeal genes share a common history. They have suggested that of the 521 genes examined, each of the three possible tree topologies relating the four taxa was supported essentially equal numbers of times. They suggest that this might be the result of numerous horizontal gene transfer events, essentially randomizing the relationships between gene histories (as inferred in the 521 gene trees) and organismal relationships (which would be a single underlying tree). Motivated by the fact that the order in which sequences are added to a multiple sequence alignment influences the alignment, and ultimately the inferred tree, they were interested in the extent to which the variations among inferred trees might be due to variations in the alignment order. This bears directly on their efforts to evaluate and improve upon methods of multiple sequence alignment. They set out to analyze the influence of alignment order on the tree inferred for 43 genes shared among these same 4 taxa. Because alignments produced by CLUSTALW are directed by a rooted guide tree (the dendrogram), there are 15 possible alignment orders of 4 taxa. For each gene they tested all 15 alignment orders, and as a 16th option, allowed CLUSTALW to generate its own guide tree. Given all 15 possible rooted guide trees, they expected that at least one of them should be as good as CLUSTAL's own guide tree, but most of the time they differed (sometimes being better than CLUSTAL's default tree and sometimes being worse). The difference seems to be that the user-supplied tree is not given meaningful branch lengths, which affect the assumed probability of amino acid changes. They examined the practicality of modifying CLUSTALW to improve its treatment of user-supplied guide trees. This work became increasingly bogged down in finding and repairing minor bugs in the CLUSTALW code. This effort was put on hold as we feel that our other proposed approaches will ultimately be better.
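
    The counts quoted above (3 possible topologies relating four taxa, and 15 rooted guide trees) follow from the standard double-factorial formulas for labelled binary trees. The sketch below is only an arithmetic check of those two numbers; the function names are illustrative and are not taken from the report or from CLUSTALW.

```python
def double_factorial(k: int) -> int:
    """Return k!! = k * (k - 2) * (k - 4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_topologies(n_taxa: int) -> int:
    """Number of rooted binary trees on n labelled leaves: (2n - 3)!!, n >= 2."""
    return double_factorial(2 * n_taxa - 3)


def unrooted_topologies(n_taxa: int) -> int:
    """Number of unrooted binary trees on n labelled leaves: (2n - 5)!!, n >= 3."""
    return double_factorial(2 * n_taxa - 5)


print(rooted_topologies(4))    # 15 rooted guide trees for 4 taxa
print(unrooted_topologies(4))  # 3 unrooted topologies for 4 taxa
```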

  19. Making Personalized Health Care Even More Personalized: Insights From Activities of the IOM Genomics Roundtable.

    PubMed

    David, Sean P; Johnson, Samuel G; Berger, Adam C; Feero, W Gregory; Terry, Sharon F; Green, Larry A; Phillips, Robert L; Ginsburg, Geoffrey S

    2015-01-01

    Genomic research has generated much new knowledge into mechanisms of human disease, with the potential to catalyze novel drug discovery and development, prenatal and neonatal screening, clinical pharmacogenomics, more sensitive risk prediction, and enhanced diagnostics. Genomic medicine, however, has been limited by critical evidence gaps, especially those related to clinical utility and applicability to diverse populations. Genomic medicine may have the greatest impact on health care if it is integrated into primary care, where most health care is received and where evidence supports the value of personalized medicine grounded in continuous healing relationships. Redesigned primary care is the most relevant setting for clinically useful genomic medicine research. Taking insights gained from the activities of the Institute of Medicine (IOM) Roundtable on Translating Genomic-Based Research for Health, we apply lessons learned from the patient-centered medical home national experience to implement genomic medicine in a patient-centered, learning health care system.

  20. Discover hidden splicing variations by mapping personal transcriptomes to personal genomes

    PubMed Central

    Stein, Shayna; Lu, Zhi-xiang; Bahrami-Samani, Emad; Park, Juw Won; Xing, Yi

    2015-01-01

    RNA-seq has become a popular technology for studying genetic variation of pre-mRNA alternative splicing. Commonly used RNA-seq aligners rely on the consensus splice site dinucleotide motifs to map reads across splice junctions. Consequently, genomic variants that create novel splice site dinucleotides may produce splice junction RNA-seq reads that cannot be mapped to the reference genome. We developed and evaluated an approach to identify ‘hidden’ splicing variations in personal transcriptomes, by mapping personal RNA-seq data to personal genomes. Computational analysis and experimental validation indicate that this approach identifies personal specific splice junctions at a low false positive rate. Applying this approach to an RNA-seq data set of 75 individuals, we identified 506 personal specific splice junctions, among which 437 were novel splice junctions not documented in current human transcript annotations. 94 splice junctions had splice site SNPs associated with GWAS signals of human traits and diseases. These involve genes whose splicing variations have been implicated in diseases (such as OAS1), as well as novel associations between alternative splicing and diseases (such as ICA1). Collectively, our work demonstrates that the personal genome approach to RNA-seq read alignment enables the discovery of a large but previously unknown catalog of splicing variations in human populations. PMID:26578562
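
    The idea that a variant can create a splice site dinucleotide absent from the reference can be illustrated with a toy comparison of a reference sequence and a "personal" sequence carrying one SNP. This is a minimal sketch under stated assumptions, not the authors' pipeline: the sequences, the SNP, and the function names are hypothetical, and only the canonical donor dinucleotide GT on the forward strand is checked.

```python
def apply_snps(reference: str, snps: dict) -> str:
    """Return a personal sequence: the reference with SNPs applied.

    snps maps a 0-based position to the alternate base (hypothetical input format).
    """
    seq = list(reference)
    for pos, alt in snps.items():
        seq[pos] = alt
    return "".join(seq)


def novel_dinucleotide_sites(reference: str, personal: str, motif: str = "GT") -> list:
    """Positions where the personal sequence has the motif but the reference does not."""
    return [
        i
        for i in range(len(personal) - 1)
        if personal[i:i + 2] == motif and reference[i:i + 2] != motif
    ]


reference = "AAGCTAAGCC"                     # toy reference, contains no GT
personal = apply_snps(reference, {3: "G"})   # hypothetical C>G SNP at position 3
print(personal)                                         # AAGGTAAGCC
print(novel_dinucleotide_sites(reference, personal))    # [3]
```

    A read spanning a junction at such a site would carry a dinucleotide that an aligner restricted to reference consensus motifs would not expect, which is the situation the abstract describes.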

  1. Education and personalized genomics: deciphering the public's genetic health report

    PubMed Central

    Lamb, Neil E; Myers, Richard M; Gunter, Chris

    2010-01-01

    Where do members of the public turn to understand what genetic tests mean in terms of their own health? Now that genome-wide association studies and complete genome sequencing are widely available, the importance of education in personalized genomics cannot be overstated. Although some media have introduced the concept of genetic testing to better understand health and disease, the public's understanding of the scope and impact of genetic variation has not kept up with the pace of the science or technology. Unfortunately, the likely sources to which the public turns for guidance – their physician and the media – are often no better prepared. We examine several venues for information, including print and online guides for both lay and health-oriented audiences, and summarize selected resources in multiple formats. We also note the roadblocks to progress and discuss ways to remove them, as urgent action is needed to connect people with their genomes in a meaningful way. PMID:20161675

  2. hfAIM: A reliable bioinformatics approach for in silico genome-wide identification of autophagy-associated Atg8-interacting motifs in various organisms.

    PubMed

    Xie, Qingjun; Tzfadia, Oren; Levy, Matan; Weithorn, Efrat; Peled-Zehavi, Hadas; Van Parys, Thomas; Van de Peer, Yves; Galili, Gad

    2016-05-03

    Most of the proteins that are specifically turned over by selective autophagy are recognized by the presence of short Atg8 interacting motifs (AIMs) that facilitate their association with the autophagy apparatus. Such AIMs can be identified by bioinformatics methods based on their defined degenerate consensus F/W/Y-X-X-L/I/V sequences in which X represents any amino acid. Achieving reliability and/or fidelity of the prediction of such AIMs on a genome-wide scale represents a major challenge. Here, we present a bioinformatics approach, high fidelity AIM (hfAIM), which uses additional sequence requirements (the presence of acidic amino acids and the absence of positively charged amino acids in certain positions) to reliably identify AIMs in proteins. We demonstrate that the use of the hfAIM method allows for in silico high fidelity prediction of AIMs in AIM-containing proteins (ACPs) on a genome-wide scale in various organisms. Furthermore, by using hfAIM to identify putative AIMs in the Arabidopsis proteome, we illustrate a potential contribution of selective autophagy to various biological processes. More specifically, we identified 9 peroxisomal PEX proteins that contain hfAIM motifs, among which AtPEX1, AtPEX6 and AtPEX10 possess evolutionary-conserved AIMs. Bimolecular fluorescence complementation (BiFC) results verified that AtPEX6 and AtPEX10 indeed interact with Atg8 in planta. In addition, we show that mutations occurring within or near hfAIMs in PEX1, PEX6 and PEX10 caused defects in the growth and development of various organisms. Taken together, the above results suggest that the hfAIM tool can be used to effectively perform genome-wide in silico screens of proteins that are potentially regulated by selective autophagy. The hfAIM system is a web tool that can be accessed at http://bioinformatics.psb.ugent.be/hfAIM/.
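
    The degenerate core consensus F/W/Y-X-X-L/I/V mentioned above maps directly onto a regular expression. The sketch below scans a protein sequence for that core pattern only; it does not implement the additional hfAIM requirements (acidic and positively charged residues at certain positions), which the abstract does not specify in detail. The function name and the example sequence are hypothetical.

```python
import re

# Core AIM consensus: [F/W/Y]-X-X-[L/I/V]. The lookahead makes overlapping
# candidates visible, since a plain match would consume the four residues.
AIM_CORE = re.compile(r"(?=([FWY]..[LIV]))")


def find_core_aims(protein: str) -> list:
    """Return (1-based position, motif) for every core-consensus match."""
    return [(m.start() + 1, m.group(1)) for m in AIM_CORE.finditer(protein)]


# Hypothetical toy sequence, not a real protein.
print(find_core_aims("MSDWEELAAFVYLQ"))  # [(4, 'WEEL'), (10, 'FVYL')]
```

    Because the core pattern is only four residues long with two free positions, it matches very often by chance, which is consistent with the abstract's point that extra sequence requirements are needed for high-fidelity prediction.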

  3. Identification of conserved and polymorphic STRs for personal genomes

    PubMed Central

    2014-01-01

    Background: Short tandem repeats (STRs) are abundant in human genomes. Numerous STRs have been shown to be associated with genetic diseases and gene regulatory functions, and have been selected as genetic markers for evolutionary and forensic analyses. High-throughput next generation sequencers have fostered new cutting-edge computing techniques for genome-scale analyses, and cross-genome comparisons have facilitated the efficient identification of polymorphic STR markers for various applications. Results: An automated and efficient system for detecting human polymorphic STRs at the genome scale is proposed in this study. Assembled contigs from next generation sequencing data were aligned and calibrated according to selected reference sequences. To verify identified polymorphic STRs, human genomes from the 1000 Genomes Project were employed for comprehensive analyses, and STR markers from the Combined DNA Index System (CODIS) and disease-related STR motifs were also applied as cases for evaluation. In addition, we analyzed STR variations for highly conserved homologous genes and human-unique genes. In total 477 polymorphic STRs were identified from 492 human-unique genes, among which 26 STRs were retrieved and clustered into three different groups for efficient comparison. Conclusions: We have developed an online system that efficiently identifies polymorphic STRs and provides novel distinguishable STR biomarkers for different levels of specificity. Candidate polymorphic STRs within a personal genome could be easily retrieved and compared to the constructed STR profile through query keywords, gene names, or assembled contigs. PMID:25560225
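
    As a rough illustration of what detecting an STR in a sequence involves, the sketch below finds perfect tandem repeats with a backreference regular expression. It is a simplified stand-in, not the system described in the abstract: it ignores imperfect repeats, alignment to reference sequences, and cross-genome comparison. The function name, parameter defaults, and example sequence are assumptions made for illustration.

```python
import re


def find_strs(sequence: str, min_unit: int = 2, max_unit: int = 6, min_repeats: int = 3) -> list:
    """Find perfect tandem repeats.

    Returns (0-based start, repeat unit, copy number) for each run in which a
    unit of min_unit..max_unit bases occurs at least min_repeats times in a row.
    The lazy quantifier makes the regex try the shortest unit first.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(sequence)
    ]


# Hypothetical toy sequence with four copies of the tetranucleotide GATA.
print(find_strs("CCGATAGATAGATAGATATT"))  # [(2, 'GATA', 4)]
```

    Comparing the copy number reported at the same locus in two genomes is the simplest way to flag a candidate polymorphic STR, although a real system would also need to handle flanking-sequence alignment.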

  4. Harvard Personal Genome Project: lessons from participatory public research

    PubMed Central

    2014-01-01

    Background: Since its initiation in 2005, the Harvard Personal Genome Project has enrolled thousands of volunteers interested in publicly sharing their genome, health and trait data. Because these data are highly identifiable, we use an ‘open consent’ framework that purposefully excludes promises about privacy and requires participants to demonstrate comprehension prior to enrollment. Discussion: Our model of non-anonymous, public genomes has led us to a highly participatory model of researcher-participant communication and interaction. The participants, who are highly committed volunteers, self-pursue and donate research-relevant datasets, and are actively engaged in conversations with both our staff and other Personal Genome Project participants. We have quantitatively assessed these communications and donations, and report our experiences with returning research-grade whole genome data to participants. We also observe some of the community growth and discussion that has occurred related to our project. Summary: We find that public non-anonymous data is valuable and leads to a participatory research model, which we encourage others to consider. The implementation of this model is greatly facilitated by web-based tools and methods and participant education. Project results are long-term proactive participant involvement and the growth of a community that benefits both researchers and participants. PMID:24713084

  5. Defining personal utility in genomics: A Delphi study.

    PubMed

    Kohler, J; Turbitt, E; Lewis, K L; Wilfond, B S; Jamal, L; Peay, H L; Biesecker, L G; Biesecker, B B

    2017-02-20

    Individual genome sequencing results are valued by patients in ways distinct from clinical utility. Such outcomes have been described as components of "personal utility," a concept that broadly encompasses patient-endorsed benefits, that is operationally defined as non-clinical outcomes. No empirical delineation of these outcomes has been reported. To address this gap, we administered a Delphi survey to adult participants in a NIH clinical exome study to extract the most highly endorsed outcomes constituting personal utility. Forty research participants responded to a Delphi survey to rate 35 items identified by a systematic literature review of personal utility. Two rounds of ranking resulted in 24 items that represented 14 distinct elements of personal utility. Elements most highly endorsed by participants were: increased self-knowledge, knowledge of "the condition," altruism, and anticipated coping. Our findings represent the first systematic effort to delineate elements of personal utility that may be used to anticipate participant expectation and inform genetic counseling prior to sequencing. The 24 items reported need to be studied further in additional clinical genome sequencing studies to assess generalizability in other populations. Further research will help to understand motivations and to predict the meaning and use of results.

  6. Personalized Genomic Medicine and the Rhetoric of Empowerment

    PubMed Central

    Juengst, Eric T.; Flatt, Michael A.; Settersten, Richard A.

    2013-01-01

    Advocates of “personalized” genomic medicine maintain that it is revolutionary not just in what it can reveal to us, but in how it will enable us to take control of our health. But we should not assume that patient empowerment always yields positive outcomes. To assess the social impact of personalized medicine, we must anticipate how the virtue might go awry in practice. PMID:22976411

  7. Biology in 'silico': The Bioinformatics Revolution.

    ERIC Educational Resources Information Center

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  8. SpeedSeq: Ultra-fast personal genome analysis and interpretation

    PubMed Central

    Chiang, Colby; Layer, Ryan M.; Faust, Gregory G.; Lindberg, Michael R.; Rose, David B.; Garrison, Erik P.; Marth, Gabor T.; Quinlan, Aaron R.; Hall, Ira M.

    2015-01-01

    SpeedSeq is an open-source genome analysis platform that accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 hours on a low-cost server, alleviating a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement. SpeedSeq offers competitive or superior performance to current methods for detecting germline and somatic single nucleotide variants, indels, and structural variants, and includes novel functionality for streamlined interpretation. PMID:26258291

  9. Illuminating the Black Box of Genome Sequence Assembly: A Free Online Tool to Introduce Students to Bioinformatics

    ERIC Educational Resources Information Center

    Taylor, D. Leland; Campbell, A. Malcolm; Heyer, Laurie J.

    2013-01-01

    Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken into fragments and sequenced, producing millions of "reads." A computer algorithm pieces these reads together in the genome assembly process. PHAST is a set of online modules…

  10. iCAGES: integrated CAncer GEnome Score for comprehensively prioritizing driver genes in personal cancer genomes.

    PubMed

    Dong, Chengliang; Guo, Yunfei; Yang, Hui; He, Zeyu; Liu, Xiaoming; Wang, Kai

    2016-12-22

    Cancer results from the acquisition of somatic driver mutations. Several computational tools can predict driver genes from population-scale genomic data, but tools for analyzing personal cancer genomes are underdeveloped. Here we developed iCAGES, a novel statistical framework that infers driver variants by integrating contributions from coding, non-coding, and structural variants, identifies driver genes by combining genomic information and prior biological knowledge, then generates prioritized drug treatment. Analysis on The Cancer Genome Atlas (TCGA) data showed that iCAGES predicts whether patients respond to drug treatment (P = 0.006 by Fisher's exact test) and long-term survival (P = 0.003 from Cox regression). iCAGES is available at http://icages.wglab.org.

  11. MISIS-2: A bioinformatics tool for in-depth analysis of small RNAs and representation of consensus master genome in viral quasispecies.

    PubMed

    Seguin, Jonathan; Otten, Patricia; Baerlocher, Loïc; Farinelli, Laurent; Pooggin, Mikhail M

    2016-07-01

    In most eukaryotes, small RNA (sRNA) molecules such as miRNAs, siRNAs and piRNAs regulate gene expression and repress transposons and viruses. AGO/PIWI family proteins sort functional sRNAs based on size, 5'-nucleotide and other sequence features. In plants and some animals, viral sRNAs are extremely diverse and cover the entire viral genome sequences, which allows for de novo reconstruction of a complete viral genome by deep sequencing and bioinformatics analysis of viral sRNAs. Previously, we developed a tool, MISIS, to view and analyze sRNA maps of viruses and cellular genome regions which spawn multiple sRNAs. Here we describe a new release of MISIS, MISIS-2, which enables the determination and visualization of a consensus sequence and the counting of sRNAs of any chosen sizes and 5'-terminal nucleotide identities. Furthermore, we demonstrate the utility of MISIS-2 for identification of single nucleotide polymorphisms (SNPs) at each position of a reference sequence and reconstruction of a consensus master genome in evolving viral quasispecies. MISIS-2 is a Java standalone program. It is freely available along with the source code at the website http://www.fasteris.com/apps.
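
    The notion of a consensus sequence with per-position SNPs can be sketched as a majority vote over reads aligned to a reference. The example below is written in Python for illustration and is not MISIS-2 code (MISIS-2 is a Java program whose internals are not described here); it assumes gapless alignments given as (0-based start, read sequence) pairs, and all names and data are hypothetical.

```python
from collections import Counter


def consensus_and_variants(reference: str, aligned_reads: list):
    """Majority-vote consensus from gapless read alignments.

    aligned_reads is a list of (0-based start, read sequence) pairs.
    Positions without coverage fall back to the reference base.
    Returns (consensus string, list of (position, reference base, consensus base)).
    """
    counts = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                counts[pos][base] += 1

    consensus, variants = [], []
    for pos, (ref_base, column) in enumerate(zip(reference, counts)):
        if not column:
            consensus.append(ref_base)  # no coverage at this position
            continue
        base, _ = column.most_common(1)[0]
        consensus.append(base)
        if base != ref_base:
            variants.append((pos, ref_base, base))
    return "".join(consensus), variants


reference = "ACGTACGT"  # toy reference
reads = [(0, "ACGA"), (2, "GAAC"), (3, "AACGT"), (4, "ACG")]  # toy sRNA reads
print(consensus_and_variants(reference, reads))
# ('ACGAACGT', [(3, 'T', 'A')])
```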

  12. Bioinformatics Pipelines for Targeted Resequencing and Whole-Exome Sequencing of Human and Mouse Genomes: A Virtual Appliance Approach for Instant Deployment

    PubMed Central

    Saeed, Isaam; Wong, Stephen Q.; Mar, Victoria; Goode, David L.; Caramia, Franco; Doig, Ken; Ryland, Georgina L.; Thompson, Ella R.; Hunter, Sally M.; Halgamuge, Saman K.; Ellul, Jason; Dobrovic, Alexander; Campbell, Ian G.; Papenfuss, Anthony T.; McArthur, Grant A.; Tothill, Richard W.

    2014-01-01

    Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/. PMID:24752294

  13. Genome-Based Bioinformatic Selection of Chromosomal Bacillus anthracis Putative Vaccine Candidates Coupled with Proteomic Identification of Surface-Associated Antigens

    PubMed Central

    Ariel, N.; Zvi, A.; Makarova, K. S.; Chitlaru, T.; Elhanany, E.; Velan, B.; Cohen, S.; Friedlander, A. M.; Shafferman, A.

    2003-01-01

    Bacillus anthracis (Ames strain) chromosome-derived open reading frames (ORFs), predicted to code for surface exposed or virulence related proteins, were selected as B. anthracis-specific vaccine candidates by a multistep computational screen of the entire draft chromosome sequence (February 2001 version, 460 contigs, The Institute for Genomic Research, Rockville, Md.). The selection procedure combined preliminary annotation (sequence similarity searches and domain assignments), prediction of cellular localization, taxonomical and functional screen and additional filtering criteria (size, number of paralogs). The reductive strategy, combined with manual curation, resulted in selection of 240 candidate ORFs encoding proteins with putative known function, as well as 280 proteins of unknown function. Proteomic analysis of two-dimensional gels of a B. anthracis membrane fraction verified the expression of some gene products. Matrix-assisted laser desorption ionization-time-of-flight mass spectrometry analyses allowed identification of 38 spots cross-reacting with sera from B. anthracis immunized animals. These spots were found to represent eight in vivo immunogens, comprising EA1, Sap, and 6 proteins whose expression and immunogenicity were not reported before. Five of these 8 immunogens were preselected by the bioinformatic analysis (EA1, Sap, 2 novel SLH proteins and peroxiredoxin/AhpC) as vaccine candidates. This study demonstrates that a combination of the bioinformatic and proteomic strategies may be useful in promoting the development of next generation anthrax vaccine. PMID:12874336

  14. Omics technologies, data and bioinformatics principles.

    PubMed

    Schneider, Maria V; Orchard, Sandra

    2011-01-01

    We provide an overview of the state of the art for the Omics technologies, the types of omics data and the bioinformatics resources relevant and related to Omics. We also illustrate the bioinformatics challenges of dealing with high-throughput data. This overview touches several fundamental aspects of Omics and bioinformatics: data standardisation, data sharing, storing Omics data appropriately and exploring Omics data in bioinformatics. Though the principles and concepts presented are true for the various technological fields, we concentrate on three main Omics fields, namely genomics, transcriptomics and proteomics. Finally, we address the integration of Omics data, and provide several useful links for bioinformatics and Omics.

  15. Re-examining the Gene in Personalized Genomics

    NASA Astrophysics Data System (ADS)

    Bartol, Jordan

    2013-10-01

    Personalized genomics companies (PG; also called 'direct-to-consumer genetics') are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept presented to customers and the relation between the information given and the science behind PG. Two quite different gene concepts are present in company rhetoric, but only one features in the science. To explain this, we must appreciate the delicate tension between PG, academic science, public expectation, and market forces.

  16. Enabling personal genomics with an explicit test of epistasis.

    PubMed

    Greene, Casey S; Himmelstein, Daniel S; Nelson, Heather H; Kelsey, Karl T; Williams, Scott M; Andrew, Angeline S; Karagas, Margaret R; Moore, Jason H

    2010-01-01

    One goal of personal genomics is to use information about genomic variation to predict who is at risk for various common diseases. Technological advances in genotyping have spawned several personal genetic testing services that market genotyping services directly to the consumer. An important goal of consumer genetic testing is to provide health information along with the genotyping results. This has the potential to integrate detailed personal genetic and genomic information into healthcare decision making. Despite the potential importance of these advances, there are some important limitations. One concern is that much of the literature that is used to formulate personal genetics reports is based on genetic association studies that consider each genetic variant independently of the others. It is our working hypothesis that the true value of personal genomics will only be realized when the complexity of the genotype-to-phenotype mapping relationship is embraced, rather than ignored. We focus here on complexity in genetic architecture due to epistasis or nonlinear gene-gene interaction. We have previously developed a multifactor dimensionality reduction (MDR) algorithm and software package for detecting nonlinear interactions in genetic association studies. In most prior MDR analyses, the permutation testing strategy used to assess statistical significance was unable to differentiate MDR models that captured only interaction effects from those that also detected independent main effects. Statistical interpretation of MDR models required post-hoc analysis using entropy-based measures of interaction information. We introduce here a novel permutation test that allows the effects of nonlinear interactions between multiple genetic variants to be specifically tested in a manner that is not confounded by linear additive effects. We show using simulated nonlinear interactions that the power using the explicit test of epistasis is no different than a standard permutation
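
    The "standard permutation" test that the abstract uses as its baseline can be illustrated with a minimal sketch. This is not the authors' MDR software and not their explicit test of epistasis; the function names, the statistic and the toy data are all assumptions chosen for illustration. It shuffles case/control labels and reports how often the shuffled statistic reaches the observed one.

```python
import random


def permutation_p_value(genotypes, labels, statistic, n_perm=1000, seed=0):
    """Standard label-permutation test (illustrative; not the MDR package).

    Shuffles the case/control labels n_perm times and counts how often the
    statistic on shuffled labels is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = statistic(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def carrier_rate_difference(genotypes, labels):
    """Hypothetical statistic: |carrier rate in cases - carrier rate in controls|.

    genotypes holds 0/1 carrier flags; labels holds 1 for cases, 0 for controls.
    """
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))
```

    Because shuffling labels destroys main effects and interactions alike, a test of this kind cannot tell the two apart, which is the limitation the abstract describes.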

  17. Genomic resources for a commercial flatfish, the Senegalese sole (Solea senegalensis): EST sequencing, oligo microarray design, and development of the Soleamold bioinformatic platform

    PubMed Central

    Cerdà, Joan; Mercadé, Jaume; Lozano, Juan José; Manchado, Manuel; Tingaud-Sequeira, Angèle; Astola, Antonio; Infante, Carlos; Halm, Silke; Viñas, Jordi; Castellana, Barbara; Asensio, Esther; Cañavate, Pedro; Martínez-Rodríguez, Gonzalo; Piferrer, Francesc; Planas, Josep V; Prat, Francesc; Yúfera, Manuel; Durany, Olga; Subirada, Francesc; Rosell, Elisabet; Maes, Tamara

    2008-01-01

    Background The Senegalese sole, Solea senegalensis, is a highly prized flatfish of growing commercial interest for aquaculture in Southern Europe. However, despite the industrial production of Senegalese sole being hampered primarily by lack of information on the physiological mechanisms involved in reproduction, growth and immunity, very limited genomic information is available on this species. Results Sequencing of a S. senegalensis multi-tissue normalized cDNA library, from adult tissues (brain, stomach, intestine, liver, ovary, and testis), larval stages (pre-metamorphosis, metamorphosis), juvenile stages (post-metamorphosis, abnormal fish), and undifferentiated gonads, generated 10,185 expressed sequence tags (ESTs). Clones were sequenced from the 3'-end to identify isoform specific sequences. Assembly of the entire EST collection into contigs gave 5,208 unique sequences of which 1,769 (34%) had matches in GenBank, thus showing a low level of redundancy. The sequence of the 5,208 unigenes was used to design and validate an oligonucleotide microarray representing 5,087 unique Senegalese sole transcripts. Finally, a novel interactive bioinformatic platform, Soleamold, was developed for the Senegalese sole EST collection as well as microarray and ISH data. Conclusion New genomic resources have been developed for S. senegalensis, an economically important fish in aquaculture, which include a collection of expressed genes, an oligonucleotide microarray, and a publicly available bioinformatic platform that can be used to study gene expression in this species. These resources will help elucidate transcriptional regulation in wild and captive Senegalese sole for optimization of its production under intensive culture conditions. PMID:18973667

  18. Genomics and Bioinformatics in Undergraduate Curricula: Contexts for Hybrid Laboratory/Lecture Courses for Entering and Advanced Science Students

    ERIC Educational Resources Information Center

    Temple, Louise; Cresawn, Steven G.; Monroe, Jonathan D.

    2010-01-01

    Emerging interest in genomics in the scientific community prompted biologists at James Madison University to create two courses at different levels to modernize the biology curriculum. The courses are hybrids of classroom and laboratory experiences. An upper level class uses raw sequence of a genome (plasmid or virus) as the subject on which to…

  19. Playing with heart and soul…and genomes: sports implications and applications of personal genomics

    PubMed Central

    2013-01-01

    Whether the integration of genetic/omic technologies in sports contexts will facilitate player success, promote player safety, or spur genetic discrimination depends largely upon the game rules established by those currently designing genomic sports medicine programs. The integration has already begun, but there is not yet a playbook for best practices. Thus far discussions have focused largely on whether the integration would occur and how to prevent the integration from occurring, rather than how it could occur in such a way that maximizes benefits, minimizes risks, and avoids the exacerbation of racial disparities. Previous empirical research has identified members of the personal genomics industry offering sports-related DNA tests, and previous legal research has explored the impact of collective bargaining in professional sports as it relates to the employment protections of the Genetic Information Nondiscrimination Act (GINA). Building upon that research and upon participant observations with specific sports-related DNA tests purchased from four direct-to-consumer companies in 2011 and broader personal genomics (PGx) services, this anthropological, legal, and ethical (ALE) discussion highlights fundamental issues that must be addressed by those developing personal genomic sports medicine programs, either independently or through collaborations with commercial providers. For example, the vulnerability of student-athletes creates a number of issues that require careful, deliberate consideration. More broadly, however, this ALE discussion highlights potential sports-related implications (that ultimately might mitigate or, conversely, exacerbate racial disparities among athletes) of whole exome/genome sequencing conducted by biomedical researchers and clinicians for non-sports purposes. For example, the possibility that exome/genome sequencing of individuals who are considered to be non-patients, asymptomatic, normal, etc. will reveal the presence of variants of

  20. Bioinformatic analysis reveals genome size reduction and the emergence of tyrosine phosphorylation site in the movement protein of New World bipartite begomoviruses.

    PubMed

    Ho, Eric S; Kuchie, Joan; Duffy, Siobain

    2014-01-01

    Begomovirus (genus Begomovirus, family Geminiviridae) infection is devastating to a wide variety of agricultural crops including tomato, squash, and cassava. Thus, understanding the replication and adaptation of begomoviruses has important translational value in alleviating substantial economic loss, particularly in developing countries. The bipartite genome of begomoviruses prevalent in the New World and their counterparts in the Old World share a high degree of genome homology except for a partially overlapping reading frame encoding the pre-coat protein (PCP, or AV2). PCP contributes to the essential functions of intercellular movement and suppression of host RNA silencing, but it is only present in the Old World viruses. In this study, we analyzed a set of non-redundant bipartite begomovirus genomes originating from the Old World (N = 28) and the New World (N = 65). Our bioinformatic analysis suggests ∼ 120 nucleotides were deleted from PCP's proximal promoter region that may have contributed to its loss in the New World viruses. Consequently, genomes of the New World viruses are smaller than the Old World counterparts, possibly compensating for the loss of the intercellular movement functions of PCP. Additionally, we detected substantial purifying selection on a portion of the New World DNA-B movement protein (MP, or BC1). Further analysis of the New World MP gene revealed the emergence of a putative tyrosine phosphorylation site, which likely explains the increased purifying selection in that region. These findings provide important information about the strategies adopted by bipartite begomoviruses in adapting to new environment and suggest future in planta experiments.

  1. Bioinformatic Analysis Reveals Genome Size Reduction and the Emergence of Tyrosine Phosphorylation Site in the Movement Protein of New World Bipartite Begomoviruses

    PubMed Central

    Ho, Eric S.; Kuchie, Joan; Duffy, Siobain

    2014-01-01

    Begomovirus (genus Begomovirus, family Geminiviridae) infection is devastating to a wide variety of agricultural crops including tomato, squash, and cassava. Thus, understanding the replication and adaptation of begomoviruses has important translational value in alleviating substantial economic loss, particularly in developing countries. The bipartite genome of begomoviruses prevalent in the New World and their counterparts in the Old World share a high degree of genome homology except for a partially overlapping reading frame encoding the pre-coat protein (PCP, or AV2). PCP contributes to the essential functions of intercellular movement and suppression of host RNA silencing, but it is only present in the Old World viruses. In this study, we analyzed a set of non-redundant bipartite begomovirus genomes originating from the Old World (N = 28) and the New World (N = 65). Our bioinformatic analysis suggests ∼120 nucleotides were deleted from PCP’s proximal promoter region that may have contributed to its loss in the New World viruses. Consequently, genomes of the New World viruses are smaller than the Old World counterparts, possibly compensating for the loss of the intercellular movement functions of PCP. Additionally, we detected substantial purifying selection on a portion of the New World DNA-B movement protein (MP, or BC1). Further analysis of the New World MP gene revealed the emergence of a putative tyrosine phosphorylation site, which likely explains the increased purifying selection in that region. These findings provide important information about the strategies adopted by bipartite begomoviruses in adapting to new environment and suggest future in planta experiments. PMID:25383632

  2. From Wet‐Lab to Variations: Concordance and Speed of Bioinformatics Pipelines for Whole Genome and Whole Exome Sequencing

    PubMed Central

    Laurie, Steve; Fernandez‐Callejo, Marcos; Marco‐Sola, Santiago; Trotta, Jean‐Remi; Camps, Jordi; Chacón, Alejandro; Espinosa, Antonio; Gut, Marta; Gut, Ivo; Heath, Simon

    2016-01-01

    ABSTRACT As whole genome sequencing becomes cheaper and faster, it will progressively substitute targeted next‐generation sequencing as standard practice in research and diagnostics. However, computing cost–performance ratio is not advancing at an equivalent rate. Therefore, it is essential to evaluate the robustness of the variant detection process taking into account the computing resources required. We have benchmarked six combinations of state‐of‐the‐art read aligners (BWA‐MEM and GEM3) and variant callers (FreeBayes, GATK HaplotypeCaller, SAMtools) on whole genome and whole exome sequencing data from the NA12878 human sample. Results have been compared between them and against the NIST Genome in a Bottle (GIAB) variants reference dataset. We report differences in speed of up to 20 times in some steps of the process and have observed that SNV, and to a lesser extent InDel, detection is highly consistent in 70% of the genome. SNV, and especially InDel, detection is less reliable in 20% of the genome, and almost unfeasible in the remaining 10%. These findings will aid in choosing the appropriate tools bearing in mind objectives, workload, and computing infrastructure available. PMID:27604516
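
    The comparison of call sets against a reference dataset described above can be sketched as set arithmetic on variant keys. This is a simplified illustration, not the pipeline benchmarked in the paper; the function names and the (chromosome, position, ref, alt) key are assumptions, and a real comparison would first normalise variant representation, which matters most for InDels.

```python
def variant_key(chrom, pos, ref, alt):
    """Build a hashable key for one variant call (hypothetical representation)."""
    return (str(chrom), int(pos), ref.upper(), alt.upper())


def concordance(calls, truth):
    """Compare a call set with a truth set such as a GIAB-style reference."""
    calls, truth = set(calls), set(truth)
    true_positives = len(calls & truth)
    return {
        "true_positives": true_positives,
        "false_positives": len(calls - truth),
        "false_negatives": len(truth - calls),
        "precision": true_positives / len(calls) if calls else 0.0,
        "recall": true_positives / len(truth) if truth else 0.0,
    }


def jaccard(calls_a, calls_b):
    """Overlap between two pipelines' call sets: |A and B| / |A or B|."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

    Precision and recall measure agreement with the reference, while the Jaccard index measures agreement between two pipelines when no reference is involved.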

  3. From Wet-Lab to Variations: Concordance and Speed of Bioinformatics Pipelines for Whole Genome and Whole Exome Sequencing.

    PubMed

    Laurie, Steve; Fernandez-Callejo, Marcos; Marco-Sola, Santiago; Trotta, Jean-Remi; Camps, Jordi; Chacón, Alejandro; Espinosa, Antonio; Gut, Marta; Gut, Ivo; Heath, Simon; Beltran, Sergi

    2016-12-01

    As whole genome sequencing becomes cheaper and faster, it will progressively substitute targeted next-generation sequencing as standard practice in research and diagnostics. However, computing cost-performance ratio is not advancing at an equivalent rate. Therefore, it is essential to evaluate the robustness of the variant detection process taking into account the computing resources required. We have benchmarked six combinations of state-of-the-art read aligners (BWA-MEM and GEM3) and variant callers (FreeBayes, GATK HaplotypeCaller, SAMtools) on whole genome and whole exome sequencing data from the NA12878 human sample. Results have been compared between them and against the NIST Genome in a Bottle (GIAB) variants reference dataset. We report differences in speed of up to 20 times in some steps of the process and have observed that SNV, and to a lesser extent InDel, detection is highly consistent in 70% of the genome. SNV, and especially InDel, detection is less reliable in 20% of the genome, and almost unfeasible in the remaining 10%. These findings will aid in choosing the appropriate tools bearing in mind objectives, workload, and computing infrastructure available.

  4. Bioinformatic Analyses of Unique (Orphan) Core Genes of the Genus Acidithiobacillus: Functional Inferences and Use As Molecular Probes for Genomic and Metagenomic/Transcriptomic Interrogation

    PubMed Central

    González, Carolina; Lazcano, Marcelo; Valdés, Jorge; Holmes, David S.

    2016-01-01

    Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD). PMID:28082953

  5. Genome-wide characterization of Nuclear Factor Y (NF-Y) gene family of sorghum [Sorghum bicolor (L.) Moench]: a bioinformatics approach.

    PubMed

    Malviya, Neha; Jaiswal, Parul; Yadav, Dinesh

    2016-01-01

    Nuclear factor Y (NF-Y) is a heterotrimeric transcription factor (TF) complex with preferential binding to CCAAT elements of promoters, regulating gene expression in most of the higher eukaryotes. The availability of plant genome sequences has revealed multiple genes coding for each of the three subunits, namely NF-YA, NF-YB and NF-YC, in contrast to the single NF-Y gene per subunit reported in yeast and animals. A total of 33 NF-Y TFs, comprising 8 NF-YA, 11 NF-YB and 14 NF-YC subunits, were accessed from the sorghum genome. The present study attempts a bioinformatic characterization of the sorghum NF-Y gene family covering gene structure, chromosome location, protein motifs, phylogeny, gene duplication and in-silico expression under abiotic stresses. The identified SbNF-Y genes are distributed across all 10 chromosomes of sorghum with variable frequency, and 18 of the 33 SbNF-Ys were found to be intronless. Segmental duplication was found to be the predominant feature in the gene duplication pattern study. Several ortholog and paralog groups were revealed by a comprehensive phylogenetic analysis of SbNF-Y proteins together with 36 Arabidopsis and 28 rice NF-Y proteins. In-silico expression analysis under abiotic stresses using rice transcriptome data revealed several of the sorghum NF-Y genes to be associated with salt, drought, cold and heat stresses.

  6. Web-based Gene Pathogenicity Analysis (WGPA): a web platform to interpret gene pathogenicity from personal genome data

    PubMed Central

    Diaz-Montana, Juan J.; Rackham, Owen J.L.; Diaz-Diaz, Norberto; Petretto, Enrico

    2016-01-01

    Summary: As the volume of patient-specific genome sequences increases, the focus of biomedical research is switching from the detection of disease mutations to their interpretation. To this end, a number of techniques have been developed that use mutation data collected within a population to predict whether individual genes are likely to be disease-causing or not. As both sequence data and associated analysis tools proliferate, it becomes increasingly difficult for the community to make sense of these data and their implications. Moreover, no single analysis tool is likely to capture all relevant genomic features that contribute to a gene’s pathogenicity. Here, we introduce Web-based Gene Pathogenicity Analysis (WGPA), a web-based tool to analyze genes impacted by mutations and rank them through the integration of existing prioritization tools, which assess different aspects of gene pathogenicity using population-level sequence data. Additionally, to explore the polygenic contribution of mutations to disease, WGPA implements gene set enrichment analysis to prioritize disease-causing genes and gene interaction networks, thereby providing a comprehensive annotation of personal genome data in disease. Availability and implementation: wgpa.systems-genetics.net Contact: enrico.petretto@duke-nus.edu.sg Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26490503

  7. Whole Genome Sequencing and a New Bioinformatics Platform Allow for Rapid Gene Identification in D. melanogaster EMS Screens

    PubMed Central

    Gonzalez, Michael A.; Van Booven, Derek; Hulme, William; Ulloa, Rick H.; Lebrigio, Rafael F. Acosta; Osterloh, Jeannette; Logan, Mary; Freeman, Marc; Zuchner, Stephan

    2012-01-01

    Forward genetic screens in Drosophila melanogaster using ethyl methanesulfonate (EMS) mutagenesis are a powerful approach for identifying genes that modulate specific biological processes in an in vivo setting. The mapping of genes that contain randomly-induced point mutations has become more efficient in Drosophila thanks to the maturation and availability of many types of genetic tools. However, classic approaches to gene mapping are relatively slow and ultimately require extensive Sanger sequencing of candidate chromosomal loci. With the advent of new high-throughput sequencing techniques, it is increasingly efficient to directly re-sequence the whole genome of model organisms. This approach, in combination with traditional chromosomal mapping, has the potential to greatly simplify and accelerate mutation identification in mutants generated in EMS screens. Here we show that next-generation sequencing (NGS) is an accurate and efficient tool for high-throughput sequencing and mutation discovery in Drosophila melanogaster. As a test case, mutant strains of Drosophila that exhibited long-term survival of severed peripheral axons were identified in a forward EMS mutagenesis. All mutants were recessive and fell into a single lethal complementation group, which suggested that a single gene was responsible for the protective axon degenerative phenotype. Whole genome sequencing of these genomes identified the underlying gene ect4. To improve the process of genome wide mutation identification, we developed Genomes Management Application (GEM.app, https://genomics.med.miami.edu), a graphical online user interface to a custom query framework. Using a custom GEM.app query, we were able to identify that each mutant carried a unique non-sense mutation in the gene ect4 (dSarm), which was recently shown by Osterloh et al. to be essential for the activation of axonal degeneration. Our results demonstrate the current advantages and limitations of NGS in Drosophila and we introduce

  8. Hands-on Workshops as An Effective Means of Learning Advanced Technologies Including Genomics, Proteomics and Bioinformatics

    PubMed Central

    Reisdorph, Nichole; Stearman, Robert; Kechris, Katerina; Phang, Tzu Lip; Reisdorph, Richard; Prenni, Jessica; Erle, David J.; Coldren, Christopher; Schey, Kevin; Nesvizhskii, Alexey; Geraci, Mark

    2013-01-01

    Genomics and proteomics have emerged as key technologies in biomedical research, resulting in a surge of interest in training by investigators keen to incorporate these technologies into their research. At least two types of training can be envisioned in order to produce meaningful results, quality publications and successful grant applications: (1) immediate short-term training workshops and (2) long-term graduate education or visiting scientist programs. We aimed to fill the former need by providing a comprehensive hands-on training course in genomics, proteomics and informatics in a coherent, experimentally-based framework. This was accomplished through a National Heart, Lung, and Blood Institute (NHLBI)-sponsored 10-day Genomics and Proteomics Hands-on Workshop held at National Jewish Health (NJH) and the University of Colorado School of Medicine (UCD). The course content included comprehensive lectures and laboratories in mass spectrometry and genomics technologies, extensive hands-on experience with instrumentation and software, video demonstrations, optional workshops, online sessions, invited keynote speakers, and local and national guest faculty. Here we describe the detailed curriculum and present the results of short- and long-term evaluations from course attendees. Our educational program consistently received positive reviews from participants and had a substantial impact on grant writing and review, manuscript submissions and publications. PMID:24316330

  9. Hands-on workshops as an effective means of learning advanced technologies including genomics, proteomics and bioinformatics.

    PubMed

    Reisdorph, Nichole; Stearman, Robert; Kechris, Katerina; Phang, Tzu Lip; Reisdorph, Richard; Prenni, Jessica; Erle, David J; Coldren, Christopher; Schey, Kevin; Nesvizhskii, Alexey; Geraci, Mark

    2013-12-01

    Genomics and proteomics have emerged as key technologies in biomedical research, resulting in a surge of interest in training by investigators keen to incorporate these technologies into their research. At least two types of training can be envisioned in order to produce meaningful results, quality publications and successful grant applications: (1) immediate short-term training workshops and (2) long-term graduate education or visiting scientist programs. We aimed to fill the former need by providing a comprehensive hands-on training course in genomics, proteomics and informatics in a coherent, experimentally-based framework. This was accomplished through a National Heart, Lung, and Blood Institute (NHLBI)-sponsored 10-day Genomics and Proteomics Hands-on Workshop held at National Jewish Health (NJH) and the University of Colorado School of Medicine (UCD). The course content included comprehensive lectures and laboratories in mass spectrometry and genomics technologies, extensive hands-on experience with instrumentation and software, video demonstrations, optional workshops, online sessions, invited keynote speakers, and local and national guest faculty. Here we describe the detailed curriculum and present the results of short- and long-term evaluations from course attendees. Our educational program consistently received positive reviews from participants and had a substantial impact on grant writing and review, manuscript submissions and publications.

  10. Fuzzy Logic in Medicine and Bioinformatics

    PubMed Central

    Torres, Angela; Nieto, Juan J.

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes). PMID:16883057
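
    The geometrical view mentioned above, a fuzzy set over n elements as a point in the unit hypercube [0, 1]^n, can be sketched in a few lines. The function names are illustrative assumptions, and the entropy shown is Kosko's ratio form (distance to the nearest crisp vertex over distance to the farthest), used here as one common choice rather than as the paper's own construction.

```python
def as_fuzzy_point(memberships):
    """Validate membership degrees and return them as a point in [0, 1]^n."""
    point = [float(m) for m in memberships]
    if not point:
        raise ValueError("a fuzzy set needs at least one element")
    if any(m < 0.0 or m > 1.0 for m in point):
        raise ValueError("membership degrees must lie in [0, 1]")
    return point


def hamming_distance(a, b):
    """Normalised l1 distance between two fuzzy sets over the same universe."""
    a, b = as_fuzzy_point(a), as_fuzzy_point(b)
    if len(a) != len(b):
        raise ValueError("fuzzy sets must share the same universe")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def fuzzy_entropy(a):
    """0 at the cube's vertices (crisp sets), 1 at its midpoint (maximal fuzziness)."""
    a = as_fuzzy_point(a)
    to_nearest_vertex = sum(min(m, 1.0 - m) for m in a)
    to_farthest_vertex = sum(max(m, 1.0 - m) for m in a)
    return to_nearest_vertex / to_farthest_vertex
```

    Under this view, comparing two objects (two genomes, for instance) reduces to encoding each as a membership vector and measuring the distance between the two points; how a genome is encoded is left open here.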

  11. Rapid Development of Bioinformatics Education in China

    ERIC Educational Resources Information Center

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related…

  12. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genomes of most prokaryotes give rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, including the characterization of their non-coding components, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  13. Bioinformatics of prokaryotic RNAs.

    PubMed

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genomes of most prokaryotes give rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, including the characterization of their non-coding components, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes.

  14. Genome-wide association study of antisocial personality disorder

    PubMed Central

    Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

    2016-01-01

    The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53–3.14), P=1.9 × 10−5). Two polymorphisms at 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37–1.85), P=1.6 × 10−9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder. PMID:27598967
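
    The odds ratios above are reported with intervals in parentheses. As a reading aid, the sketch below shows how an odds ratio and a 95% Wald confidence interval follow from a 2×2 table of counts. It is not the study's analysis: the abstract gives no such counts, so the function name and any numbers passed to it are hypothetical.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    All four counts must be positive; z=1.96 gives an approximate 95% interval.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

    An interval that excludes 1, as both reported intervals do, indicates that the association is unlikely to be explained by sampling error alone at the chosen confidence level.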

  15. Crowdsourcing for bioinformatics

    PubMed Central

    Good, Benjamin M.; Su, Andrew I.

    2013-01-01

    Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614

  16. Relationship between microRNA genes incidence and cancer-associated genomic regions in canine tumors: a comprehensive bioinformatics study.

    PubMed

    Zamani-Ahmadmahmudi, Mohamad

    2016-03-01

    The role of microRNAs (miRNAs) in human cancer biology has been confirmed on a genome-wide scale through the high incidence of these genes in cancer-associated regions. We analyzed the association between canine miRNA genes and cancer-associated regions (deleted and amplified regions) using previously published array comparative genomic hybridization data on 268 canine cancer samples, comprising osteosarcoma, breast cancer, leukemia, and colorectal cancer. We also assessed this relationship with respect to the incidence of miRNA genes in the CpG islands of the canine genome assembly. The association was evaluated using mixed-effects Poisson regression analysis. Our analyses revealed that 135 miRNA genes were located exactly within the aberrant regions: 77 (57%) in deleted and 58 (43%) in amplified regions. Our findings indicated that the miRNA genes were located more frequently in the deleted regions and in the CpG islands than in all other regions. Additionally, with the exception of leukemia, the amplified regions contained significantly higher numbers of miRNA genes than did all the other regions.

  17. The Characteristics of Rare Codon Clusters in the Genome and Proteins of Hepatitis C Virus; a Bioinformatics Look

    PubMed Central

    Fattahi, Mohammadreza; Malekpour, Abdorrasoul; Mortazavi, Mojtaba; Safarpour, Alireza; Naseri, Nasrin

    2014-01-01

    BACKGROUND Recent studies suggest that rare codon clusters are functionally important for protein activity. METHODS Here, for the first time, we analyzed and reported rare codon clusters in the Hepatitis C Virus (HCV) genome and then identified the location of these rare codon clusters in the structure of HCV proteins. This analysis was performed using the Sherlocc program, which detects statistically relevant conserved rare codon clusters. RESULTS Using this program, we identified rare codon clusters in three regions of the HCV genome: the NS2, NS3, and NS5A coding sequences. To further understand the role of these rare codon clusters, we studied their location and that of critical residues in the structure of the NS2, NS3 and NS5A proteins. We identified some critical residues near or within rare codon clusters. Characteristics of these critical residues, such as the location and orientation of their side chains, are important for ensuring the HCV life cycle. CONCLUSION The characteristics of these residues and their relative status showed that these rare codon clusters play an important role in proper folding of these proteins. Thus, it is likely that these rare codon clusters have an important role in the function of HCV proteins. This information is helpful in the development of new avenues for vaccine and treatment protocols. PMID:25349685

  18. Genome-wide identification and evolutionary analysis of algal LPAT genes involved in TAG biosynthesis using bioinformatic approaches.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar

    2014-12-01

    Lysophosphatidyl acyltransferase (LPAT) is one of the major triacylglycerol synthesis enzymes, controlling the metabolic flow of lysophosphatidic acid to phosphatidic acid. Experimental studies in Arabidopsis have shown that LPAT activity is exhibited primarily by three distinct isoforms, namely the plastid-located LPAT1, the endoplasmic reticulum-located LPAT2, and the soluble isoform of LPAT (solLPAT). In this study, 24 putative genes representing all LPAT isoforms were identified from the analysis of 11 complete genomes including green algae, red algae, diatoms and higher plants. We observed LPAT1 and solLPAT genes to be ubiquitously present in nearly all genomes examined, whereas LPAT2 genes to have evolved more recently in the plant lineage. Phylogenetic analysis indicated that LPAT1, LPAT2 and solLPAT have convergently evolved through separate evolutionary paths and belong to three different gene families, which was further evidenced by their wide divergence at gene structure and sequence level. The genome distribution supports the hypothesis that each gene encoding a LPAT is not duplicated. Mapping of exon-intron structure of LPAT genes to the domain structure of proteins across different algal and plant species indicates that exon shuffling plays no role in the evolution of LPAT genes. Besides the previously defined motifs, several conserved consensus sequences were discovered which could be useful to distinguish different LPAT isoforms. Taken together, this study will enable the generation of experimental approximations to better understand the functional role of algal LPAT in lipid accumulation.

  19. A novel pathway for the biosynthesis of heme in Archaea: genome-based bioinformatic predictions and experimental evidence.

    PubMed

    Storbeck, Sonja; Rolfes, Sarah; Raux-Deery, Evelyne; Warren, Martin J; Jahn, Dieter; Layer, Gunhild

    2010-12-13

    Heme is an essential prosthetic group for many proteins involved in fundamental biological processes in all three domains of life. In Eukaryota and Bacteria heme is formed via a conserved and well-studied biosynthetic pathway. Surprisingly, in Archaea heme biosynthesis proceeds via an alternative route which is poorly understood. In order to formulate a working hypothesis for this novel pathway, we searched 59 completely sequenced archaeal genomes for the presence of gene clusters consisting of established heme biosynthetic genes and colocalized conserved candidate genes. Within the majority of archaeal genomes it was possible to identify such heme biosynthesis gene clusters. From this analysis we have been able to identify several novel heme biosynthesis genes that are restricted to archaea. Intriguingly, several of the encoded proteins display similarity to enzymes involved in heme d(1) biosynthesis. To initiate an experimental verification of our proposals two Methanosarcina barkeri proteins predicted to catalyze the initial steps of archaeal heme biosynthesis were recombinantly produced, purified, and their predicted enzymatic functions verified.

  20. A bioinformatics approach reveals seven nearly-complete RNA-virus genomes in bivalve RNA-seq data.

    PubMed

    Rosani, Umberto; Gerdol, Marco

    2016-10-18

    Viral metagenomics (viromics) can contribute greatly to expanding knowledge of viruses and their relationships with their hosts. Viromic studies on marine organisms are still at a very early stage, and little effort has been spent to date on identifying viruses associated with marine invertebrates, leaving the complexity of marine viromes associated with bivalve hosts almost completely unexplored. However, the potential use of viromic approaches in the management of viral diseases affecting aquacultured species has recently been evidenced by the flourishing of studies on the Ostreid herpesvirus type-1, which has been associated with bivalve mortality events. Herein we discuss an effective pipeline to retrieve and reconstruct nearly complete and previously unreported viral genomes from existing host RNA-seq data. As a case study, we report the identification of seven RNA-virus genomes within the frame of a highly diversified viral community that characterizes both Crassostrea gigas and Mytilus galloprovincialis samples collected from the lagoon of Goro (Italy).

  1. Genomic insights into ayurvedic and western approaches to personalized medicine.

    PubMed

    Prasher, Bhavana; Gibson, Greg; Mukerji, Mitali

    2016-03-01

    Ayurveda, an ancient Indian system of medicine documented and practised since 1500 B.C., follows a systems approach that has interesting parallels with contemporary personalized genomic medicine approaches to the understanding and management of health and disease. It is based on the trisutra, which are the three aspects of causes, features and therapeutics that are interconnected through a common organizing principle termed 'tridosha'. Tridosha comprise three ascertainable physiological entities; vata (kinetic), pitta (metabolic) and kapha (potential) that are pervasive across systems, work in conjunction with each other, respond to the external environment and maintain homeostasis. Each individual is born with a specific proportion of tridosha that are not only genetically determined but also influenced by the environment during foetal development. Jointly they determine a person's basic constitution, which is termed their 'prakriti'. Development and progression of different diseases with their subtypes are thought to depend on the origin and mechanism of perturbation of the doshas, and the aim of therapeutic practice is to ensure that the doshas retain their homeostatic state. Similarly, western systems biology epitomized by translational P4 medicine envisages the integration of multiscalar genetic, cellular, physiological and environmental networks to predict phenotypic outcomes of perturbations. In this perspective article, we aim to outline the shape of a unifying scaffold that may allow the two intellectual traditions to enhance one another. Specifically, we illustrate how a unique integrative 'Ayurgenomics' approach can be used to integrate the trisutra concept of Ayurveda with genomics. We observe biochemical and molecular correlates of prakriti and show how these differ significantly in processes that are linked to intermediate patho-phenotypes, known to take different course in diseases. We also observe a significant enrichment of the highly connected

  2. Bioinformatics Analysis of the Complete Genome Sequence of the Mango Tree Pathogen Pseudomonas syringae pv. syringae UMAF0158 Reveals Traits Relevant to Virulence and Epiphytic Lifestyle

    PubMed Central

    Arrebola, Eva; Carrión, Víctor J.; Gutiérrez-Barranquero, José Antonio; Pérez-García, Alejandro; Ramos, Cayo; Cazorla, Francisco M.; de Vicente, Antonio

    2015-01-01

    The genomes of more than 100 Pseudomonas syringae strains have been sequenced to date; however, only a few of them have been fully assembled, including P. syringae pv. syringae B728a. Different strains of pv. syringae cause different diseases and have different host specificities; thus, UMAF0158 is a P. syringae pv. syringae strain related to B728a, but instead of being a bean pathogen it causes apical necrosis of mango trees, and the two strains belong to different phylotypes of pv. syringae and clades of P. syringae. In this study we report the complete sequence and annotation of the P. syringae pv. syringae UMAF0158 chromosome and plasmid pPSS158. A comparative analysis with the available sequenced genomes of 25 other P. syringae strains, both closed (the reference genomes DC3000, 1448A and B728a) and draft genomes, was performed. The 5.8 Mb UMAF0158 chromosome has 59.3% GC content and comprises 5017 predicted protein-coding genes. Bioinformatics analysis revealed the presence of genes potentially implicated in the virulence and epiphytic fitness of this strain. We identified several genetic features, which are absent in B728a, that may explain the ability of UMAF0158 to colonize and infect mango trees: the mangotoxin biosynthetic operon mbo, a gene cluster for cellulose production, two different type III and two type VI secretion systems, and a particular T3SS effector repertoire. A mutant strain defective in the rhizobial-like T3SS Rhc showed no differences compared to wild-type during its interaction with host and non-host plants and worms. Here we report the first complete sequence of the chromosome of a pv. syringae strain pathogenic to a woody plant host. Our data also shed light on the genetic factors that possibly determine the pathogenic and epiphytic lifestyle of UMAF0158. This work provides the basis for further analysis on specific mechanisms that enable this strain to infect woody plants and for the functional analysis of host specificity in the P

  3. Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

    PubMed Central

    2012-01-01

    Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of the identified putative conserved clade-common and clade-specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic-specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures, and the expression profiling were analysed to predict the

  4. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Xiang, Fang; Ningqiu, Li; Xiaozhe, Fu; Kaibin, Li; Qiang, Lin; Lihui, Liu; Cunbin, Shi; Shuqin, Wu

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the need for high-performance computers rather than common personal computers to construct a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.

  5. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2016-03-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  6. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    PubMed

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations.

  7. Genomic expression differences between cutaneous cells from red hair color individuals and black hair color individuals based on bioinformatic analysis.

    PubMed

    Puig-Butille, Joan Anton; Gimenez-Xavier, Pol; Visconti, Alessia; Nsengimana, Jérémie; Garcia-García, Francisco; Tell-Marti, Gemma; Escamez, Maria José; Newton-Bishop, Julia; Bataille, Veronique; Del Río, Marcela; Dopazo, Joaquín; Falchi, Mario; Puig, Susana

    2016-12-24

    The MC1R gene plays a crucial role in pigmentation synthesis. Loss-of-function MC1R variants, which impair protein function, are associated with red hair color (RHC) phenotype and increased skin cancer risk. Cultured cutaneous cells bearing loss-of-function MC1R variants show a distinct gene expression profile compared to wild-type MC1R cultured cutaneous cells. We analysed the gene signature associated with RHC co-cultured melanocytes and keratinocytes by Protein-Protein interaction (PPI) network analysis to identify genes related to non-functional MC1R variants. From two detected networks, we selected 23 nodes as hub genes based on topological parameters. Differential expression of hub genes was then evaluated in healthy skin biopsies from RHC and black hair color (BHC) individuals. We also compared gene expression in melanoma tumors from individuals with RHC versus BHC. Gene expression in normal skin from RHC cutaneous cells showed dysregulation in 8 out of 23 hub genes (CLN3, ATG10, WIPI2, SNX2, GABARAPL2, YWHA, PCNA and GBAS). Hub genes did not differ between melanoma tumors in RHC versus BHC individuals. The study suggests that healthy skin cells from RHC individuals present a constitutive genomic deregulation associated with the red hair phenotype and identifies novel genes involved in melanocyte biology.

  8. Generations of interdisciplinarity in bioinformatics.

    PubMed

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L

    2016-04-02

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life science, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this "borderland." As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature.

  9. Generations of interdisciplinarity in bioinformatics

    PubMed Central

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life science, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  10. Prescription medication changes following direct-to-consumer personal genomic testing: Findings from the Impact of Personal Genomics (PGen) Study

    PubMed Central

    Carere, Deanna Alexis; VanderWeele, Tyler; Vassy, Jason L.; van der Wouden, Cathelijne; Roberts, J. Scott; Kraft, Peter; Green, Robert C.

    2016-01-01

    Purpose To measure the frequency of prescription medication changes following direct-to-consumer personal genomic testing (DTC-PGT) and their association with the pharmacogenomic results received. Methods New DTC-PGT customers were enrolled in 2012 and completed surveys prior to return of results and 6 months post-results; DTC-PGT results were linked to survey data. ‘Atypical response’ pharmacogenomic results were defined as those indicating an increase or decrease in risk of an adverse drug event or likelihood of therapeutic benefit. At follow-up, participants reported prescription medication changes and health care provider consultation. Results Follow-up data were available from 961 participants, of which 54 (5.6%) reported changing a medication they were taking, or starting a new medication, due to their DTC-PGT results. Of these, 45 (83.3%) reported consulting with a health care provider regarding the change. Pharmacogenomic results were available for 961 participants, of which 875 (91.2%) received ≥1 atypical response result. For each such result received, the odds of reporting a prescription medication change increased 1.57 times (95% confidence interval = 1.17, 2.11). Conclusion Receipt of pharmacogenomic results indicating atypical drug response is common with DTC-PGT, and associated with prescription medication changes; however, fewer than 1% of consumers report unsupervised changes at 6 months post-testing. PMID:27657683
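    The percentages in this abstract, and the "fewer than 1%" figure in its conclusion, follow from the reported counts. The sketch below reproduces that arithmetic under two assumptions that the abstract does not state: that participants who changed a medication without consulting a provider are the "unsupervised" changes, and that the per-result odds ratio compounds multiplicatively, as it would in a logistic model. The variable and function names are hypothetical.

```python
# Counts quoted in the abstract.
followed_up = 961  # participants with follow-up data
changed = 54       # reported a medication change due to DTC-PGT results
consulted = 45     # of those, consulted a health care provider

# Assumption: a change made without consultation counts as "unsupervised".
unsupervised = changed - consulted

pct_changed = 100 * changed / followed_up
pct_consulted = 100 * consulted / changed
pct_unsupervised = 100 * unsupervised / followed_up


def odds_multiplier(n_results, per_result_or=1.57):
    """Odds multiplier after n atypical-response results.

    Assumption: the per-result odds ratio compounds multiplicatively.
    """
    return per_result_or ** n_results


print(f"changed medication:   {pct_changed:.1f}% of {followed_up}")
print(f"consulted a provider: {pct_consulted:.1f}% of {changed}")
print(f"unsupervised changes: {pct_unsupervised:.2f}% of {followed_up}")
print(f"odds multiplier for 2 results: {odds_multiplier(2):.2f}")
```

    The nine participants who changed a medication without consulting a provider are under 1% of the 961 followed up, which matches the conclusion.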

  11. CGAT: a model for immersive personalized training in computational genomics.

    PubMed

    Sims, David; Ponting, Chris P; Heger, Andreas

    2016-01-01

    How should the next generation of genomics scientists be trained while simultaneously pursuing high quality and diverse research? CGAT, the Computational Genomics Analysis and Training programme, was set up in 2010 by the UK Medical Research Council to complement its investment in next-generation sequencing capacity. CGAT was conceived around the twin goals of training future leaders in genome biology and medicine, and providing much needed capacity to UK science for analysing genome scale data sets. Here we outline the training programme employed by CGAT and describe how it dovetails with collaborative research projects to launch scientists on the road towards independent research careers in genomics.

  12. Taking Bioinformatics to Systems Medicine.

    PubMed

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  13. Personal Genome Sequencing in Ostensibly Healthy Individuals and the PeopleSeq Consortium

    PubMed Central

    Linderman, Michael D.; Nielsen, Daiva E.; Green, Robert C.

    2016-01-01

    Thousands of ostensibly healthy individuals have had their exome or genome sequenced, but a much smaller number of these individuals have received any personal genomic results from that sequencing. We term those projects in which ostensibly healthy participants can receive sequencing-derived genetic findings and may also have access to their genomic data as participatory predispositional personal genome sequencing (PPGS). Here we are focused on genome sequencing applied in a pre-symptomatic context and so define PPGS to exclude diagnostic genome sequencing intended to identify the molecular cause of suspected or diagnosed genetic disease. In this report we describe the design of completed and underway PPGS projects, briefly summarize the results reported to date and introduce the PeopleSeq Consortium, a newly formed collaboration of PPGS projects designed to collect much-needed longitudinal outcome data. PMID:27023617

  14. Bioinformatics meets parasitology.

    PubMed

    Cantacessi, C; Campbell, B E; Jex, A R; Young, N D; Hall, R S; Ranganathan, S; Gasser, R B

    2012-05-01

    The advent and integration of high-throughput '-omics' technologies (e.g. genomics, transcriptomics, proteomics, metabolomics, glycomics and lipidomics) are revolutionizing the way biology is done, allowing the systems biology of organisms to be explored. These technologies are now providing unique opportunities for global, molecular investigations of parasites. For example, studies of a transcriptome (all transcripts in an organism, tissue or cell) have become instrumental in providing insights into aspects of gene expression, regulation and function in a parasite, which is a major step to understanding its biology. The purpose of this article was to review recent applications of next-generation sequencing technologies and bioinformatic tools to large-scale investigations of the transcriptomes of parasitic nematodes of socio-economic significance (particularly key species of the order Strongylida) and to indicate the prospects and implications of these explorations for developing novel methods of parasite intervention.

  15. Personalized health care in 2013: a status report on the impact of genomics.

    PubMed

    Snyderman, Ralph

    2013-01-01

    This issue of the NCMJ describes the impact that genomics has had on the practice of medicine in the decade since the full sequencing of the human genome was completed in 2003. Specifically, it reports on how genomics is affecting health care delivery, describes the concept of personalized health care, and discusses the role that genomics plays in such care. The commentaries and sidebars that follow highlight the opportunities and challenges of bringing genomics into clinical practice. Reading these articles will hopefully give clinicians and others a better understanding of the benefits and limitations of genomic technologies. Emerging capabilities, resulting in part from genomic research, are providing an opportunity to move health care from a reactive, disease-focused model to one that is personalized, predictive, proactive, precise, and patient-centered. Genomics and related technologies have already changed many approaches to care, particularly in the field of oncology, and I believe they will help to transform our overall approach to the delivery of health care. With the rapidly accumulating capabilities being developed and the focus on patient-centered and personalized care, I expect that the practice of medicine will become proactive and personalized within the next decade.

  16. Informing the Design of Direct-to-Consumer Interactive Personal Genomics Reports

    PubMed Central

    Shaer, Orit; Okerlund, Johanna; Balestra, Martina; Stowell, Elizabeth; Ascher, Laura; Bi, Joanna; Schlenker, Claire; Ball, Madeleine

    2015-01-01

    Background In recent years, people who sought direct-to-consumer genetic testing services have been increasingly confronted with an unprecedented amount of personal genomic information, which influences their decisions, emotional state, and well-being. However, these users of direct-to-consumer genetic services, who vary in their education and interests, frequently have little relevant experience or tools for understanding, reasoning about, and interacting with their personal genomic data. Online interactive techniques can play a central role in making personal genomic data useful for these users. Objective We sought to (1) identify the needs of diverse users as they make sense of their personal genomic data, (2) consequently develop effective interactive visualizations of genomic trait data to address these users’ needs, and (3) evaluate the effectiveness of the developed visualizations in facilitating comprehension. Methods The first two user studies, conducted with 63 volunteers in the Personal Genome Project and with 36 personal genomic users who participated in a design workshop, respectively, employed surveys and interviews to identify the needs and expectations of diverse users. Building on the two initial studies, the third study was conducted with 730 Amazon Mechanical Turk users and employed a controlled experimental design to examine the effectiveness of different design interventions on user comprehension. Results The first two studies identified searching, comparing, sharing, and organizing data as fundamental to users’ understanding of personal genomic data. The third study demonstrated that interactive and visual design interventions could improve the understandability of personal genomic reports for consumers. In particular, results showed that a new interactive bubble chart visualization designed for the study resulted in the highest comprehension scores, as well as the highest perceived comprehension scores. These scores were significantly

  17. Motivations and Perceptions of Early Adopters of Personalized Genomics: Perspectives from Research Participants

    PubMed Central

    Gollust, S.E.; Gordon, E.S.; Zayac, C.; Griffin, G.; Christman, M.F.; Pyeritz, R.E.; Wawak, L.; Bernhardt, B.A.

    2011-01-01

    Background/Aims: To predict the potential public health impact of personal genomics, empirical research on public perceptions of these services is needed. In this study, ‘early adopters’ of personal genomics were surveyed to assess their motivations, perceptions and intentions. Methods: Participants were recruited from everyone who registered to attend an enrollment event for the Coriell Personalized Medicine Collaborative, a United States-based (Camden, N.J.) research study of the utility of personalized medicine, between March 31, 2009 and April 1, 2010 (n = 369). Participants completed an Internet-based survey about their motivations, awareness of personalized medicine, perceptions of study risks and benefits, and intentions to share results with health care providers. Results: Respondents were motivated to participate for their own curiosity and to find out their disease risk to improve their health. Fewer than 10% expressed deterministic perspectives about genetic risk, but 32% had misperceptions about the research study or personal genomic testing. Most respondents perceived the study to have health-related benefits. Nearly all (92%) intended to share their results with physicians, primarily to request specific medical recommendations. Conclusion: Early adopters of personal genomics are prospectively enthusiastic about using genomic profiling information to improve their health, in close consultation with their physicians. This suggests that early users (i.e. through direct-to-consumer companies or research) may follow up with the health care system. Further research should address whether intentions to seek care match actual behaviors. PMID:21654153

  18. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  19. What can whole genome expression data tell us about the ecology and evolution of personality?

    PubMed

    Bell, Alison M; Aubin-Horth, Nadia

    2010-12-27

    Consistent individual differences in behaviour, aka personality, pose several evolutionary questions. For example, it is difficult to explain within-individual consistency in behaviour because behavioural plasticity is often advantageous. In addition, selection erodes heritable behavioural variation that is related to fitness, therefore we wish to know the mechanisms that can maintain between-individual variation in behaviour. In this paper, we argue that whole genome expression data can reveal new insights into the proximate mechanisms underlying personality, as well as its evolutionary consequences. After introducing the basics of whole genome expression analysis, we show how whole genome expression data can be used to understand whether behaviours in different contexts are affected by the same molecular mechanisms. We suggest strategies for using the power of genomics to understand what maintains behavioural variation, to study the evolution of behavioural correlations and to compare personality traits across diverse organisms.

  20. Ethical considerations of research policy for personal genome analysis: the approach of the Genome Science Project in Japan.

    PubMed

    Minari, Jusaku; Shirai, Tetsuya; Kato, Kazuto

    2014-12-01

    As evidenced by high-throughput sequencers, genomic technologies have recently undergone radical advances. These technologies enable comprehensive sequencing of personal genomes considerably more efficiently and less expensively than heretofore. These developments present a challenge to the conventional framework of biomedical ethics; under these changing circumstances, each research project has to develop a pragmatic research policy. Based on the experience with a new large-scale project-the Genome Science Project-this article presents a novel approach to conducting a specific policy for personal genome research in the Japanese context. In creating an original informed-consent form template for the project, we present a two-tiered process: making the draft of the template following an analysis of national and international policies; refining the draft template in conjunction with genome project researchers for practical application. Through practical use of the template, we have gained valuable experience in addressing challenges in the ethical review process, such as the importance of sharing details of the latest developments in genomics with members of research ethics committees. We discuss certain limitations of the conventional concept of informed consent and its governance system and suggest the potential of an alternative process using information technology.

  1. An analytical framework for optimizing variant discovery from personal genomes

    PubMed Central

    Highnam, Gareth; Wang, Jason J.; Kusler, Dean; Zook, Justin; Vijayan, Vinaya; Leibovich, Nir; Mittelman, David

    2015-01-01

    The standardization and performance testing of analysis tools is a prerequisite to widespread adoption of genome-wide sequencing, particularly in the clinic. However, performance testing is currently complicated by the paucity of standards and comparison metrics, as well as by the heterogeneity in sequencing platforms, applications and protocols. Here we present the genome comparison and analytic testing (GCAT) platform to facilitate development of performance metrics and comparisons of analysis tools across these metrics. Performance is reported through interactive visualizations of benchmark and performance testing data, with support for data slicing and filtering. The platform is freely accessible at http://www.bioplanet.com/gcat. PMID:25711446

  2. Simultaneous Whole Mitochondrial Genome Sequencing with Short Overlapping Amplicons Suitable for Degraded DNA Using the Ion Torrent Personal Genome Machine

    PubMed Central

    Chaitanya, Lakshmi; Ralf, Arwin; van Oven, Mannis; Kupiec, Tomasz; Chang, Joseph; Lagacé, Robert

    2015-01-01

    Whole mitochondrial (mt) genome analysis enables a considerable increase in analysis throughput, and improves the discriminatory power to the maximum possible phylogenetic resolution. Most established protocols on the different massively parallel sequencing (MPS) platforms, however, invariably involve the PCR amplification of large fragments, typically several kilobases in size, which may fail due to mtDNA fragmentation in the available degraded materials. We introduce a MPS tiling approach for simultaneous whole human mt genome sequencing using 161 short overlapping amplicons (average 200 bp) with the Ion Torrent Personal Genome Machine. We illustrate the performance of this new method by sequencing 20 DNA samples belonging to different worldwide mtDNA haplogroups. Additional quality control, particularly regarding the potential detection of nuclear insertions of mtDNA (NUMTs), was performed by comparative MPS analysis using the conventional long-range amplification method. Preliminary sensitivity testing revealed that detailed haplogroup inference was feasible with 100 pg genomic input DNA. Complete mt genome coverage was achieved from DNA samples experimentally degraded down to genomic fragment sizes of about 220 bp, and up to 90% coverage from naturally degraded samples. Overall, we introduce a new approach for whole mt genome MPS analysis from degraded and nondegraded materials relevant to resolve and infer maternal genetic ancestry at complete resolution in anthropological, evolutionary, medical, and forensic applications. PMID:26387877

  3. Simultaneous Whole Mitochondrial Genome Sequencing with Short Overlapping Amplicons Suitable for Degraded DNA Using the Ion Torrent Personal Genome Machine.

    PubMed

    Chaitanya, Lakshmi; Ralf, Arwin; van Oven, Mannis; Kupiec, Tomasz; Chang, Joseph; Lagacé, Robert; Kayser, Manfred

    2015-12-01

    Whole mitochondrial (mt) genome analysis enables a considerable increase in analysis throughput, and improves the discriminatory power to the maximum possible phylogenetic resolution. Most established protocols on the different massively parallel sequencing (MPS) platforms, however, invariably involve the PCR amplification of large fragments, typically several kilobases in size, which may fail due to mtDNA fragmentation in the available degraded materials. We introduce a MPS tiling approach for simultaneous whole human mt genome sequencing using 161 short overlapping amplicons (average 200 bp) with the Ion Torrent Personal Genome Machine. We illustrate the performance of this new method by sequencing 20 DNA samples belonging to different worldwide mtDNA haplogroups. Additional quality control, particularly regarding the potential detection of nuclear insertions of mtDNA (NUMTs), was performed by comparative MPS analysis using the conventional long-range amplification method. Preliminary sensitivity testing revealed that detailed haplogroup inference was feasible with 100 pg genomic input DNA. Complete mt genome coverage was achieved from DNA samples experimentally degraded down to genomic fragment sizes of about 220 bp, and up to 90% coverage from naturally degraded samples. Overall, we introduce a new approach for whole mt genome MPS analysis from degraded and nondegraded materials relevant to resolve and infer maternal genetic ancestry at complete resolution in anthropological, evolutionary, medical, and forensic applications.

  4. Bioinformatics and Moonlighting Proteins

    PubMed Central

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein–protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (e.g., with algorithms such as PISITE), and (e) mutation correlation analysis between amino acids with algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations – it requires the existence of multialigned family protein sequences – but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  5. Genome Science and Personalized Cancer Treatment (LBNL Summer Lecture Series)

    SciTech Connect

    Gray, Joe

    2009-08-04

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  6. Genome Science and Personalized Cancer Treatment (LBNL Summer Lecture Series)

    ScienceCinema

    Gray, Joe

    2016-07-12

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  7. Computational biology and bioinformatics in Nigeria.

    PubMed

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. The importance of bioinformatics is rapidly gaining acceptance in this developing country, and bioinformatics groups composed of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  8. Population Genetic Inference from Personal Genome Data: Impact of Ancestry and Admixture on Human Genomic Variation

    PubMed Central

    Kidd, Jeffrey M.; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D.; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F.; Peckham, Heather E.; Omberg, Larsson; Bormann Chung, Christina A.; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G.; Russell, Archie; Reynolds, Andy; Clark, Andrew G.; Reese, Martin G.; Lincoln, Stephen E.; Butte, Atul J.; De La Vega, Francisco M.; Bustamante, Carlos D.

    2012-01-01

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas—70% of the European ancestry in today’s African Americans dates back to European gene flow happening only 7–8 generations ago. PMID:23040495
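
    The 1.75-fold range quoted above follows directly from the two per-kilobase rates given in the abstract. A minimal sketch of that arithmetic; the function names and the idea of normalizing by "callable bases" are illustrative assumptions, not the authors' pipeline.

```python
def het_per_kb(het_sites: int, callable_bases: int) -> float:
    """Heterozygous sites per 1,000 callable bases (hypothetical helper)."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return 1000.0 * het_sites / callable_bases


def fold_range(values) -> float:
    """Ratio of the largest to the smallest value."""
    values = list(values)
    return max(values) / min(values)


# Rates quoted in the abstract (heterozygous sites per kilobase).
west_african = 1.0
native_american_segments = 0.57

ratio = fold_range([west_african, native_american_segments])
print(f"{ratio:.2f}")  # 1 / 0.57 rounds to 1.75
```

    The same helper applies to raw counts: 570 heterozygous sites in 1,000,000 callable bases gives 0.57 per kilobase.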

  9. Personalized Medicine in a New Genomic Era: Ethical and Legal Aspects.

    PubMed

    Shoaib, Maria; Rameez, Mansoor Ali Merchant; Hussain, Syed Ather; Madadin, Mohammed; Menezes, Ritesh G

    2016-11-28

    The genomes of two completely unrelated individuals are quite similar apart from minor variations called single nucleotide polymorphisms, which contribute to the uniqueness of each person. These single nucleotide polymorphisms are of great interest clinically as they are useful in figuring out the susceptibility of certain individuals to particular diseases and for recognizing varied responses to pharmacological interventions. This gives rise to the idea of 'personalized medicine' as an exciting new therapeutic science in this genomic era. Personalized medicine suggests a unique treatment strategy based on an individual's genetic make-up. Its key principles revolve around applied pharmaco-genomics, pharmaco-kinetics and pharmaco-proteomics. Herein, the ethical and legal aspects of personalized medicine in a new genomic era are briefly addressed. The ultimate goal is to comprehensively recognize all relevant forms of genetic variation in each individual and be able to interpret this information in a clinically meaningful manner within the ambit of ethical and legal considerations. The authors of this article firmly believe that personalized medicine has the potential to revolutionize the current landscape of medicine as it makes its way into clinical practice.

  10. Knowledge and attitudes to personal genomics testing for complex diseases among Nigerians

    PubMed Central

    2014-01-01

    Background: The study examined the knowledge of and attitudes to personal genomics testing for complex diseases among Nigerians and identified how the knowledge and attitudes vary with gender, age, religion, education and related factors. Methods: Data were collected using qualitative methods in 2 districts of the Federal Capital Territory. Eight (8) Focus Group Discussions (FGDs) and twenty-seven (27) Key Informant Interviews (KIIs) were conducted. Participants were recruited from among healthy Nigerians, individuals with complex diseases, health care professionals, community leaders and health policy makers. Results: Most respondents in both FGDs and KIIs initially had limited knowledge about genomic testing. Their understanding of the test, however, improved after its concept was explained. Participants showed a positive attitude towards genomic tests. Nevertheless, they expressed fear over direct-to-consumer personal genomics testing, testing unborn babies and disclosure of results to third parties. Culture and religion were found to influence the perspectives of respondents on genomic testing, particularly those aspects that could either directly contradict their beliefs and practices or lead to actions which contradict them. Conclusion: Most Nigerians interviewed had limited knowledge of genomic testing but, after understanding the test concept, a supportive attitude towards its use in predicting future risk of complex diseases. Genomic testing for complex diseases was not a common practice in Nigeria. PMID:24766930

  11. openSNP–A Crowdsourced Web Resource for Personal Genomics

    PubMed Central

    Greshake, Bastian; Bayer, Philipp E.; Rausch, Helge; Reda, Julia

    2014-01-01

    Genome-Wide Association Studies are widely used to correlate phenotypic traits with genetic variants. These studies usually compare the genetic variation between two groups to single out certain Single Nucleotide Polymorphisms (SNPs) that are linked to a phenotypic variation in one of the groups. However, it is necessary to have a large enough sample size to find statistically significant correlations. Direct-To-Consumer (DTC) genetic testing can supply additional data: DTC-companies offer the analysis of a large amount of SNPs for an individual at low cost without the need to consult a physician or geneticist. Over 100,000 people have already been genotyped through Direct-To-Consumer genetic testing companies. However, this data is not public for a variety of reasons and thus cannot be used in research. It seems reasonable to create a central open data repository for such data. Here we present the web platform openSNP, an open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide area of studies, including Genome-Wide Association Studies. openSNP is hosted at http://www.opensnp.org, and the code is released under MIT-license at http://github.com/gedankenstuecke/snpr. PMID:24647222
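
    The two-group comparison described above can be illustrated with a test on a 2x2 table of allele counts. A minimal sketch, assuming made-up counts and a Pearson chi-square test with one degree of freedom; neither the counts nor the choice of test comes from the openSNP paper, and real association studies add quality control, covariates, and multiple-testing correction.

```python
import math


def allele_chi_square(case_ref, case_alt, ctrl_ref, ctrl_alt):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts.

    Returns (chi2, p_value). For one degree of freedom the survival
    function of the chi-square distribution is erfc(sqrt(chi2 / 2)).
    """
    a, b, c, d = case_ref, case_alt, ctrl_ref, ctrl_alt
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for one SNP (not real data).
chi2, p_value = allele_chi_square(60, 40, 40, 60)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

    With only 100 alleles per group, even this fairly large frequency difference yields a modest p-value, which illustrates why the abstract stresses sample size.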

  12. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders

    PubMed Central

    Lo, Min-Tzu; Hinds, David A.; Tung, Joyce Y.; Franz, Carol; Fan, Chun-Chieh; Wang, Yunpeng; Smeland, Olav B.; Schork, Andrew; Holland, Dominic; Kauppi, Karolina; Sanyal, Nilotpal; Escott-Price, Valentina; Smith, Daniel J.; O'Donovan, Michael; Stefansson, Hreinn; Bjornsdottir, Gyda; Thorgeirsson, Thorgeir E.; Stefansson, Kari; McEvoy, Linda K.; Dale, Anders M.; Andreassen, Ole A.; Chen, Chi-Hua

    2017-01-01

    Summary: Personality is influenced by genetic and environmental factors (1), and associated with mental health. However, the underlying genetic determinants are largely unknown. We identified six genetic loci, including five novel loci (2,3), significantly associated with personality traits in a meta-analysis of genome-wide association studies (N=123,132–260,861). Of these genome-wide significant loci, extraversion was associated with variants in WSCD2 and near PCDH15, and neuroticism with variants on chromosome 8p23.1 and in L3MBTL2. We performed a principal component analysis to extract major dimensions underlying genetic variations among five personality traits and six psychiatric disorders (N=5,422–18,759). The first genetic dimension separated personality traits and psychiatric disorders, except that neuroticism and openness to experience were clustered with the disorders. High genetic correlations were found between extraversion and attention-deficit/hyperactivity disorder (ADHD), and between openness and schizophrenia/bipolar disorder. The second genetic dimension was closely aligned with extraversion-introversion and grouped neuroticism with internalizing psychopathology (e.g., depression/anxiety). PMID:27918536

  13. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    PubMed Central

    Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  14. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes

    PubMed Central

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-01

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and a combination of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. Moreover, we included in the browser previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included functionality to query the overlap of any breakpoint profile with regions of the user's interest. Users can download breakpoint profiles from the database or may submit their data to be integrated into BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics community and is a step towards personalized cancer genomics. PMID:26586806
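
    The overlap query mentioned in the abstract above amounts to asking which breakpoints fall inside a genomic region. A minimal sketch of one common way to do this, using sorted positions and binary search; the data layout, function names, and coordinates are hypothetical and are not taken from BreCAN-DB itself.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def index_breakpoints(breakpoints):
    """Group (chromosome, position) pairs by chromosome and sort positions."""
    by_chrom = defaultdict(list)
    for chrom, pos in breakpoints:
        by_chrom[chrom].append(pos)
    return {chrom: sorted(positions) for chrom, positions in by_chrom.items()}


def breakpoints_in_region(index, chrom, start, end):
    """Return sorted positions on `chrom` with start <= pos <= end."""
    positions = index.get(chrom, [])
    lo = bisect_left(positions, start)   # first position >= start
    hi = bisect_right(positions, end)    # one past the last position <= end
    return positions[lo:hi]


# Made-up breakpoints for illustration only.
breakpoints = [("chr1", 1500), ("chr1", 250), ("chr2", 900), ("chr1", 4000)]
index = index_breakpoints(breakpoints)
hits = breakpoints_in_region(index, "chr1", 200, 2000)
print(hits)
```

    Sorting once and searching with `bisect` keeps each query logarithmic in the number of breakpoints per chromosome, which matters when many regions are queried against one profile.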

  15. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes.

    PubMed

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-04

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and a combination of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. Moreover, we included in the browser previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included functionality to query the overlap of any breakpoint profile with regions of the user's interest. Users can download breakpoint profiles from the database or may submit their data to be integrated into BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics community and is a step towards personalized cancer genomics.

  16. Bioinformatic identification of plant peptides.

    PubMed

    Lease, Kevin A; Walker, John C

    2010-01-01

    Plant peptides play a number of important roles in defence, development and many other aspects of plant physiology. Identifying additional peptide sequences provides the starting point to investigate their function using molecular, genetic or biochemical techniques. Due to their small size, identifying peptide sequences may not succeed using the default bioinformatic approaches that work well for average-sized proteins. There are two general scenarios related to bioinformatic identification of peptides to be discussed in this paper. In the first scenario, one already has the sequence of a plant peptide and is trying to find more plant peptides with some sequence similarity to the starting peptide. To do this, the Basic Local Alignment Search Tool (BLAST) is employed, with the parameters adjusted to be more favourable for identifying potential peptide matches. A second scenario involves trying to identify plant peptides without using sequence similarity searches to known plant peptides. In this approach, features such as protein size and the presence of a cleavable amino-terminal signal peptide are used to screen annotated proteins. A variation of this method can be used to screen for unannotated peptides from genomic sequences. Bioinformatic resources related to Arabidopsis thaliana will be used to illustrate these approaches.
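
    The second scenario above, screening annotated proteins by size and by the presence of an amino-terminal signal peptide, can be sketched as a simple filter. Everything specific here is an assumption for illustration: the 150-residue cutoff, the toy sequences, and especially the hydrophobic-run heuristic, which is only a crude stand-in for a dedicated signal-peptide predictor and is not the method used in the paper.

```python
HYDROPHOBIC = set("AVLIMFWC")


def has_hydrophobic_n_terminus(seq, window=30, min_run=8):
    """Crude placeholder for a signal-peptide predictor (assumption).

    Returns True if the first `window` residues contain a run of at
    least `min_run` consecutive hydrophobic residues.
    """
    run = 0
    for residue in seq[:window]:
        run = run + 1 if residue in HYDROPHOBIC else 0
        if run >= min_run:
            return True
    return False


def screen_candidates(proteins, max_length=150,
                      predictor=has_hydrophobic_n_terminus):
    """Keep names of proteins that are short and pass the predictor."""
    return [name for name, seq in proteins.items()
            if len(seq) <= max_length and predictor(seq)]


# Toy sequences, not real plant proteins.
proteins = {
    "short_secreted": "MKT" + "LLLVVAALLA" + "QSTRG" * 5,
    "short_cytosolic": "MSTEKDGRQ" * 5,
    "long_secreted": "MK" + "LLLLVVVVAA" + "G" * 300,
}
candidates = screen_candidates(proteins)
print(candidates)
```

    Passing the predictor in as a parameter leaves room to swap the placeholder for the output of a real prediction tool without changing the size filter.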

  17. A Platform for Designing Genome-Based Personalized Immunotherapy or Vaccine against Cancer

    PubMed Central

    Gupta, Sudheer; Chaudhary, Kumardeep; Dhanda, Sandeep Kumar; Kumar, Rahul; Kumar, Shailesh; Sehgal, Manika; Nagpal, Gandharva

    2016-01-01

    Owing to advances in sequencing technology, genomes of thousands of cancer tissues or cell lines have been sequenced. Identification of cancer-specific epitopes or neoepitopes from cancer genomes is one of the major challenges in the field of immunotherapy or vaccine development. This paper describes a platform, Cancertope, developed for designing genome-based immunotherapy or vaccines against a cancer cell. Broadly, the integrated resources on this platform are divided into three sections. The first section presents a cancer-specific database of neoepitopes generated from the genomes of 905 cancer cell lines. This database harbors a wide range of epitopes (e.g., B-cell, CD8+ T-cell, HLA class I, HLA class II) against 60 cancer-specific vaccine antigens. The second section describes a partially personalized module developed for predicting potential neoepitopes against a user-specific cancer genome. Finally, we describe a fully personalized module developed for identification of neoepitopes from genomes of cancerous and healthy cells of a cancer patient. In order to assist the scientific community, a wide range of tools is incorporated in this platform, including screening of epitopes against the human reference proteome (http://www.imtech.res.in/raghava/cancertope/). PMID:27832200

  18. Bioinformatics in the information age

    SciTech Connect

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  19. Integrating functional genomics to accelerate mechanistic personalized medicine

    PubMed Central

    Tyner, Jeffrey W.

    2017-01-01

    The advent of deep sequencing technologies has resulted in the deciphering of tremendous amounts of genetic information. These data have led to major discoveries, and many anecdotes now exist of individual patients whose clinical outcomes have benefited from novel, genetically guided therapeutic strategies. However, the majority of genetic events in cancer are currently undrugged, leading to a biological gap between understanding of tumor genetic etiology and translation to improved clinical approaches. Functional screening has made tremendous strides in recent years with the development of new experimental approaches to studying ex vivo and in vivo drug sensitivity. Numerous discoveries and anecdotes also exist for translation of functional screening into novel clinical strategies; however, the current clinical application of functional screening remains largely confined to small clinical trials at specific academic centers. The intersection between genomic and functional approaches represents an ideal modality to accelerate our understanding of drug sensitivities as they relate to specific genetic events and further understand the full mechanisms underlying drug sensitivity patterns. PMID:28299357

  20. Integrating functional genomics to accelerate mechanistic personalized medicine.

    PubMed

    Tyner, Jeffrey W

    2017-03-01

    The advent of deep sequencing technologies has resulted in the deciphering of tremendous amounts of genetic information. These data have led to major discoveries, and many anecdotes now exist of individual patients whose clinical outcomes have benefited from novel, genetically guided therapeutic strategies. However, the majority of genetic events in cancer are currently undrugged, leading to a biological gap between understanding of tumor genetic etiology and translation to improved clinical approaches. Functional screening has made tremendous strides in recent years with the development of new experimental approaches to studying ex vivo and in vivo drug sensitivity. Numerous discoveries and anecdotes also exist for translation of functional screening into novel clinical strategies; however, the current clinical application of functional screening remains largely confined to small clinical trials at specific academic centers. The intersection between genomic and functional approaches represents an ideal modality to accelerate our understanding of drug sensitivities as they relate to specific genetic events and further understand the full mechanisms underlying drug sensitivity patterns.

  1. Genomics and epigenomics: new promises of personalized medicine for cancer patients.

    PubMed

    Schweiger, Michal-Ruth; Barmeyer, Christian; Timmermann, Bernd

    2013-09-01

    Recent years have brought about a marked extension of our understanding of the somatic basis of cancer. Parallel to the large-scale investigation of diverse tumor genomes the knowledge arose that cancer pathologies are most often not restricted to single genomic events. In contrast, a large number of different alterations in the genomes and epigenomes come together and promote the malignant transformation. The combination of mutations, structural variations and epigenetic alterations differs between each tumor, making individual diagnosis and treatment strategies necessary. This view is summarized in the new discipline of personalized medicine. To satisfy the ideas of this approach each tumor needs to be fully characterized and individual diagnostic and therapeutic strategies designed. Here, we will discuss the power of high-throughput sequencing technologies for genomic and epigenomic analyses. We will provide insight into the current status and how these technologies can be transferred to routine clinical usage.

  2. Attitudes towards Social Networking and Sharing Behaviors among Consumers of Direct-to-Consumer Personal Genomics

    PubMed Central

    Lee, Sandra Soo-Jin; Vernez, Simone L.; Ormond, K.E.; Granovetter, Mark

    2013-01-01

    Little is known about how consumers of direct-to-consumer personal genetic services share personal genetic risk information. In an age of ubiquitous online networking and rapid development of social networking tools, understanding how consumers share personal genetic risk assessments is critical in the development of appropriate and effective policies. This exploratory study investigates how consumers share personal genetic information and attitudes towards social networking behaviors. Methods: Adult participants aged 23 to 72 years old who purchased direct-to-consumer genetic testing from a personal genomics company were administered a web-based survey regarding their sharing activities and social networking behaviors related to their personal genetic test results. Results: 80 participants completed the survey; of those, 45% shared results on Facebook and 50.9% reported meeting or reconnecting with more than 10 other individuals through the sharing of their personal genetic information. For help interpreting test results, 70.4% turned to Internet websites and online sources, compared to 22.7% who consulted their healthcare providers. Amongst participants, 51.8% reported that they believe the privacy of their personal genetic information would be breached in the future. Conclusion: Consumers actively utilize online social networking tools to help them share and interpret their personal genetic information. These findings suggest a need for careful consideration of policy recommendations in light of the current ambiguity of regulation and oversight of consumer initiated sharing activities. PMID:25562728

  3. [Applied problems of mathematical biology and bioinformatics].

    PubMed

    Lakhno, V D

    2011-01-01

    Mathematical biology and bioinformatics represent a new and rapidly progressing line of investigation which emerged in the course of work on the project "Human genome". The main applied problems of these sciences are drug design, patient-specific medicine and nanobioelectronics. It is shown that progress in the technology of mass sequencing of the human genome has set the stage for starting the national program on patient-specific medicine.

  4. Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM)

    PubMed Central

    Parson, Walther; Strobl, Christina; Huber, Gabriela; Zimmermann, Bettina; Gomes, Sibylle M.; Souto, Luis; Fendt, Liane; Delport, Rhena; Langit, Reina; Wootton, Sharon; Lagacé, Robert; Irwin, Jodi

    2013-01-01

    Insights into the human mitochondrial phylogeny have been primarily achieved by sequencing full mitochondrial genomes (mtGenomes). In forensic genetics (partial) mtGenome information can be used to assign haplotypes to their phylogenetic backgrounds, which may, in turn, have characteristic geographic distributions that would offer useful information in a forensic case. In addition and perhaps even more relevant in the forensic context, haplogroup-specific patterns of mutations form the basis for quality control of mtDNA sequences. The current method for establishing (partial) mtDNA haplotypes is Sanger-type sequencing (STS), which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing (NGS) technologies, the body of available mtDNA data can potentially be extended much more quickly and cost-efficiently. Customized chemistries, laboratory workflows and data analysis packages could support the community and increase the utility of mtDNA analysis in forensics. We have evaluated the performance of mtGenome sequencing using the Personal Genome Machine (PGM) and compared the resulting haplotypes directly with conventional Sanger-type sequencing. A total of 64 mtGenomes (>1 million bases) were established that yielded high concordance with the corresponding STS haplotypes (<0.02% differences). About two-thirds of the differences were observed in or around homopolymeric sequence stretches. In addition, the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of alignment software would be desirable to facilitate the application of NGS in mtDNA forensic genetics. PMID:23948325
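
    The scale of the reported concordance can be checked with a short back-of-the-envelope calculation. The sketch below assumes each mtGenome is close to the 16,569 bp length of the revised Cambridge Reference Sequence; that length is an assumption added here, not a figure from the abstract, and the resulting count is only an upper bound implied by the "<0.02%" figure.

```python
# Assumption: each mtGenome is roughly the rCRS length (not stated in the abstract).
MT_GENOME_LENGTH = 16_569      # base pairs per mitochondrial genome
N_GENOMES = 64                 # number of mtGenomes reported in the abstract
MAX_DIFF_RATE = 0.02 / 100     # "<0.02% differences" expressed as a fraction

total_bases = MT_GENOME_LENGTH * N_GENOMES
max_differences = total_bases * MAX_DIFF_RATE

print(f"Total bases compared: {total_bases:,}")
print(f"Upper bound on discordant positions: {max_differences:.0f}")
```

    Under that assumption the total is consistent with the ">1 million bases" quoted above, and the upper bound on discordant positions is on the order of a couple of hundred across all 64 genomes.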

  5. Informed consent in direct-to-consumer personal genome testing: the outline of a model between specific and generic consent.

    PubMed

    Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N

    2014-09-01

    Broad genome-wide testing is increasingly finding its way to the public through the online direct-to-consumer marketing of so-called personal genome tests. Personal genome tests estimate genetic susceptibilities to multiple diseases and other phenotypic traits simultaneously. Providers commonly make use of Terms of Service agreements rather than informed consent procedures. However, to protect consumers from the potential physical, psychological and social harms associated with personal genome testing and to promote autonomous decision-making with regard to the testing offer, we argue that current practices of information provision are insufficient and that there is a place--and a need--for informed consent in personal genome testing, also when it is offered commercially. The increasing quantity, complexity and diversity of most testing offers, however, pose challenges for information provision and informed consent. Both specific and generic models for informed consent fail to meet its moral aims when applied to personal genome testing. Consumers should be enabled to know the limitations, risks and implications of personal genome testing and should be given control over the genetic information they do or do not wish to obtain. We present the outline of a new model for informed consent which can meet both the norm of providing sufficient information and the norm of providing understandable information. The model can be used for personal genome testing, but will also be applicable to other, future forms of broad genetic testing or screening in commercial and clinical settings.

  6. Deep brain stimulation, brain maps and personalized medicine: lessons from the human genome project.

    PubMed

    Fins, Joseph J; Shapiro, Zachary E

    2014-01-01

    Although the appellation of personalized medicine is generally attributed to advanced therapeutics in molecular medicine, deep brain stimulation (DBS) can also be so categorized. Like its medical counterpart, DBS is a highly personalized intervention that needs to be tailored to a patient's individual anatomy. And because of this, DBS, like more conventional personalized medicine, can be highly specific where the object of care is an N = 1. But that is where the similarities end. Besides their differing medical and surgical provenances, these two varieties of personalized medicine have had strikingly different impacts. The molecular variant, though of a more recent vintage, has thrived and is experiencing explosive growth, while DBS still struggles to find a sustainable therapeutic niche. Despite its promise, and success as a vetted treatment for drug resistant Parkinson's Disease, DBS has lagged in broadening its development, often encountering regulatory hurdles and financial barriers necessary to mount an adequate number of quality trials. In this paper we will consider why DBS--or better yet neuromodulation--has encountered these challenges and contrast this experience with the more successful advance of personalized medicine. We will suggest that personalized medicine and DBS's differential performance can be explained as a matter of timing and complexity. We believe that DBS has struggled because it has been a journey of scientific exploration conducted without a map. In contrast to molecular personalized medicine which followed the mapping of the human genome and the Human Genome Project, DBS preceded plans for the mapping of the human brain. We believe that this sequence has given personalized medicine a distinct advantage and that the fullest potential of DBS will be realized both as a cartographical or electrophysiological probe and as a modality of personalized medicine.

  7. Highlighting computations in bioscience and bioinformatics: review of the Symposium of Computations in Bioinformatics and Bioscience (SCBB07).

    PubMed

    Lu, Guoqing; Ni, Jun

    2008-05-28

    The Second Symposium on Computations in Bioinformatics and Bioscience (SCBB07) was held in Iowa City, Iowa, USA, on August 13-15, 2007. This annual event attracted dozens of bioinformatics professionals and students, who are interested in solving emerging computational problems in bioscience, from China, Japan, Taiwan and the United States. The Scientific Committee of the symposium selected 18 peer-reviewed papers for publication in this supplemental issue of BMC Bioinformatics. These papers cover a broad spectrum of topics in computational biology and bioinformatics, including DNA, protein and genome sequence analysis, gene expression and microarray analysis, computational proteomics and protein structure classification, systems biology and machine learning.

  8. The miRNA targetome of coronary artery disease is perturbed by functional polymorphisms identified and prioritized by in-depth bioinformatics analyses exploiting genome-wide association studies.

    PubMed

    Bastami, Milad; Nariman-Saleh-Fam, Ziba; Saadatian, Zahra; Nariman-Saleh-Fam, Lida; Omrani, Mir Davood; Ghaderian, Sayyed Mohammad Hossein; Masotti, Andrea

    2016-12-05

    In recent years, genome-wide association studies (GWAS) have made great progress in elucidating the genetic influence on complex traits. An overwhelming number of GWAS signals resides in regulatory elements, therefore most post-GWAS studies focused only on transcriptional regulatory variants. However, recent findings have expanded the spectrum of trait/disease-associated regulatory variants beyond transcriptional level and highlighted the importance of post-transcriptional variants like those in miRNA targetome. The present work integrated genome-wide association data of coronary artery disease (CAD) with population-specific linkage disequilibrium structures from 1000 Genomes Project to map disease associations to miRNA targetome. Moreover, we performed a variety of functional prediction analyses to prioritize disease-associated variants (DAVs) influencing miRNA targetome and in-silico analyses to get insights into their functional significance. In conclusion, although the role of miRNA targetome variations in the development of CAD still has to be fully elucidated, we provided a systematic bioinformatics approach to the miRNA targetome variations in CAD. The results of this study will be valuable for researchers interested in the identification of CAD GWAS signals that may implicate polymorphic miRNA targeting.

  9. The UCSC Genome Browser

    PubMed Central

    Karolchik, Donna; Hinrichs, Angie S.; Kent, W. James

    2011-01-01

    The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation “tracks.” The annotations generated by the UCSC Genome Bioinformatics Group and external collaborators include gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload personal datasets in a wide variety of formats as custom annotation tracks in both browsers for research or educational purposes. PMID:21975940

  10. The potential of translational bioinformatics approaches for pharmacology research.

    PubMed

    Li, Lang

    2015-10-01

    The field of bioinformatics has allowed the interpretation of massive amounts of biological data, ushering in the era of 'omics' to biomedical research. Its potential impact on pharmacology research is enormous and it has shown some emerging successes. A full realization of this potential, however, requires standardized data annotation for large health record databases and molecular data resources. Improved standardization will further stimulate the development of system pharmacology models, using translational bioinformatics methods. This new translational bioinformatics paradigm is highly complementary to current pharmacological research fields, such as personalized medicine, pharmacoepidemiology and drug discovery. In this review, I illustrate the application of translational bioinformatics to research in numerous pharmacology subdisciplines.

  11. Targeting the undruggable: immunotherapy meets personalized oncology in the genomic era.

    PubMed

    Martin, S D; Coukos, G; Holt, R A; Nelson, B H

    2015-12-01

    Owing to recent advances in genomic technologies, personalized oncology is poised to fundamentally alter cancer therapy. In this paradigm, the mutational and transcriptional profiles of tumors are assessed, and personalized treatments are designed based on the specific molecular abnormalities relevant to each patient's cancer. To date, such approaches have yielded impressive clinical responses in some patients. However, a major limitation of this strategy has also been revealed: the vast majority of tumor mutations are not targetable by current pharmacological approaches. Immunotherapy offers a promising alternative to exploit tumor mutations as targets for clinical intervention. Mutated proteins can give rise to novel antigens (called neoantigens) that are recognized with high specificity by patient T cells. Indeed, neoantigen-specific T cells have been shown to underlie clinical responses to many standard treatments and immunotherapeutic interventions. Moreover, studies in mouse models targeting neoantigens, and early results from clinical trials, have established proof of concept for personalized immunotherapies targeting next-generation sequencing identified neoantigens. Here, we review basic immunological principles related to T-cell recognition of neoantigens, and we examine recent studies that use genomic data to design personalized immunotherapies. We discuss the opportunities and challenges that lie ahead on the road to improving patient outcomes by incorporating immunotherapy into the paradigm of personalized oncology.

  12. Incorporating bioinformatics into biological science education in Nigeria: prospects and challenges.

    PubMed

    Ojo, O O; Omabe, M

    2011-06-01

    The urgency to process and analyze the deluge of data created by proteomics and genomics studies worldwide has caused bioinformatics to gain prominence and importance. However, its multidisciplinary nature has created a unique demand for specialists trained in both biology and computing. Several countries, in response to this challenge, have developed a number of manpower training programmes. This review presents a description of the meaning, scope, history and development of bioinformatics with a focus on the prospects and challenges facing bioinformatics education worldwide. The paper also provides an overview of attempts to introduce bioinformatics in Nigeria, describes the existing bioinformatics scenario in Nigeria and suggests strategies for effective bioinformatics education in Nigeria.

  13. Pathway analysis of genome-wide association datasets of personality traits.

    PubMed

    Kim, H-N; Kim, B-H; Cho, J; Ryu, S; Shin, H; Sung, J; Shin, C; Cho, N H; Sung, Y A; Choi, B-O; Kim, H-L

    2015-04-01

    Although several genome-wide association (GWA) studies of human personality have been recently published, genetic variants that are highly associated with certain personality traits remain unknown, due to difficulty reproducing results. To further investigate these genetic variants, we assessed biological pathways using GWA datasets. Pathway analysis using GWA data was performed on 1089 Korean women whose personality traits were measured with the Revised NEO Personality Inventory for the 5-factor model of personality. A total of 1042 pathways containing 8297 genes were included in our study. Of these, 14 pathways were highly enriched with association signals that were validated in 1490 independent samples. These pathways include association of: Neuroticism with axon guidance [L1 cell adhesion molecule (L1CAM) interactions]; Extraversion with neuronal system and voltage-gated potassium channels; Agreeableness with L1CAM interaction, neurotransmitter receptor binding and downstream transmission in postsynaptic cells; and Conscientiousness with the interferon-gamma and platelet-derived growth factor receptor beta polypeptide pathways. Several genes that contribute to top-ranked pathways in this study were previously identified in GWA studies or by pathway analysis in schizophrenia or other neuropsychiatric disorders. Here we report the first pathway analysis of all five personality traits. Importantly, our analysis identified novel pathways that contribute to understanding the etiology of personality traits.

  14. Deep learning in bioinformatics.

    PubMed

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2016-07-29

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies.

  15. From prenatal genomic diagnosis to fetal personalized medicine: progress and challenges

    PubMed Central

    Bianchi, Diana W

    2015-01-01

    Thus far, the focus of personalized medicine has been the prevention and treatment of conditions that affect adults. Although advances in genetic technology have been applied more frequently to prenatal diagnosis than to fetal treatment, genetic and genomic information is beginning to influence pregnancy management. Recent developments in sequencing the fetal genome combined with progress in understanding fetal physiology using gene expression arrays indicate that we could have the technical capabilities to apply an individualized medicine approach to the fetus. Here I review recent advances in prenatal genetic diagnostics, the challenges associated with these new technologies and how the information derived from them can be used to advance fetal care. Historically, the goal of prenatal diagnosis has been to provide an informed choice to prospective parents. We are now at a point where that goal can and should be expanded to incorporate genetic, genomic and transcriptomic data to develop new approaches to fetal treatment. PMID:22772565

  16. Towards personalized agriculture: what chemical genomics can bring to plant biotechnology.

    PubMed

    Stokes, Michael E; McCourt, Peter

    2014-01-01

    In contrast to the dominant drug paradigm in which compounds were developed to "fit all," new models focused around personalized medicine are appearing in which treatments are developed and customized for individual patients. The agricultural biotechnology industry (Ag-biotech) should also think about these new personalized models. For example, most common herbicides are generic in action, which led to the development of genetically modified crops to add specificity. The ease and accessibility of modern genomic analysis, when wedded to accessible large chemical space, should facilitate the discovery of chemicals that are more selective in their utility. Is it possible to develop species-selective herbicides and growth regulators? More generally put, is plant research at a stage where chemicals can be developed that streamline plant development and growth to various environments? We believe the advent of chemical genomics now opens up these and other opportunities to "personalize" agriculture. Furthermore, chemical genomics does not necessarily require genetically tractable plant models, which in principle should allow quick translation to practical applications. For this to happen, however, will require collaboration between the Ag-biotech industry and academic labs for early stage research and development, a situation that has proven very fruitful for Big Pharma.

  17. Eyes wide open: the personal genome project, citizen science and veracity in informed consent

    PubMed Central

    Angrist, Misha

    2012-01-01

    I am a close observer of the Personal Genome Project (PGP) and one of the original ten participants. The PGP was originally conceived as a way to test novel DNA sequencing technologies on human samples and to begin to build a database of human genomes and traits. However, its founder, Harvard geneticist George Church, was concerned about the fact that DNA is the ultimate digital identifier – individuals and many of their traits can be identified. Therefore, he believed that promising participants privacy and confidentiality would be impractical and disingenuous. Moreover, deidentification of samples would impoverish both genotypic and phenotypic data. As a result, the PGP has arguably become best known for its unprecedented approach to informed consent. All participants must pass an exam testing their knowledge of genomic science and privacy issues and agree to forgo the privacy and confidentiality of their genomic data and personal health records. Church aims to scale up to 100,000 participants. This special report discusses the impetus for the project, its early history and its potential to have a lasting impact on the treatment of human subjects in biomedical research. PMID:22328898

  18. Perceptions of genetic counseling services in direct-to-consumer personal genomic testing.

    PubMed

    Darst, B F; Madlensky, L; Schork, N J; Topol, E J; Bloss, C S

    2013-10-01

    The purpose of this research is to describe consumers' perceptions of genetic counseling services in the context of direct-to-consumer personal genomic testing. Utilizing data from the Scripps Genomic Health Initiative, we assessed direct-to-consumer genomic test consumers' utilization and perceptions of genetic counseling services. At long-term follow-up, approximately 14 months post-testing, participants were asked to respond to several items gauging their interactions, if any, with a Navigenics genetic counselor, and their perceptions of those interactions. Out of 1325 individuals who completed long-term follow-up, 187 (14.1%) indicated that they had spoken with a genetic counselor. The most commonly given reason for not utilizing the counseling service was a lack of need due to the perception of already understanding one's results (55.6%). The most common reasons for utilizing the service included wanting to take advantage of a free service (43.9%) and wanting more information on risk calculations (42.2%). Among those who utilized the service, a large fraction reported that counseling improved their understanding of their results (54.5%) and genetics in general (43.9%). A relatively small proportion of participants utilized genetic counseling after direct-to-consumer personal genomic testing. Among those individuals who did utilize the service, however, a large fraction perceived it to be informative, and thus presumably beneficial.

  19. Cardiovascular pharmacogenetics: a promise for genomically-guided therapy and personalized medicine.

    PubMed

    Zaiou, M; El Amri, H

    2017-03-01

    Cardiovascular disease (CVD) is the leading cause of death worldwide. The basic causes of CVD are not fully understood yet. Substantial evidence suggests that genetic predisposition plays a vital role in the physiopathology of this complex disease. Hence, identification of genetic contributors to CVD will likely add diagnostic accuracy and better prediction of an individual's risk. With high-throughput genetics and genomics technology and newer genome-wide study approaches, a number of genetic variations across the human genome were uncovered. Evidence suggests that genetic defects could influence CVD development and inter-individual responses to widely used cardiovascular drugs like clopidogrel, aspirin, warfarin, and statins, and therefore, they may be integrated into clinical practice. If clinically validated, better understanding of these genetic variations may provide new opportunities for personalized diagnostic, pharmacogenetic-based drug selection and best treatment in personalized medicine. However, numerous gaps remain unsolved due to the lack of underlying pathological mechanisms for how genetic predisposition could contribute to CVD. This review provides an overview of the extraordinary scientific progress in our understanding of genetic and genomic basis of CVD as well as the development of relevant genetic biomarkers for this disease. Some of the actual limitations to the promise of these markers and their translation for the benefit of patients will be discussed.

  20. Eyes wide open: the personal genome project, citizen science and veracity in informed consent.

    PubMed

    Angrist, Misha

    2009-11-01

    I am a close observer of the Personal Genome Project (PGP) and one of the original ten participants. The PGP was originally conceived as a way to test novel DNA sequencing technologies on human samples and to begin to build a database of human genomes and traits. However, its founder, Harvard geneticist George Church, was concerned about the fact that DNA is the ultimate digital identifier - individuals and many of their traits can be identified. Therefore, he believed that promising participants privacy and confidentiality would be impractical and disingenuous. Moreover, deidentification of samples would impoverish both genotypic and phenotypic data. As a result, the PGP has arguably become best known for its unprecedented approach to informed consent. All participants must pass an exam testing their knowledge of genomic science and privacy issues and agree to forgo the privacy and confidentiality of their genomic data and personal health records. Church aims to scale up to 100,000 participants. This special report discusses the impetus for the project, its early history and its potential to have a lasting impact on the treatment of human subjects in biomedical research.

  1. Penalized feature selection and classification in bioinformatics

    PubMed Central

    Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classification techniques—which belong to the family of embedded feature selection methods—for bioinformatics studies with high-dimensional input. Classification objective functions, penalty functions and computational algorithms are discussed. Our goal is to make interested researchers aware of these feature selection and classification methods that are applicable to high-dimensional bioinformatics data. PMID:18562478
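
    As a rough illustration of the embedded, penalized feature selection the abstract describes, the sketch below fits an L1-penalized (lasso) linear model by cyclic coordinate descent. It is a minimal sketch under stated assumptions, not the authors' method: it uses a squared-error loss rather than a classification loss, and the function names, the toy data, and the penalty value lam=0.1 are all hypothetical choices made for illustration.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of responses (illustrative layout).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                # Prediction using every feature except j.
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            if norm > 0:
                w[j] = soft_threshold(rho / n, lam) / (norm / n)
            else:
                w[j] = 0.0
    return w


if __name__ == "__main__":
    # Hypothetical toy data: the first column drives y, the second is small noise.
    X = [[1.0, 0.1], [2.0, -0.1], [3.0, 0.05], [4.0, -0.05]]
    y = [2.0, 4.0, 6.0, 8.0]
    print(lasso_coordinate_descent(X, y, lam=0.1))
```

    The L1 penalty is what makes the selection "embedded": the soft-thresholding step sets weakly supported coefficients exactly to zero while the model is being fitted, so no separate filtering pass is needed.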

  2. Genome-wide association study of personality traits in the long life family study.

    PubMed

    Bae, Harold T; Sebastiani, Paola; Sun, Jenny X; Andersen, Stacy L; Daw, E Warwick; Terracciano, Antonio; Ferrucci, Luigi; Perls, Thomas T

    2013-01-01

    Personality traits have been shown to be associated with longevity and healthy aging. In order to discover novel genetic modifiers associated with personality traits as related with longevity, we performed a genome-wide association study (GWAS) on personality factors assessed by NEO-five-factor inventory in individuals enrolled in the Long Life Family Study (LLFS), a study of 583 families (N up to 4595) with clustering for longevity in the United States and Denmark. Three SNPs, in almost perfect LD, associated with agreeableness reached genome-wide significance (p < 10^-8) and replicated in an additional sample of 1279 LLFS subjects, although one (rs9650241) failed to replicate and the other two were not available in two independent replication cohorts, the Baltimore Longitudinal Study of Aging and the New England Centenarian Study. Based on 10,000,000 permutations, the empirical p-value of 2 × 10^-7 was observed for the genome-wide significant SNPs. Seventeen SNPs that reached marginal statistical significance in the two previous GWASs (p-value <10^-4 and 10^-5), were also marginally significantly associated in this study (p-value <0.05), although none of the associations passed the Bonferroni correction. In addition, we tested age-by-SNP interactions and found some significant associations. Since scores of personality traits in LLFS subjects change in the oldest ages, and genetic factors outweigh environmental factors to achieve extreme ages, these age-by-SNP interactions could be a proxy for complex gene-gene interactions affecting personality traits and longevity.
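
    The two multiple-testing ideas mentioned in this abstract, the Bonferroni correction and an empirical p-value from permutations, can be sketched as follows. This is a generic illustration, not the study's actual pipeline: the function names are hypothetical, the add-one form of the permutation estimate is one common convention that the abstract does not confirm, and treating the seventeen candidate SNPs as seventeen tests is an assumption made only for the example.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def empirical_p_value(observed, null_stats):
    """Share of permutation statistics at least as extreme as the observed one.

    Uses the add-one estimate (an assumed convention), so the result is
    never exactly zero even if no permuted statistic reaches the observed value.
    """
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


if __name__ == "__main__":
    # Assumption for illustration: 17 candidate SNPs tested at a nominal 0.05 level.
    print(bonferroni_threshold(0.05, 17))
    # Hypothetical permutation statistics, not data from the study.
    print(empirical_p_value(3.0, [1.0, 2.0, 3.5, 0.5]))
```

    Under that assumption a SNP with a nominal p-value just below 0.05 would still miss the corrected threshold, which is consistent with the abstract's statement that none of the marginal associations passed the Bonferroni correction.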

  3. Canaries in the coal mine: Personal and professional impact of undergoing whole genome sequencing on medical professionals.

    PubMed

    Zierhut, Heather; McCarthy Veach, Patricia; LeRoy, Bonnie

    2015-11-01

    Public interest in personal whole genome sequencing is increasing. The technology is publicly available and is being used as an educational tool in higher education. Empirical evidence regarding its utility is vital. The goals of this study were to characterize the process of whole genome sequencing in a population of medical and basic science professionals undergoing whole genome sequencing as a part of an educational symposium. Thirty-eight individuals completed one or more surveys from the time of informed consent for whole genome sequencing to 3 months post-symposium. The four surveys assessed demographics, decision-making, communication, decision regret, and personal and professional impact. The most prevalent motivation to participate was professional enhancement, followed by curiosity about the technology, and personal health benefits. The most important initial impact concerned medical implications. Over time, however, impact on professional development was greater than on personal health. Anticipated reactions to receiving whole genome sequencing results generally matched participants' actual reactions and decision regret remained low over time. Benefits and risks of whole genome sequencing included medically actionable results and misunderstanding by healthcare providers. Whole genome sequencing generally had a positive impact professionally and personally on participants. Further education of providers and the public about whole genome sequencing and psychosocial support is warranted.

  4. Primary care providers’ experiences with and perceptions of personalized genomic medicine

    PubMed Central

    Carroll, June C.; Makuwaza, Tutsirai; Manca, Donna P.; Sopcak, Nicolette; Permaul, Joanne A.; O’Brien, Mary Ann; Heisey, Ruth; Eisenhauer, Elizabeth A.; Easley, Julie; Krzyzanowska, Monika K.; Miedema, Baukje; Pruthi, Sandhya; Sawka, Carol; Schneider, Nancy; Sussman, Jonathan; Urquhart, Robin; Versaevel, Catarina; Grunfeld, Eva

    2016-01-01

    Abstract Objective To assess primary care providers’ (PCPs’) experiences with, perceptions of, and desired role in personalized medicine, with a focus on cancer. Design Qualitative study involving focus groups. Setting Urban and rural interprofessional primary care team practices in Alberta and Ontario. Participants Fifty-one PCPs. Methods Semistructured focus groups were conducted and audiorecorded. Recordings were transcribed and analyzed using techniques informed by grounded theory including coding, interpretations of patterns in the data, and constant comparison. Main findings Five focus groups with the 51 participants were conducted; 2 took place in Alberta and 3 in Ontario. Primary care providers described limited experience with personalized medicine, citing breast cancer and prenatal care as main areas of involvement. They expressed concern over their lack of knowledge, in some circumstances relying on personal experiences to inform their attitudes and practice. Participants anticipated an inevitable role in personalized medicine primarily because patients seek and trust their advice; however, there was underlying concern about the magnitude of information and pace of discovery in this area, particularly in direct-to-consumer personal genomic testing. Increased knowledge, closer ties to genetics specialists, and relevant, reliable personalized medicine resources accessible at the point of care were reported as important for successful implementation of personalized medicine. Conclusion Primary care providers are prepared to discuss personalized medicine, but they require better resources. Models of care that support a more meaningful relationship between PCPs and genetics specialists should be pursued. Continuing education strategies need to address knowledge gaps including direct-to-consumer genetic testing, a relatively new area provoking PCP concern. Primary care providers should be mindful of using personal experiences to guide care. PMID:27737998

  5. Systematic Pharmacogenomics Analysis of a Malay Whole Genome: Proof of Concept for Personalized Medicine

    PubMed Central

    Salleh, Mohd Zaki; Teh, Lay Kek; Lee, Lian Shien; Ismet, Rose Iszati; Patowary, Ashok; Joshi, Kandarp; Pasha, Ayesha; Ahmed, Azni Zain; Janor, Roziah Mohd; Hamzah, Ahmad Sazali; Adam, Aishah; Yusoff, Khalid; Hoh, Boon Peng; Hatta, Fazleen Haslinda Mohd; Ismail, Mohamad Izwan; Scaria, Vinod; Sivasubbu, Sridhar

    2013-01-01

    Background With a higher throughput and lower cost in sequencing, second generation sequencing technology has immense potential for translation into clinical practice and in the realization of pharmacogenomics based patient care. The systematic analysis of whole genome sequences to assess patient to patient variability in pharmacokinetics and pharmacodynamics responses towards drugs would be the next step in future medicine in line with the vision of personalizing medicine. Methods Genomic DNA obtained from a 55-year-old, self-declared healthy, anonymous male of Malay descent was sequenced. The subject's mother died of lung cancer and the father had a history of schizophrenia and died at the age of 65. A systematic, intuitive computational workflow/pipeline integrating a custom algorithm in tandem with large datasets of variant annotations and gene functions for genetic variations with pharmacogenomics impact was developed. A comprehensive pathway map of drug transport, metabolism and action was used as a template to map non-synonymous variations with potential functional consequences. Principal Findings Over 3 million known variations and 100,898 novel variations in the Malay genome were identified. Further in-depth pharmacogenetics analysis revealed a total of 607 unique variants in 563 proteins, with the eventual identification of 4 drug transport genes, 2 drug metabolizing enzyme genes and 33 target genes harboring deleterious SNVs involved in pharmacological pathways, which could have a potential role in clinical settings. Conclusions The current study successfully unravels the potential of personal genome sequencing in understanding the functionally relevant variations with potential influence on drug transport, metabolism and differential therapeutic outcomes. These will be essential for realizing personalized medicine through the use of a comprehensive computational pipeline for systematic data mining and analysis. PMID:24009664

  6. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    ERIC Educational Resources Information Center

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  7. Towards personalized agriculture: what chemical genomics can bring to plant biotechnology

    PubMed Central

    Stokes, Michael E.; McCourt, Peter

    2014-01-01

    In contrast to the dominant drug paradigm in which compounds were developed to “fit all,” new models focused around personalized medicine are appearing in which treatments are developed and customized for individual patients. The agricultural biotechnology industry (Ag-biotech) should also think about these new personalized models. For example, most common herbicides are generic in action, which led to the development of genetically modified crops to add specificity. The ease and accessibility of modern genomic analysis, when wedded to accessible large chemical space, should facilitate the discovery of chemicals that are more selective in their utility. Is it possible to develop species-selective herbicides and growth regulators? More generally put, is plant research at a stage where chemicals can be developed that streamline plant development and growth to various environments? We believe the advent of chemical genomics now opens up these and other opportunities to “personalize” agriculture. Furthermore, chemical genomics does not necessarily require genetically tractable plant models, which in principle should allow quick translation to practical applications. For this to happen, however, will require collaboration between the Ag-biotech industry and academic labs for early stage research and development, a situation that has proven very fruitful for Big Pharma. PMID:25183965

  8. Adopting Genetics: Motivations and Outcomes of Personal Genomic Testing in Adult Adoptees

    PubMed Central

    Baptista, Natalie M.; Christensen, Kurt D.; Carere, Deanna Alexis; Broadley, Simon A.; Roberts, J. Scott; Green, Robert C.

    2015-01-01

    Purpose American adult adoptees may possess limited amounts of information about their biological families and turn to direct-to-consumer personal genomic testing (PGT) for genealogical and medical information. We investigated the motivations and outcomes of adoptees undergoing PGT using data from the Impact of Personal Genomics (PGen) Study. Methods The PGen Study surveyed new 23andMe and Pathway Genomics customers prior to and 6 months after receiving PGT results. Exploratory analyses compared adoptees’ and non-adoptees’ PGT attitudes, expectations, and experiences. We evaluated the association of adoption status with motivations for testing and post-disclosure actions using logistic regression models. Results Of 1607 participants, 80 (5%) were adopted. As compared to non-adoptees, adoptees were more likely to cite limited family health history knowledge (OR = 10.1; 95% CI = 5.7–19.5) and the opportunity to learn genetic disease risks (OR = 2.7; 95% CI = 1.6–4.8) as strong motivations for PGT. Of 922 participants who completed 6-month follow-up, there was no significant association between adoption status and PGT-motivated healthcare utilization or health behavior change. Conclusion PGT allows adoptees to gain otherwise inaccessible information about their genetic disease risks and ancestry, helping them to fill the void of an incomplete family health history. PMID:26820063

  9. Analysis of the whole mitochondrial genome: translation of the Ion Torrent Personal Genome Machine system to the diagnostic bench?

    PubMed

    Seneca, Sara; Vancampenhout, Kim; Van Coster, Rudy; Smet, Joél; Lissens, Willy; Vanlander, Arnaud; De Paepe, Boel; Jonckheere, An; Stouffs, Katrien; De Meirleir, Linda

    2015-01-01

    Next-generation sequencing (NGS), an innovative sequencing technology that enables the successful analysis of numerous gene sequences in a massive parallel sequencing approach, has revolutionized the field of molecular biology. Although NGS was introduced in a rather recent past, the technology has already demonstrated its potential and effectiveness in many research projects, and is now on the verge of being introduced into the diagnostic setting of routine laboratories to delineate the molecular basis of genetic disease in undiagnosed patient samples. We tested a benchtop device on retrospective genomic DNA (gDNA) samples of controls and patients with a clinical suspicion of a mitochondrial DNA disorder. This Ion Torrent Personal Genome Machine platform is a high-throughput sequencer with a fast turnaround time and reasonable running costs. We challenged the chemistry and technology with the analysis and processing of a mutational spectrum composed of samples with single-nucleotide substitutions, indels (insertions and deletions) and large single or multiple deletions, occasionally in heteroplasmy. The output data were compared with previously obtained conventional dideoxy sequencing results and the mitochondrial revised Cambridge Reference Sequence (rCRS). We were able to identify the majority of all nucleotide alterations, but three false-negative results were also encountered in the data set. At the same time, the poor performance of the PGM instrument in regions associated with homopolymeric stretches generated many false-positive miscalls demanding additional manual curation of the data.

  10. An Online Bioinformatics Curriculum

    PubMed Central

    Searls, David B.

    2012-01-01

    Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field. PMID:23028269

  11. Genomic research and data-mining technology: implications for personal privacy and informed consent.

    PubMed

    Tavani, Herman T

    2004-01-01

    This essay examines issues involving personal privacy and informed consent that arise at the intersection of information and communication technology (ICT) and population genomics research. I begin by briefly examining the ethical, legal, and social implications (ELSI) program requirements that were established to guide researchers working on the Human Genome Project (HGP). Next I consider a case illustration involving deCODE Genetics, a privately owned genetic company in Iceland, which raises some ethical concerns that are not clearly addressed in the current ELSI guidelines. The deCODE case also illustrates some ways in which an ICT technique known as data mining has both aided and posed special challenges for researchers working in the field of population genomics. On the one hand, data-mining tools have greatly assisted researchers in mapping the human genome and in identifying certain "disease genes" common in specific populations (which, in turn, has accelerated the process of finding cures for diseases that affect those populations). On the other hand, this technology has significantly threatened the privacy of research subjects participating in population genomics studies, who may, unwittingly, contribute to the construction of new groups (based on arbitrary and non-obvious patterns and statistical correlations) that put those subjects at risk for discrimination and stigmatization. In the final section of this paper I examine some ways in which the use of data mining in the context of population genomics research poses a critical challenge for the principle of informed consent, which traditionally has played a central role in protecting the privacy interests of research subjects participating in epidemiological studies.

  12. Genomic translational research: Paving the way to individualized cardiac functional analyses and personalized cardiology.

    PubMed

    Pasipoularides, Ares

    2017-03-01

    For most of Medicine's past, the best that physicians could do to cope with disease prevention and treatment was based on the expected response of an average patient. Currently, however, a more personalized/precise approach to cardiology and medicine in general is becoming possible, as the cost of sequencing a human genome has declined substantially. As a result, we are witnessing an era of precipitous advances in biomedicine and burgeoning understanding of the genetic basis of cardiovascular and other diseases, reminiscent of the resurgence of innovations in physico-mathematical sciences and biology-anatomy-cardiology in the Renaissance, a parallel time of radical change and reformation of medical knowledge, education and practice. Now on the horizon is an individualized, diverse, patient-centered approach to medical practice that encompasses the development of new, gene-based diagnostics and preventive medicine tactics, and offers the broadest range of personalized therapies based on pharmacogenetics. Over time, translation of genomic and high-tech approaches unquestionably will transform clinical practice in cardiology and medicine as a whole, with the adoption of new personalized medicine approaches and procedures. Clearly, future prospects far outweigh present accomplishments, which are best viewed as a promising start. It is now essential for pluridisciplinary health care providers to examine the drivers and barriers to the clinical adoption of this emerging revolutionary paradigm, in order to expedite the realization of its potential. So, we are not there yet, but we are definitely on our way.

  13. A national clinical decision support infrastructure to enable the widespread and consistent practice of genomic and personalized medicine

    PubMed Central

    2009-01-01

    Background In recent years, the completion of the Human Genome Project and other rapid advances in genomics have led to increasing anticipation of an era of genomic and personalized medicine, in which an individual's health is optimized through the use of all available patient data, including data on the individual's genome and its downstream products. Genomic and personalized medicine could transform healthcare systems and catalyze significant reductions in morbidity, mortality, and overall healthcare costs. Discussion Critical to the achievement of more efficient and effective healthcare enabled by genomics is the establishment of a robust, nationwide clinical decision support infrastructure that assists clinicians in their use of genomic assays to guide disease prevention, diagnosis, and therapy. Requisite components of this infrastructure include the standardized representation of genomic and non-genomic patient data across health information systems; centrally managed repositories of computer-processable medical knowledge; and standardized approaches for applying these knowledge resources against patient data to generate and deliver patient-specific care recommendations. Here, we provide recommendations for establishing a national decision support infrastructure for genomic and personalized medicine that fulfills these needs, leverages existing resources, and is aligned with the Roadmap for National Action on Clinical Decision Support commissioned by the U.S. Office of the National Coordinator for Health Information Technology. Critical to the establishment of this infrastructure will be strong leadership and substantial funding from the federal government. Summary A national clinical decision support infrastructure will be required for reaping the full benefits of genomic and personalized medicine. Essential components of this infrastructure include standards for data representation; centrally managed knowledge repositories; and standardized approaches for

  14. Bioinformatic Analysis of Gene Expression for Melanoma Treatment

    PubMed Central

    Kawakami, Akinori; Fisher, David E.

    2016-01-01

    Bioinformatic analysis of genome-wide gene expression allows us to characterize cells, including melanomas. Gene expression profiles have been generated in various stages of melanomas and analyzed by researchers in unique ways. Lauss et al. compared their melanoma subtypes with those of The Cancer Genome Atlas Network and found consistency between the two studies. PMID:27884291

  15. Teaching bioinformatics to engineers.

    PubMed

    Mihalas, George I; Tudor, Anca; Paralescu, Sorin; Andor, Minodora; Stoicu-Tivadar, Lacramioara

    2014-01-01

    The paper refers to our methodology and experience in establishing the content of the course in bioinformatics introduced to the school of "Information Systems in Healthcare" (SIIS), master level. The syllabi of both lectures and laboratory works are presented and discussed.

  16. Bioinformatics Methods and Tools to Advance Clinical Care

    PubMed Central

    Lecroq, T.

    2015-01-01

    Summary Objectives To summarize excellent current research in the field of Bioinformatics and Translational Informatics with application in the health domain and clinical care. Method We provide a synopsis of the articles selected for the IMIA Yearbook 2015, from which we attempt to derive a synthetic overview of current and future activities in the field. As last year, a first step of selection was performed by querying MEDLINE with a list of MeSH descriptors completed by a list of terms adapted to the section. Each section editor separately evaluated the set of 1,594 articles, and the evaluation results were merged to retain 15 articles for peer review. Results The selection and evaluation process of this Yearbook’s section on Bioinformatics and Translational Informatics yielded four excellent articles regarding data management and genome medicine that are mainly tool-based papers. In the first article, the authors present PPISURV, a tool for uncovering the role of specific genes in cancer survival outcome. The second article describes the classifier PredictSNP which combines six performing tools for predicting disease-related mutations. In the third article, by presenting a high-coverage map of the human proteome using high resolution mass spectrometry, the authors highlight the need for using mass spectrometry to complement genome annotation. The fourth article is also related to patient survival and decision support. The authors present data-mining methods for large-scale datasets of past transplants. The objective is to identify chances of survival. Conclusions The current research activities still attest to the continuous convergence of Bioinformatics and Medical Informatics, with a focus this year on dedicated tools and methods to advance clinical care. Indeed, there is a need for powerful tools for managing and interpreting complex, large-scale genomic and biological datasets, but also a need for user-friendly tools developed for the clinicians in their

  17. Bioinformatics prediction of miRNAs in the Prunus persica genome with validation of their precise sequences by miR-RACE.

    PubMed

    Zhang, Yanping; Bai, Youhuang; Han, Jian; Chen, Ming; Kayesh, Emrul; Jiang, Weibing; Fang, Jinggui

    2013-01-01

    We predicted 262 potential microRNAs (miRNAs) belonging to 70 miRNA families from the peach (Prunus persica) genome. Two specific 5' and 3' miRNA rapid amplification of cDNA ends (miR-RACE) PCR reactions and sequence-directed cloning were employed to accurately validate 61 unique P. persica miRNA (Ppe-miRNA) sequences belonging to 61 families comprising 97 Ppe-miRNAs. Validation of the terminal nucleotides in particular can define the real sequences of the Ppe-miRNAs on the peach genome. Comparison between predicted and validated Ppe-miRNAs through alignment revealed that 43 unique orthologous sequences were identical, while the remaining 18 exhibited some divergences at their terminal nucleotides. Quantitative real-time polymerase chain reaction (qRT-PCR) was further employed to analyze the expression of all the 61 miRNAs and 10 putative targets of 8 randomly selected Ppe-miRNAs in peach leaves, flowers and fruits at different stages of development, where both the miRNAs and the putative target genes showed tissue-specific expression.

  18. Bioinformatic Identification of Rare Codon Clusters (RCCs) in HBV Genome and Evaluation of RCCs in Proteins Structure of Hepatitis B Virus

    PubMed Central

    Mortazavi, Mojtaba; Zarenezhad, Mohammad; Gholamzadeh, Saeid; Alavian, Seyed Moayed; Ghorbani, Mohammad; Dehghani, Reza; Malekpour, Abdorrasoul; Meshkibaf, Mohammadhasan; Fakhrzad, Ali

    2016-01-01

    Background Hepatitis B virus (HBV) causes an infectious disease and has nine genotypes (A - I) and a ‘putative’ genotype J. Objectives The aim of this study was to identify the rare codon clusters (RCC) in the HBV genome and to evaluate these RCCs in the HBV proteins structure. Methods For detection of protein family accession numbers (Pfam) in HBV proteins, the UniProt database and Pfam search tool were used. Pfam is a comprehensive and accurate collection of protein domains and families. It contains annotation of each family in the form of textual descriptions, links to other resources and literature references. Genome projects have used Pfam extensively for large-scale functional annotation of genomic data; the Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). The Pfam search tools are databases that identify Pfam of proteins. These Pfam IDs were analyzed in the Sherlocc program and the locations of RCCs in the HBV genome and proteins were detected and reported as translated EMBL nucleotide sequence data library (TrEMBL) entries. TrEMBL is a computer-annotated supplement of SWISS-PROT that contains all the translations of European molecular biology laboratory (EMBL) nucleotide sequence entries not yet integrated in SWISS-PROT. Furthermore, the structures of the TrEMBL entry proteins were studied in the PDB database and 3D structures of the HBV proteins and locations of RCCs were visualized and studied using Swiss PDB Viewer software®. Results The Pfam search tool found nine protein families in three frames. Results of the Pfam studies in the Sherlocc program showed that this program did not identify RCCs in the external core antigen (PF08290) and truncated HBeAg gene (PF08290) of HBV. By contrast, the RCCs were identified in gene of hepatitis core antigen (PF00906 and the residues 224 - 234 and 251 - 255), large envelope protein S (PF00695 and the residues

  19. Towards a career in bioinformatics

    PubMed Central

    2009-01-01

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation from 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the area of life sciences, systems biology and clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in an undergraduate bioinformatics curriculum provides information on how to impart targeted education, to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508

  20. The Bioinformatics Analysis of Comparative Genomics of Mycobacterium tuberculosis Complex (MTBC) Provides Insight into Dissimilarities between Intraspecific Groups Differing in Host Association, Virulence, and Epitope Diversity

    PubMed Central

    Jia, Xinmiao; Yang, Li; Dong, Mengxing; Chen, Suting; Lv, Lingna; Cao, Dandan; Fu, Jing; Yang, Tingting; Zhang, Ju; Zhang, Xiangli; Shang, Yuanyuan; Wang, Guirong; Sheng, Yongjie; Huang, Hairong; Chen, Fei

    2017-01-01

    Tuberculosis now exceeds HIV as the top infectious disease cause of mortality, and is caused by the Mycobacterium tuberculosis complex (MTBC). MTBC strains have highly conserved genome sequences (similarity >99%) but dramatically different phenotypes. To analyze the relationship between genotype and phenotype, we conducted the comparative genomic analysis on 12 MTBC strains representing different lineages (i.e., Mycobacterium bovis; M. bovis BCG; M. microti; M. africanum; M. tuberculosis H37Rv; M. tuberculosis H37Ra, and six M. tuberculosis clinical isolates). The analysis focused on the three aspects of pathogenicity: host association, virulence, and epitope variations. Host association analysis indicated that eight mce3 genes, two enoyl-CoA hydratases, and five PE/PPE family genes were present only in human isolates; these may have roles in host-pathogen interactions. There were 15 SNPs found on virulence factors (including five SNPs in three ESX secretion proteins) only in the Beijing strains, which might be related to their more virulent phenotype. A comparison between the virulent H37Rv and non-virulent H37Ra strains revealed three SNPs that were likely associated with the virulence attenuation of H37Ra: S219L (PhoP), A219E (MazG) and a newly identified I228M (EspK). Additionally, a comparison of animal-associated MTBC strains showed that the deletion of the first four genes (i.e., pe35, ppe68, esxB, esxA), rather than all eight genes of RD1, might play a central role in the virulence attenuation of animal isolates. Finally, by comparing epitopes among MTBC strains, we found that four epitopes were lost only in the Beijing strains; this may render them better capable of evading the human immune system, leading to enhanced virulence. Overall, our comparative genomic analysis of MTBC strains reveals the relationship between the highly conserved genotypes and the diverse phenotypes of MTBC, provides insight into pathogenic mechanisms, and facilitates the

  1. From Molecules to Patients: The Clinical Applications of Translational Bioinformatics

    PubMed Central

    Regan, K.

    2015-01-01

    Summary Objective In order to realize the promise of personalized medicine, Translational Bioinformatics (TBI) research will need to continue to address implementation issues across the clinical spectrum. In this review, we aim to evaluate the expanding field of TBI towards clinical applications, and define common themes and current gaps in order to motivate future research. Methods Here we present the state-of-the-art of clinical implementation of TBI-based tools and resources. Our thematic analyses of a targeted literature search of recent TBI-related articles ranged across topics in genomics, data management, hypothesis generation, molecular epidemiology, diagnostics, therapeutics and personalized medicine. Results Open areas of clinically-relevant TBI research identified in this review include developing data standards and best practices, publicly available resources, integrative systems-level approaches, user-friendly tools for clinical support, cloud computing solutions, emerging technologies and means to address pressing legal, ethical and social issues. Conclusions There is a need for further research bridging the gap from foundational TBI-based theories and methodologies to clinical implementation. We have organized the topic themes presented in this review into four conceptual foci – domain analyses, knowledge engineering, computational architectures and computation methods alongside three stages of knowledge development in order to orient future TBI efforts to accelerate the goals of personalized medicine. PMID:26293863

  2. Bioinformatics for Exploration

    NASA Technical Reports Server (NTRS)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  3. Distributed computing in bioinformatics.

    PubMed

    Jain, Eric

    2002-01-01

    This paper provides an overview of methods and current applications of distributed computing in bioinformatics. Distributed computing is a strategy of dividing a large workload among multiple computers to reduce processing time, or to make use of resources such as programs and databases that are not available on all computers. Participating computers may be connected either through a local high-speed network or through the Internet.
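    To make the idea of dividing a large workload concrete, the following Python sketch splits a list of sequences into chunks and hands each chunk to a separate worker. It is a minimal illustration, not a method from the paper: the function names (`chunked`, `gc_count`, `distributed_gc`) and the GC-counting task are assumptions chosen for the example, and a thread pool on a single machine stands in for computers connected over a network.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal parts."""
    size, extra = divmod(len(items), n_chunks)
    parts, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            parts.append(items[start:end])
        start = end
    return parts


def gc_count(sequences):
    """Hypothetical per-worker task: count G and C bases in one chunk."""
    return sum(s.upper().count("G") + s.upper().count("C") for s in sequences)


def distributed_gc(sequences, n_workers=4):
    """Run gc_count on each chunk in parallel and combine the partial results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(gc_count, chunked(sequences, n_workers)))
```

    The same split, process and combine pattern applies when the workers are separate machines; only the transport between the coordinator and the workers changes.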

  4. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
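    The "huge number of possible trees" can be made concrete with a standard counting result, which is not taken from this abstract: for n labeled OTUs there are (2n-5)!! distinct unrooted bifurcating trees and (2n-3)!! rooted ones. The short Python sketch below evaluates these counts; the function names are illustrative only.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1 or 2; returns 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_otus):
    """Number of unrooted bifurcating trees on n labeled OTUs: (2n - 5)!!."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs")
    return double_factorial(2 * n_otus - 5)


def count_rooted_trees(n_otus):
    """Number of rooted bifurcating trees on n labeled OTUs: (2n - 3)!!."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs")
    return double_factorial(2 * n_otus - 3)
```

    With only 10 OTUs, `count_unrooted_trees(10)` already gives 2,027,025 topologies, which is why exhaustive search over trees quickly becomes impractical.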

  5. Genetics, genomics, and cancer risk assessment: State of the Art and Future Directions in the Era of Personalized Medicine.

    PubMed

    Weitzel, Jeffrey N; Blazer, Kathleen R; MacDonald, Deborah J; Culver, Julie O; Offit, Kenneth

    2011-01-01

    Scientific and technologic advances are revolutionizing our approach to genetic cancer risk assessment, cancer screening and prevention, and targeted therapy, fulfilling the promise of personalized medicine. In this monograph, we review the evolution of scientific discovery in cancer genetics and genomics, and describe current approaches, benefits, and barriers to the translation of this information to the practice of preventive medicine. Summaries of known hereditary cancer syndromes and highly penetrant genes are provided and contrasted with recently discovered genomic variants associated with modest increases in cancer risk. We describe the scope of knowledge, tools, and expertise required for the translation of complex genetic and genomic test information into clinical practice. The challenges of genomic counseling include the need for genetics and genomics professional education and multidisciplinary team training, the need for evidence-based information regarding the clinical utility of testing for genomic variants, the potential dangers posed by premature marketing of first-generation genomic profiles, and the need for new clinical models to improve access to and responsible communication of complex disease risk information. We conclude that given the experiences and lessons learned in the genetics era, the multidisciplinary model of genetic cancer risk assessment and management will serve as a solid foundation to support the integration of personalized genomic information into the practice of cancer medicine.

  6. Genetics, Genomics and Cancer Risk Assessment: State of the art and future directions in the era of personalized medicine

    PubMed Central

    Weitzel, Jeffrey N.; Blazer, Kathleen R.; MacDonald, Deborah J.; Culver, Julie O.; Offit, Kenneth

    2012-01-01

    Scientific and technologic advances are revolutionizing our approach to genetic cancer risk assessment, cancer screening and prevention, and targeted therapy, fulfilling the promise of personalized medicine. In this monograph we review the evolution of scientific discovery in cancer genetics and genomics, and describe current approaches, benefits and barriers to the translation of this information to the practice of preventive medicine. Summaries of known hereditary cancer syndromes and highly penetrant genes are provided and contrasted with recently-discovered genomic variants associated with modest increases in cancer risk. We describe the scope of knowledge, tools, and expertise required for the translation of complex genetic and genomic test information into clinical practice. The challenges of genomic counseling include the need for genetics and genomics professional education and multidisciplinary team training, the need for evidence-based information regarding the clinical utility of testing for genomic variants, the potential dangers posed by premature marketing of first-generation genomic profiles, and the need for new clinical models to improve access to and responsible communication of complex disease-risk information. We conclude that given the experiences and lessons learned in the genetics era, the multidisciplinary model of genetic cancer risk assessment and management will serve as a solid foundation to support the integration of personalized genomic information into the practice of cancer medicine. PMID:21858794

  7. CryptoDB: a Cryptosporidium bioinformatics resource update.

    PubMed

    Heiges, Mark; Wang, Haiming; Robinson, Edward; Aurrecoechea, Cristina; Gao, Xin; Kaluskar, Nivedita; Rhodes, Philippa; Wang, Sammy; He, Cong-Zhou; Su, Yanqi; Miller, John; Kraemer, Eileen; Kissinger, Jessica C

    2006-01-01

    The database, CryptoDB (http://CryptoDB.org), is a community bioinformatics resource for the AIDS-related apicomplexan-parasite, Cryptosporidium. CryptoDB integrates whole genome sequence and annotation with expressed sequence tag and genome survey sequence data and provides supplemental bioinformatics analyses and data-mining tools. A simple, yet comprehensive web interface is available for mining and visualizing the data. CryptoDB is allied with the databases PlasmoDB and ToxoDB via ApiDB, an NIH/NIAID-funded Bioinformatics Resource Center. Recent updates to CryptoDB include the deposition of annotated genome sequences for Cryptosporidium parvum and Cryptosporidium hominis, migration to a relational database (GUS), a new query and visualization interface and the introduction of Web services.

  8. RGS2 expression predicts amyloid-β sensitivity, MCI and Alzheimer's disease: genome-wide transcriptomic profiling and bioinformatics data mining

    PubMed Central

    Hadar, A; Milanesi, E; Squassina, A; Niola, P; Chillotti, C; Pasmanik-Chor, M; Yaron, O; Martásek, P; Rehavi, M; Weissglas-Volkov, D; Shomron, N; Gozes, I; Gurwitz, D

    2016-01-01

    Alzheimer's disease (AD) is the most frequent cause of dementia. Misfolded protein pathological hallmarks of AD are brain deposits of amyloid-β (Aβ) plaques and phosphorylated tau neurofibrillary tangles. However, doubts about the role of Aβ in AD pathology have been raised as Aβ is a common component of extracellular brain deposits found, also by in vivo imaging, in non-demented aged individuals. It has been suggested that some individuals are more prone to Aβ neurotoxicity and hence more likely to develop AD when aging brains start accumulating Aβ plaques. Here, we applied genome-wide transcriptomic profiling of lymphoblastoid cell lines (LCLs) from healthy individuals and AD patients for identifying genes that predict sensitivity to Aβ. Real-time PCR validation identified 3.78-fold lower expression of RGS2 (regulator of G-protein signaling 2; P=0.0085) in LCLs from healthy individuals exhibiting high vs low Aβ sensitivity. Furthermore, RGS2 showed 3.3-fold lower expression (P=0.0008) in AD LCLs compared with controls. Notably, RGS2 expression in AD LCLs correlated with the patients' cognitive function. Lower RGS2 expression levels were also discovered in published expression data sets from postmortem AD brain tissues as well as in mild cognitive impairment and AD blood samples compared with controls. In conclusion, Aβ sensitivity phenotyping followed by transcriptomic profiling and published patient data mining identified reduced peripheral and brain expression levels of RGS2, a key regulator of G-protein-coupled receptor signaling and neuronal plasticity. RGS2 is suggested as a novel AD biomarker (alongside other genes) toward early AD detection and future disease modifying therapeutics. PMID:27701409

  9. Personalized medicine beyond genomics: alternative futures in big data-proteomics, environtome and the social proteome.

    PubMed

    Özdemir, Vural; Dove, Edward S; Gürsoy, Ulvi K; Şardaş, Semra; Yıldırım, Arif; Yılmaz, Şenay Görücü; Ömer Barlas, I; Güngör, Kıvanç; Mete, Alper; Srivastava, Sanjeeva

    2017-01-01

    No field in science and medicine today remains untouched by Big Data, and psychiatry is no exception. Proteomics is a Big Data technology and a next generation biomarker, supporting novel system diagnostics and therapeutics in psychiatry. Proteomics technology is, in fact, much older than genomics and dates to the 1970s, well before the launch of the international Human Genome Project. While the genome has long been framed as the master or "elite" executive molecule in cell biology, the proteome by contrast is humble. Yet the proteome is critical for life: it ensures the daily functioning of cells and whole organisms. In short, proteins are the blue-collar workers of biology, the down-to-earth molecules that we cannot live without. Since 2010, proteomics has found renewed meaning and international attention with the launch of the Human Proteome Project and the growing interest in Big Data technologies such as proteomics. This article presents an interdisciplinary technology foresight analysis and conceptualizes the terms "environtome" and "social proteome". We define "environtome" as the entire complement of elements external to the human host, from microbiome, ambient temperature and weather conditions to government innovation policies, stock market dynamics, human values, political power and social norms that collectively shape the human host spatially and temporally. The "social proteome" is the subset of the environtome that influences the transition of proteomics technology to innovative applications in society. The social proteome encompasses, for example, new reimbursement schemes and business innovation models for proteomics diagnostics that depart from the "once-a-life-time" genotypic tests and the anticipated hype attendant to context and time sensitive proteomics tests. Building on the "nesting principle" for governance of complex systems as discussed by Elinor Ostrom, we propose here a 3-tiered organizational architecture for Big Data science such as

  10. Agile parallel bioinformatics workflow management using Pwrake

    PubMed Central

    2011-01-01

    Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genome Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. Furthermore, readability

  11. Genome-wide association uncovers shared genetic effects among personality traits and mood states

    PubMed Central

    Luciano, Michelle; Huffman, Jennifer E; Arias-Vásquez, Alejandro; Vinkhuyzen, Anna AE; Middeldorp, Christel M; Giegling, Ina; Payton, Antony; Davies, Gail; Zgaga, Lina; Janzing, Joost; Ke, Xiayi; Galesloot, Tessel; Hartmann, Annette M; Ollier, William; Tenesa, Albert; Hayward, Caroline; Verhagen, Maaike; Montgomery, Grant W; Hottenga, Jouke-Jan; Konte, Bettina; Starr, John M; Vitart, Veronique; Vos, Pieter E; Madden, Pamela AF; Willemsen, Gonneke; Konnerth, Heike; Horan, Michael A; Porteous, David J; Campbell, Harry; Vermeulen, Sita H; Heath, Andrew C; Wright, Alan; Polasek, Ozren; Kovacevic, Sanja B; Hastie, Nicholas D; Franke, Barbara; Boomsma, Dorret I; Martin, Nicholas G; Rujescu, Dan; Wilson, James F; Buitelaar, Jan; Pendleton, Neil; Rudan, Igor; Deary, Ian J

    2013-01-01

    Measures of personality and psychological distress are correlated and exhibit genetic covariance. We conducted univariate genome-wide SNP (~2.5 million) and gene-based association analyses of these traits and examined the overlap in results across traits, including a prediction analysis of mood states using genetic polygenic scores for personality. Measures of neuroticism, extraversion, and symptoms of anxiety, depression, and general psychological distress were collected in eight European cohorts (n ranged from 546 to 1 338; maximum total n=6 268) whose mean age ranged from 55 to 79 years. Meta-analysis of the cohort results was performed, with follow-up associations of the top SNPs and genes investigated in independent cohorts (n=527 to 6 032). Suggestive association (P=8×10⁻⁸) of rs1079196 in the FHIT gene was observed with symptoms of anxiety. Other notable associations (P<6.09×10⁻⁶) included SNPs in five genes for neuroticism (LCE3C, POLR3A, LMAN1L, ULK3, SCAMP2), KIAA0802 for extraversion, and NOS1 for general psychological distress. An association between symptoms of depression and rs7582472 (near to MGAT5 and NCKAP5) was replicated in two independent samples, but other replication findings were less consistent. Gene-based tests identified a significant locus on chromosome 15 (spanning five genes) associated with neuroticism which replicated (P<0.05) in an independent cohort. Support for common genetic effects among personality and mood (particularly neuroticism and depressive symptoms) was found in terms of SNP association overlap and polygenic score prediction. The variance explained by individual SNPs was very small (up to 1%) confirming that there are no moderate/large effects of common SNPs on personality and related traits. PMID:22628180
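    For readers unfamiliar with polygenic scores, the Python sketch below shows the usual basic form: a weighted sum of a person's effect-allele counts, with per-SNP effect sizes as weights. It is an assumption-laden illustration rather than the study's procedure; the abstract does not state how SNPs were selected or weighted, and the function name and the placeholder SNP identifiers in the usage note are hypothetical.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of effect-allele counts (0, 1 or 2) over the scored SNPs.

    allele_counts: dict mapping SNP id -> number of effect alleles carried.
    effect_sizes:  dict mapping SNP id -> per-allele weight (e.g. a GWAS beta).
    SNPs without a genotype for this person are skipped.
    """
    score = 0.0
    for snp, beta in effect_sizes.items():
        count = allele_counts.get(snp)
        if count is None:
            continue  # not genotyped in this individual
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {snp}: {count}")
        score += beta * count
    return score
```

    For example, `polygenic_score({"snp_a": 2, "snp_b": 1}, {"snp_a": 0.5, "snp_b": -0.25})` combines two made-up weights into a single score that can then be tested for association with a second trait.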

  12. After the revolution? Ethical and social challenges in ‘personalized genomic medicine’

    PubMed Central

    Juengst, Eric T; Settersten, Richard A; Fishman, Jennifer R; McGowan, Michelle L

    2013-01-01

    Personalized genomic medicine (PGM) is a goal that currently unites a wide array of biomedical initiatives, and is promoted as a ‘new paradigm for healthcare’ by its champions. Its promissory virtues include individualized diagnosis and risk prediction, more effective prevention and health promotion, and patient empowerment. Beyond overcoming scientific and technological hurdles to realizing PGM, proponents may interpret and rank these promises differently, which carries ethical and social implications for the realization of PGM as an approach to healthcare. We examine competing visions of PGM’s virtues and the directions in which they could take the field, in order to anticipate policy choices that may lie ahead for researchers, healthcare providers and the public. PMID:23662108

  13. [Application of bioinformatics in researches of industrial biocatalysis].

    PubMed

    Yu, Hui-Min; Luo, Hui; Shi, Yue; Sun, Xu-Dong; Shen, Zhong-Yao

    2004-05-01

    Industrial biocatalysis is currently attracting much attention as a way to rebuild or replace traditional production processes for chemicals and drugs. One of the key focuses in industrial biocatalysis is the biocatalyst, which is usually a microbial enzyme. Recently, new bioinformatics technologies have played, and will continue to play, increasingly significant roles in industrial biocatalysis research in response to the waves of the genomic revolution. One key application of bioinformatics in biocatalysis is the discovery and identification of new biocatalysts through advanced DNA and protein sequence search, comparison and analysis in Internet databases using different algorithms and software. Unknown genes of microbial enzymes can also be readily obtained by primer design on the basis of bioinformatics analyses. The other key application of bioinformatics in biocatalysis is the modification and improvement of existing industrial biocatalysts. In this respect, bioinformatics is of great importance in both rational design and directed evolution of microbial enzymes. When based on successful prediction of the tertiary structures of enzymes using bioinformatics tools, subsequent experiments, i.e. site-directed mutagenesis, fusion protein construction, DNA family shuffling and saturation mutagenesis, etc., are usually highly efficient. In all, bioinformatics will be an essential tool for both biologists and biological engineers in future research on industrial biocatalysis, owing to its significant role in guiding and accelerating the discovery and/or improvement of novel biocatalysts.

  14. Sequencing technologies and genome sequencing.

    PubMed

    Pareek, Chandra Shekhar; Smoczynski, Rafal; Tretyn, Andrzej

    2011-11-01

    The high-throughput next generation sequencing (HT-NGS) technologies are currently the hottest topic in human and animal genomics research; they can produce over 100 times more data than the most sophisticated capillary sequencers based on the Sanger method. With the ongoing development of high-throughput sequencing machines and the advancement of modern bioinformatics tools at an unprecedented pace, the goal of sequencing individual genomes of living organisms at a cost of $1,000 each seems realistically feasible in the near future. In the relatively short time frame since 2005, the HT-NGS technologies have been revolutionizing human and animal genome research through analysis of chromatin immunoprecipitation coupled to DNA microarray (ChIP-chip) or sequencing (ChIP-seq), RNA sequencing (RNA-seq), whole genome genotyping, genome-wide structural variation, de novo assembly and re-assembly of genomes, mutation detection and carrier screening, detection of inherited disorders and complex human diseases, DNA library preparation, paired ends and genomic captures, sequencing of mitochondrial genomes and personal genomics. In this review, we address the important features of HT-NGS, including first generation DNA sequencers, the birth of HT-NGS, second generation HT-NGS platforms, third generation HT-NGS platforms (including single molecule Heliscope™, SMRT™ and RNAP sequencers, and Nanopore), the Archon Genomics X PRIZE foundation, comparison of second and third generation HT-NGS platforms, and applications, advances and future perspectives of sequencing technologies in human and animal genome research.

  15. A genome-wide linkage study of individuals with high scores on NEO personality traits.

    PubMed

    Amin, N; Schuur, M; Gusareva, E S; Isaacs, A; Aulchenko, Y S; Kirichenko, A V; Zorkoltseva, I V; Axenovich, T I; Oostra, B A; Janssens, A C J W; van Duijn, C M

    2012-10-01

    The NEO-Five-Factor Inventory divides human personality traits into five dimensions: neuroticism, extraversion, openness, conscientiousness and agreeableness. In this study, we sought to identify regions harboring genes with large effects on the five NEO personality traits by performing genome-wide linkage analysis of individuals scoring in the extremes of these traits (>90th percentile). Affected-only linkage analysis was performed using an Illumina 6K linkage array in a family-based study, the Erasmus Rucphen Family study. We subsequently determined whether distinct, segregating haplotypes found with linkage analysis were associated with the trait of interest in the population. Finally, a dense single-nucleotide polymorphism genotyping array (Illumina 318K) was used to search for copy number variations (CNVs) in the associated regions. In the families with extreme phenotype scores, we found significant evidence of linkage for conscientiousness to 20p13 (rs1434789, log of odds (LOD)=5.86) and suggestive evidence of linkage (LOD >2.8) for neuroticism to 19q, 21q and 22q, extraversion to 1p, 1q, 9p and 12q, openness to 12q and 19q, and agreeableness to 2p, 6q, 17q and 21q. Further analysis determined haplotypes in 21q22 for neuroticism (P-values = 0.009, 0.007), in 17q24 for agreeableness (marginal P-value = 0.018) and in 20p13 for conscientiousness (marginal P-values = 0.058, 0.038) segregating in families with large contributions to the LOD scores. No evidence for CNVs in any of the associated regions was found. Our findings imply that there may be genes with relatively large effects involved in personality traits, which may be identified with next-generation sequencing techniques.

  16. Genome-wide association analysis of eating disorder-related symptoms, behaviors, and personality traits.

    PubMed

    Boraska, Vesna; Davis, Oliver S P; Cherkas, Lynn F; Helder, Sietske G; Harris, Juliette; Krug, Isabel; Liao, Thomas Pei-Chi; Treasure, Janet; Ntalla, Ioanna; Karhunen, Leila; Keski-Rahkonen, Anna; Christakopoulou, Danai; Raevuori, Anu; Shin, So-Youn; Dedoussis, George V; Kaprio, Jaakko; Soranzo, Nicole; Spector, Tim D; Collier, David A; Zeggini, Eleftheria

    2012-10-01

    Eating disorders (EDs) are common, complex psychiatric disorders thought to be caused by both genetic and environmental factors. They share many symptoms, behaviors, and personality traits, which may have overlapping heritability. The aim of the present study is to perform a genome-wide association scan (GWAS) of six ED phenotypes comprising three symptom traits from the Eating Disorders Inventory 2 [Drive for Thinness (DT), Body Dissatisfaction (BD), and Bulimia], Weight Fluctuation symptom, Breakfast Skipping behavior and Childhood Obsessive-Compulsive Personality Disorder trait (CHIRP). Investigated traits were derived from standardized self-report questionnaires completed by the TwinsUK population-based cohort. We tested 283,744 directly typed SNPs across six phenotypes of interest in the TwinsUK discovery dataset and followed up signals from various strata using a two-stage replication strategy in two independent cohorts of European ancestry. We meta-analyzed a total of 2,698 individuals for DT, 2,680 for BD, 2,789 (821 cases/1,968 controls) for Bulimia, 1,360 (633 cases/727 controls) for Childhood Obsessive-Compulsive Personality Disorder trait, 2,773 (761 cases/2,012 controls) for Breakfast Skipping, and 2,967 (798 cases/2,169 controls) for Weight Fluctuation symptom. In this GWAS analysis of six ED-related phenotypes, we detected association of eight genetic variants with P < 10⁻⁵. Genetic variants that showed suggestive evidence of association were previously associated with several psychiatric disorders and ED-related phenotypes. Our study indicates that larger-scale collaborative studies will be needed to achieve the necessary power to detect loci underlying ED-related traits.

  17. Bioinformatics-Aided Venomics

    PubMed Central

    Kaas, Quentin; Craik, David J.

    2015-01-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future. PMID:26110505

  18. The patient as person in an increasingly gene-centric universe: how healthcare professionals should think about genomics and evolution.

    PubMed

    Jackson, Timothy P

    2009-02-15

    In the past, the primary threat to the patient as person was a medical utilitarianism that would sacrifice the individual for the collective, that would coercively (ab)use a person for the sake of an in-group's health or happiness. Today, the threat is not only from vainglorious social groups but also from valorized genes and genomes. An over-valuation of genes risks making persons seem epiphenomenal. A central thesis of this article is that religious healthcare professionals have unique resources to combat this.

  19. THE PATIENT AS PERSON IN AN INCREASINGLY GENE-CENTRIC UNIVERSE: HOW HEALTHCARE PROFESSIONALS SHOULD THINK ABOUT GENOMICS AND EVOLUTION

    PubMed Central

    Jackson, Timothy P.

    2009-01-01

    In the past, the primary threat to the patient as person was a medical utilitarianism that would sacrifice the individual for the collective, that would coercively (ab)use a person for the sake of an in-group’s health or happiness. Today, the threat is not only from vainglorious social groups but also from valorized genes and genomes. An over-valuation of genes risks making persons seem epiphenomenal. A central thesis of this paper is that religious healthcare professionals have unique resources to combat this. PMID:19170083

  20. Read-mapping using personalized diploid reference genome for RNA sequencing data reduced bias for detecting allele-specific expression

    PubMed Central

    Yuan, Shuai; Qin, Zhaohui

    2014-01-01

    Next generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem when it comes to analyzing NGS data is mapping short sequencing reads back to the reference genome. Most existing software packages rely on a single uniform reference genome and do not automatically take genetic variants into consideration. On the other hand, large proportions of incorrectly mapped reads affect the correct interpretation of the NGS experimental results. As an example, Degner et al. showed that detecting allele-specific expression from RNA sequencing data was biased toward the reference allele. In this study, we developed a method that utilizes the parallel computing power of a DirectX 11 enabled graphics processing unit (GPU) to produce a personalized diploid reference genome based on all known genetic variants of that particular individual. We show that using such a personalized diploid reference genome can improve mapping accuracy and significantly reduce the bias toward the reference allele in allele-specific expression analysis. Our method can be applied to any individual that has genotype information obtained either from array-based genotyping or resequencing. Besides the reference genome, no additional changes to the alignment algorithm are needed for performing read mapping; therefore, one can utilize any of the existing read mapping tools and achieve the improved read mapping result. C++ and GPU compute shader source code of the software program is available at: http://code.google.com/p/diploid-mapping/downloads/list. PMID:25621316
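
    The idea of a personalized diploid reference can be illustrated with a minimal CPU-only sketch. It is not the authors' GPU implementation: it assumes phased single-nucleotide variants only, and the `Variant` tuple layout and the function name `build_diploid_reference` are hypothetical names chosen for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical variant record (an assumption for this sketch):
# (0-based position, allele on haplotype 1, allele on haplotype 2)
Variant = Tuple[int, str, str]


def build_diploid_reference(reference: str, variants: List[Variant]) -> Dict[str, str]:
    """Return two haplotype sequences with the given SNV alleles substituted in."""
    hap1 = list(reference)
    hap2 = list(reference)
    for pos, allele1, allele2 in variants:
        if not 0 <= pos < len(reference):
            raise ValueError(f"variant position {pos} is outside the reference")
        # Each haplotype receives its own allele at the variant site.
        hap1[pos] = allele1
        hap2[pos] = allele2
    return {"hap1": "".join(hap1), "hap2": "".join(hap2)}


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    # One heterozygous site (position 2) and one homozygous-alternate site (position 7).
    snvs = [(2, "G", "T"), (7, "C", "C")]
    diploid = build_diploid_reference(ref, snvs)
    print(diploid["hap1"])
    print(diploid["hap2"])
```

    Both haplotype sequences could then be handed to an unmodified read mapper as the reference, which is the property the abstract emphasizes. Insertions, deletions, and unphased genotypes would need additional handling that this sketch omits.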

  1. Read-mapping using personalized diploid reference genome for RNA sequencing data reduced bias for detecting allele-specific expression.

    PubMed

    Yuan, Shuai; Qin, Zhaohui

    2012-10-01

    Next generation sequencing (NGS) technologies have been applied extensively in many areas of genetics and genomics research. A fundamental problem when it comes to analyzing NGS data is mapping short sequencing reads back to the reference genome. Most existing software packages rely on a single uniform reference genome and do not automatically take genetic variants into consideration. On the other hand, large proportions of incorrectly mapped reads affect the correct interpretation of the NGS experimental results. As an example, Degner et al. showed that detecting allele-specific expression from RNA sequencing data was biased toward the reference allele. In this study, we developed a method that utilizes the parallel computing power of a DirectX 11 enabled graphics processing unit (GPU) to produce a personalized diploid reference genome based on all known genetic variants of that particular individual. We show that using such a personalized diploid reference genome can improve mapping accuracy and significantly reduce the bias toward the reference allele in allele-specific expression analysis. Our method can be applied to any individual that has genotype information obtained either from array-based genotyping or resequencing. Besides the reference genome, no additional changes to the alignment algorithm are needed for performing read mapping; therefore, one can utilize any of the existing read mapping tools and achieve the improved read mapping result. C++ and GPU compute shader source code of the software program is available at: http://code.google.com/p/diploid-mapping/downloads/list.

  2. A genome-wide association study of Cloninger's temperament scales: implications for the evolutionary genetics of personality.

    PubMed

    Verweij, Karin J H; Zietsch, Brendan P; Medland, Sarah E; Gordon, Scott D; Benyamin, Beben; Nyholt, Dale R; McEvoy, Brian P; Sullivan, Patrick F; Heath, Andrew C; Madden, Pamela A F; Henders, Anjali K; Montgomery, Grant W; Martin, Nicholas G; Wray, Naomi R

    2010-10-01

    Variation in personality traits is 30-60% attributed to genetic influences. Attempts to unravel these genetic influences at the molecular level have, so far, been inconclusive. We performed the first genome-wide association study of Cloninger's temperament scales in a sample of 5117 individuals, in order to identify common genetic variants underlying variation in personality. Participants' scores on Harm Avoidance, Novelty Seeking, Reward Dependence, and Persistence were tested for association with 1,252,387 genetic markers. We also performed gene-based association tests and biological pathway analyses. No genetic variants that significantly contribute to personality variation were identified, while our sample provides over 90% power to detect variants that explain only 1% of the trait variance. This indicates that individual common genetic variants of this size or greater do not contribute to personality trait variation, which has important implications regarding the genetic architecture of personality and the evolutionary mechanisms by which heritable variation is maintained.
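
    The power statement in this abstract can be sanity-checked with a rough calculation. The sketch below is not the authors' method: it assumes a 1-degree-of-freedom test on a quantitative trait, the textbook non-centrality approximation N*q/(1-q) for a variant explaining a fraction q of trait variance, and a Bonferroni threshold over the number of markers. The function name `gwas_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def gwas_power(n: int, variance_explained: float, alpha: float) -> float:
    """Approximate power of a two-sided 1-df association test for a quantitative trait."""
    std_normal = NormalDist()
    # Non-centrality parameter of the chi-square statistic (textbook approximation).
    ncp = n * variance_explained / (1.0 - variance_explained)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = sqrt(ncp)
    # Probability that |Z + shift| exceeds the critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    n_individuals = 5117       # sample size reported in the abstract
    n_markers = 1_252_387      # markers tested, as reported in the abstract
    alpha = 0.05 / n_markers   # assumption: Bonferroni-corrected threshold
    print(f"{gwas_power(n_individuals, 0.01, alpha):.3f}")
```

    Under these assumptions the estimate comes out above 0.9 for a variant explaining 1% of the variance, in line with the figure quoted in the abstract. A different significance threshold or test would shift the number.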

  3. Integration of bioinformatics into an undergraduate biology curriculum and the impact on development of mathematical skills.

    PubMed

    Wightman, Bruce; Hark, Amy T

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative.

  4. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    ERIC Educational Resources Information Center

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  5. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    ERIC Educational Resources Information Center

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  6. Visualizing and Sharing Results in Bioinformatics Projects: GBrowse and GenBank Exports

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Effective tools for presenting and sharing data are necessary for collaborative projects, typical for bioinformatics. In order to facilitate sharing our data with other genomics, molecular biology, and bioinformatics researchers, we have developed software to export our data to GenBank and combined ...

  7. Virtual Bioinformatics Distance Learning Suite

    ERIC Educational Resources Information Center

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  8. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    ERIC Educational Resources Information Center

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  9. Bioinformatics of cardiovascular miRNA biology.

    PubMed

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'.
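
    As one concrete illustration of functional target prediction, the sketch below scans a 3' UTR for sites complementary to the miRNA seed (nucleotides 2-8). Seed matching is a common first filter and is swapped in here as an example; the abstract does not say which rule the reviewed tools apply. The function names and both sequences are illustrative assumptions, not data from the article.

```python
from typing import List

# RNA base-pairing table (Watson-Crick only; G:U wobble pairs are ignored in this sketch).
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def seed_match_sites(mirna: str, utr: str) -> List[int]:
    """0-based start positions in `utr` that pair perfectly with miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    site = reverse_complement(seed)
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        # Continue one base further so overlapping sites are also reported.
        start = utr.find(site, start + 1)
    return hits


if __name__ == "__main__":
    toy_mirna = "UAGCUUAUCAGACUGAUGUUGA"  # illustrative 22-nt sequence
    toy_utr = "GGAUAAGCUCCAUAAGCUAA"      # illustrative UTR fragment
    print(seed_match_sites(toy_mirna, toy_utr))
```

    A perfect seed match is neither necessary nor sufficient for regulation, which fits the abstract's point that predictions have to be coupled with experimental validation.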

  10. Psychological and behavioural impact of returning personal results from whole-genome sequencing: the HealthSeq project

    PubMed Central

    Sanderson, Saskia C; Linderman, Michael D; Suckiel, Sabrina A; Zinberg, Randi; Wasserstein, Melissa; Kasarskis, Andrew; Diaz, George A; Schadt, Eric E

    2017-01-01

    Providing ostensibly healthy individuals with personal results from whole-genome sequencing could lead to improved health and well-being via enhanced disease risk prediction, prevention, and diagnosis, but also poses practical and ethical challenges. Understanding how individuals react psychologically and behaviourally will be key in assessing the potential utility of personal whole-genome sequencing. We conducted an exploratory longitudinal cohort study in which quantitative surveys and in-depth qualitative interviews were conducted before and after personal results were returned to individuals who underwent whole-genome sequencing. The participants were offered a range of interpreted results, including Alzheimer's disease, type 2 diabetes, pharmacogenomics, rare disease-associated variants, and ancestry. They were also offered their raw data. Of the 35 participants at baseline, 29 (82.9%) completed the 6-month follow-up. In the quantitative surveys, test-related distress was low, although it was higher at 1-week than 6-month follow-up (Z=2.68, P=0.007). In the 6-month qualitative interviews, most participants felt happy or relieved about their results. A few were concerned, particularly about rare disease-associated variants and Alzheimer's disease results. Two of the 29 participants had sought clinical follow-up as a direct or indirect consequence of rare disease-associated variants results. Several had mentioned their results to their doctors. Some participants felt having their raw data might be medically useful to them in the future. The majority reported positive reactions to having their genomes sequenced, but there were notable exceptions to this. The impact and value of returning personal results from whole-genome sequencing when implemented on a larger scale remains to be seen. PMID:28051073

  11. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    PubMed

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes including the behavioral-based profiles will contribute to the transition from the disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors are needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talks among pathways in various brain regions involved in disorders such as Alzheimer's disease.

  12. Meta-analysis of Genome-Wide Association Studies for Extraversion: Findings from the Genetics of Personality Consortium.

    PubMed

    van den Berg, Stéphanie M; de Moor, Marleen H M; Verweij, Karin J H; Krueger, Robert F; Luciano, Michelle; Arias Vasquez, Alejandro; Matteson, Lindsay K; Derringer, Jaime; Esko, Tõnu; Amin, Najaf; Gordon, Scott D; Hansell, Narelle K; Hart, Amy B; Seppälä, Ilkka; Huffman, Jennifer E; Konte, Bettina; Lahti, Jari; Lee, Minyoung; Miller, Mike; Nutile, Teresa; Tanaka, Toshiko; Teumer, Alexander; Viktorin, Alexander; Wedenoja, Juho; Abdellaoui, Abdel; Abecasis, Goncalo R; Adkins, Daniel E; Agrawal, Arpana; Allik, Jüri; Appel, Katja; Bigdeli, Timothy B; Busonero, Fabio; Campbell, Harry; Costa, Paul T; Smith, George Davey; Davies, Gail; de Wit, Harriet; Ding, Jun; Engelhardt, Barbara E; Eriksson, Johan G; Fedko, Iryna O; Ferrucci, Luigi; Franke, Barbara; Giegling, Ina; Grucza, Richard; Hartmann, Annette M; Heath, Andrew C; Heinonen, Kati; Henders, Anjali K; Homuth, Georg; Hottenga, Jouke-Jan; Iacono, William G; Janzing, Joost; Jokela, Markus; Karlsson, Robert; Kemp, John P; Kirkpatrick, Matthew G; Latvala, Antti; Lehtimäki, Terho; Liewald, David C; Madden, Pamela A F; Magri, Chiara; Magnusson, Patrik K E; Marten, Jonathan; Maschio, Andrea; Mbarek, Hamdi; Medland, Sarah E; Mihailov, Evelin; Milaneschi, Yuri; Montgomery, Grant W; Nauck, Matthias; Nivard, Michel G; Ouwens, Klaasjan G; Palotie, Aarno; Pettersson, Erik; Polasek, Ozren; Qian, Yong; Pulkki-Råback, Laura; Raitakari, Olli T; Realo, Anu; Rose, Richard J; Ruggiero, Daniela; Schmidt, Carsten O; Slutske, Wendy S; Sorice, Rossella; Starr, John M; St Pourcain, Beate; Sutin, Angelina R; Timpson, Nicholas J; Trochet, Holly; Vermeulen, Sita; Vuoksimaa, Eero; Widen, Elisabeth; Wouda, Jasper; Wright, Margaret J; Zgaga, Lina; Porteous, David; Minelli, Alessandra; Palmer, Abraham A; Rujescu, Dan; Ciullo, Marina; Hayward, Caroline; Rudan, Igor; Metspalu, Andres; Kaprio, Jaakko; Deary, Ian J; Räikkönen, Katri; Wilson, James F; Keltikangas-Järvinen, Liisa; Bierut, Laura J; Hettema, John M; Grabe, Hans J; Penninx, Brenda W J H; van Duijn, Cornelia M; Evans, David M; Schlessinger, David; Pedersen, Nancy L; Terracciano, Antonio; McGue, Matt; Martin, Nicholas G; Boomsma, Dorret I

    2016-03-01

    Extraversion is a relatively stable and heritable personality trait associated with numerous psychosocial, lifestyle and health outcomes. Despite its substantial heritability, no genetic variants have been detected in previous genome-wide association (GWA) studies, which may be due to relatively small sample sizes of those studies. Here, we report on a large meta-analysis of GWA studies for extraversion in 63,030 subjects in 29 cohorts. Extraversion item data from multiple personality inventories were harmonized across inventories and cohorts. No genome-wide significant associations were found at the single nucleotide polymorphism (SNP) level but there was one significant hit at the gene level for a long non-coding RNA site (LOC101928162). Genome-wide complex trait analysis in two large cohorts showed that the additive variance explained by common SNPs was not significantly different from zero, but polygenic risk scores, weighted using linkage information, significantly predicted extraversion scores in an independent cohort. These results show that extraversion is a highly polygenic personality trait, with an architecture possibly different from other complex human traits, including other personality traits. Future studies are required to further determine which genetic variants, by what modes of gene action, constitute the heritable nature of extraversion.
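
    The core arithmetic of a polygenic score is a weighted sum of allele dosages. The sketch below shows only that step: it does not implement the linkage-informed weighting the abstract mentions, and the function name `polygenic_score`, the SNP identifiers, and the weights are hypothetical.

```python
from typing import Dict


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages over SNPs present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)


if __name__ == "__main__":
    # Hypothetical per-SNP effect sizes, e.g. taken from a discovery meta-analysis.
    weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}
    # Effect-allele counts (0, 1 or 2) for one individual in a target cohort.
    dosages = {"rs_a": 2, "rs_b": 1, "rs_d": 0}
    print(polygenic_score(dosages, weights))
```

    Scores computed this way for each person in an independent cohort can then be regressed against the measured trait, which is the kind of prediction test the abstract reports.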

  13. Bioinformatics in high school biology curricula: a study of state science standards.

    PubMed

    Wefer, Stephen H; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students.

  14. Advancing Pharmacogenomics Education in the Core PharmD Curriculum through Student Personal Genomic Testing

    PubMed Central

    Adams, Solomon M.; Anderson, Kacey B.; Coons, James C.; Smith, Randall B.; Meyer, Susan M.; Parker, Lisa S.

    2016-01-01

    Objective. To develop, implement, and evaluate “Test2Learn,” a program to enhance pharmacogenomics education through the use of personal genomic testing (PGT) and real genetic data. Design. One hundred twenty-two second-year doctor of pharmacy (PharmD) students in a required course were offered PGT as part of a larger program approach to teach pharmacogenomics within a robust ethical framework. The program added novel learning objectives, lecture materials, analysis tools, and exercises using individual-level and population-level genetic data. Outcomes were assessed with objective measures and pre/post survey instruments. Assessment. One hundred students (82%) underwent PGT. Knowledge significantly improved on multiple assessments. Genotyped students reported a greater increase in confidence in understanding test results by the end of the course. Similarly, undergoing PGT improved students’ self-perceived ability to empathize with patients compared to those not genotyped. Most students (71%) reported feeling PGT was an important part of the course, and 60% reported they had a better understanding of pharmacogenomics specifically because of the opportunity. Conclusion. Implementation of PGT in the core pharmacy curriculum was feasible, well-received, and enhanced student learning of pharmacogenomics. PMID:26941429

  15. Erosion of Conserved Binding Sites in Personal Genomes Points to Medical Histories

    PubMed Central

    Guturu, Harendra; Chinchali, Sandeep; Clarke, Shoa L.; Bejerano, Gill

    2016-01-01

    Although many human diseases have a genetic component involving many loci, the majority of studies are statistically underpowered to isolate the many contributing variants, raising the question of the existence of alternate processes to identify disease mutations. To address this question, we collect ancestral transcription factor binding sites disrupted by an individual’s variants and then look for their most significant congregation next to a group of functionally related genes. Strikingly, when the method is applied to five different full human genomes, the top enriched function for each is invariably reflective of their very different medical histories. For example, our method implicates “abnormal cardiac output” for a patient with a longstanding family history of heart disease, “decreased circulating sodium level” for an individual with hypertension, and other biologically appealing links for medical histories spanning narcolepsy to axonal neuropathy. Our results suggest that erosion of gene regulation by mutation load significantly contributes to observed heritable phenotypes that manifest in the medical history. The test we developed exposes a hitherto hidden layer of personal variants that promise to shed new light on human disease penetrance, expressivity and the sensitivity with which we can detect them. PMID:26845687
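
    The notion of a "significant congregation" next to functionally related genes can be illustrated with a generic over-representation test. The hypergeometric tail below is a stand-in chosen for this sketch and is not necessarily the statistic used in the paper; the function name `hypergeom_enrichment_p` and the toy counts are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(total_genes: int, annotated_genes: int,
                           hit_genes: int, annotated_hits: int) -> float:
    """P(X >= annotated_hits) when hit_genes are drawn without replacement."""
    upper = min(annotated_genes, hit_genes)
    denominator = comb(total_genes, hit_genes)
    # Sum the probability of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(annotated_genes, k) * comb(total_genes - annotated_genes, hit_genes - k)
        for k in range(annotated_hits, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Toy numbers: 20 genes in total, 5 carry a given functional annotation,
    # 4 genes lie next to disrupted binding sites, and 3 of those 4 are annotated.
    print(hypergeom_enrichment_p(20, 5, 4, 3))
```

    Repeating such a test over many annotation terms and ranking the results would give a "top enriched function" per genome, subject to multiple-testing correction that this sketch leaves out.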

  16. Erosion of Conserved Binding Sites in Personal Genomes Points to Medical Histories.

    PubMed

    Guturu, Harendra; Chinchali, Sandeep; Clarke, Shoa L; Bejerano, Gill

    2016-02-01

    Although many human diseases have a genetic component involving many loci, the majority of studies are statistically underpowered to isolate the many contributing variants, raising the question of the existence of alternate processes to identify disease mutations. To address this question, we collect ancestral transcription factor binding sites disrupted by an individual's variants and then look for their most significant congregation next to a group of functionally related genes. Strikingly, when the method is applied to five different full human genomes, the top enriched function for each is invariably reflective of their very different medical histories. For example, our method implicates "abnormal cardiac output" for a patient with a longstanding family history of heart disease, "decreased circulating sodium level" for an individual with hypertension, and other biologically appealing links for medical histories spanning narcolepsy to axonal neuropathy. Our results suggest that erosion of gene regulation by mutation load significantly contributes to observed heritable phenotypes that manifest in the medical history. The test we developed exposes a hitherto hidden layer of personal variants that promise to shed new light on human disease penetrance, expressivity and the sensitivity with which we can detect them.

  17. Genome-wide association study of the five-factor model of personality in young Korean women.

    PubMed

    Kim, Han-Na; Roh, Seung-Ju; Sung, Yeon Ah; Chung, Hye Won; Lee, Jong-Young; Cho, Juhee; Shin, Hocheol; Kim, Hyung-Lae

    2013-10-01

    Personality is a determinant of behavior and lifestyle associated with health and human diseases. Although personality is known to be a heritable trait, its polygenic nature has made the identification of genetic variants elusive. We performed a genome-wide association study on 1089 Korean women aged 18-40 years whose personality traits were measured with the Revised NEO Personality Inventory for the five-factor model of personality. To reduce environmental factors that may influence personality traits, this study was restricted to young adult women. In the discovery phase, we identified variants of PTPRD (protein tyrosine phosphatase, receptor type D) that associated this gene with the Openness domain. Other genes that were previously reported to be associated with neurological phenotypes were also associated with personality traits. In particular, DRD1 and OR1A2 were linked to Neuroticism, NKAIN2 with Extraversion, HTR5A with Openness and DRD3 with Agreeableness. Data from our replication study of 2090 subjects confirmed the association between OR1A2 and Neuroticism. We first identified and confirmed a novel region on OR1A2 associated with Neuroticism [corrected]. Candidate genes for psychiatric disorders were also enriched. These findings contribute to our understanding of the genetic architecture of personality traits and provide critical clues to the neurobiological mechanisms that influence them.

  18. Bioinformatics for cancer immunology and immunotherapy.

    PubMed

    Charoentong, Pornpimol; Angelova, Mihaela; Efremova, Mirjana; Gallasch, Ralf; Hackl, Hubert; Galon, Jerome; Trajanoski, Zlatko

    2012-11-01

    Recent mechanistic insights obtained from preclinical studies and the approval of the first immunotherapies have motivated an increasing number of academic investigators and pharmaceutical/biotech companies to further elucidate the role of immunity in tumor pathogenesis and to reconsider the role of immunotherapy. Additionally, technological advances (e.g., next-generation sequencing) are providing unprecedented opportunities to draw a comprehensive picture of the tumor genomics landscape and ultimately enable individualized treatment. However, the increasing complexity of the generated data and the plethora of bioinformatics methods and tools pose considerable challenges to both tumor immunologists and clinical oncologists. In this review, we describe current concepts and future challenges for the management and analysis of data for cancer immunology and immunotherapy. We first highlight publicly available databases with specific focus on cancer immunology including databases for somatic mutations and epitope databases. We then give an overview of the bioinformatics methods for the analysis of next-generation sequencing data (whole-genome and exome sequencing), epitope prediction tools as well as methods for integrative data analysis and network modeling. Mathematical models are powerful tools that can predict and explain important patterns in the genetic and clinical progression of cancer. Therefore, a survey of mathematical models for tumor evolution and tumor-immune cell interaction is included. Finally, we discuss future challenges for individualized immunotherapy and suggest how combined computational/experimental approaches can lead to new insights into the molecular mechanisms of cancer, improve diagnosis and prognosis of the disease, and pinpoint novel therapeutic targets.

  19. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  20. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    ERIC Educational Resources Information Center

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  1. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has not been explicitly embraced in either the data mining or the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
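
    The two tasks named in this abstract, finding repeated segments within one sequence and common segments between sequences, can be sketched with fixed-length substrings (k-mers). This is a deliberately simple stand-in: it assumes exact matches of a single length k, the function names are hypothetical, and index structures such as suffix trees or suffix arrays are the usual choice at genome scale.

```python
from collections import defaultdict
from typing import Dict, List, Set


def kmer_positions(seq: str, k: int) -> Dict[str, List[int]]:
    """Map every length-k substring of seq to its 0-based start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return dict(positions)


def repeated_segments(seq: str, k: int) -> Dict[str, List[int]]:
    """Intra-molecular similarity: k-mers occurring more than once in seq."""
    return {kmer: pos for kmer, pos in kmer_positions(seq, k).items() if len(pos) > 1}


def common_segments(seq_a: str, seq_b: str, k: int) -> Set[str]:
    """Inter-molecular similarity: k-mers present in both sequences."""
    return kmer_positions(seq_a, k).keys() & kmer_positions(seq_b, k).keys()


if __name__ == "__main__":
    print(repeated_segments("ACGTACGT", 4))
    print(common_segments("ACGTACGT", "TTACGG", 4))
```

    The same two functions work on any strings, which reflects the abstract's observation that sequence analysis is string mining applied to biological data.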

  2. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying biological sequences (DNA, RNA, and proteins) at the level of their linear structure. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or more sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the term “data-mining” is almost missing from the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis has introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mail [50], detecting plagiarism [22], and spotting duplications in software systems [14].

  3. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    PubMed Central

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  4. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    PubMed

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  5. The Information Technology Infrastructure for the Translational Genomics Core and the Partners Biobank at Partners Personalized Medicine

    PubMed Central

    Boutin, Natalie; Holzbach, Ana; Mahanta, Lisa; Aldama, Jackie; Cerretani, Xander; Embree, Kevin; Leon, Irene; Rathi, Neeta; Vickers, Matilde

    2016-01-01

    The Biobank and Translational Genomics core at Partners Personalized Medicine requires robust software and hardware. This Information Technology (IT) infrastructure enables the storage and transfer of large amounts of data, drives efficiencies in the laboratory, maintains data integrity from the time of consent to the time that genomic data is distributed for research, and enables the management of complex genetic data. Here, we describe the functional components of the research IT infrastructure at Partners Personalized Medicine and how they integrate with existing clinical and research systems, review some of the ways in which this IT infrastructure maintains data integrity and security, and discuss some of the challenges inherent to building and maintaining such infrastructure. PMID:26805892

  6. The Information Technology Infrastructure for the Translational Genomics Core and the Partners Biobank at Partners Personalized Medicine.

    PubMed

    Boutin, Natalie; Holzbach, Ana; Mahanta, Lisa; Aldama, Jackie; Cerretani, Xander; Embree, Kevin; Leon, Irene; Rathi, Neeta; Vickers, Matilde

    2016-01-21

    The Biobank and Translational Genomics core at Partners Personalized Medicine requires robust software and hardware. This Information Technology (IT) infrastructure enables the storage and transfer of large amounts of data, drives efficiencies in the laboratory, maintains data integrity from the time of consent to the time that genomic data is distributed for research, and enables the management of complex genetic data. Here, we describe the functional components of the research IT infrastructure at Partners Personalized Medicine and how they integrate with existing clinical and research systems, review some of the ways in which this IT infrastructure maintains data integrity and security, and discuss some of the challenges inherent to building and maintaining such infrastructure.

  7. Naturally selecting solutions: the use of genetic algorithms in bioinformatics.

    PubMed

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2013-01-01

    For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems, ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques to solve a variety of biological problems. Among the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real-world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics-based problems.
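
    The selection-driven search described in this abstract can be sketched as a toy genetic algorithm. Everything specific here is an assumption made for illustration and is not taken from the reviewed article: the task (evolving a DNA string toward a known target), truncation selection, one-point crossover, per-base mutation, and the names `fitness` and `evolve`.

```python
import random

ALPHABET = "ACGT"


def fitness(candidate, target):
    """Number of positions at which candidate matches target."""
    return sum(c == t for c, t in zip(candidate, target))


def evolve(target, pop_size=30, generations=50, mutation_rate=0.05, seed=0):
    """Toy GA; assumes len(target) >= 2 and pop_size >= 4.

    Returns the best individual and the best fitness seen per generation."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    n = len(target)
    population = ["".join(rng.choice(ALPHABET) for _ in range(n))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, target), reverse=True)
        history.append(fitness(population[0], target))
        parents = population[:pop_size // 2]   # truncation selection
        children = [population[0]]             # elitism: keep the current best
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = mother[:cut] + father[cut:]
            child = "".join(
                rng.choice(ALPHABET) if rng.random() < mutation_rate else base
                for base in child              # per-base mutation
            )
            children.append(child)
        population = children
    best = max(population, key=lambda s: fitness(s, target))
    history.append(fitness(best, target))
    return best, history


if __name__ == "__main__":
    best, history = evolve("ACGTACGTAC")
    print(best, history[0], history[-1])
```

    Because the best individual is copied unchanged into each new generation, the best fitness can never decrease from one generation to the next; real applications replace the toy fitness function with a problem-specific score.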

  8. Bioinformatic approaches to interrogating vitamin D receptor signaling.

    PubMed

    Campbell, Moray J

    2017-03-10

    Bioinformatics applies unbiased approaches to develop statistically robust insight into health and disease. At the global, or "20,000 foot", view, bioinformatic analyses of vitamin D receptor (NR1I1/VDR) signaling can measure where the VDR gene or protein exerts a genome-wide significant impact on biology; VDR is significantly implicated in bone biology and immune systems, but not in cancer. With a more VDR-centric, or "2000 foot", view, bioinformatic approaches can interrogate events downstream of VDR activity. Integrative approaches can combine VDR ChIP-Seq in cell systems where significant volumes of publicly available data exist. For example, VDR ChIP-Seq studies can be combined with genome-wide association studies to reveal significant associations to immune phenotypes. Similarly, VDR ChIP-Seq can be combined with data from The Cancer Genome Atlas (TCGA) to infer the impact of VDR target genes in cancer progression. Therefore, bioinformatic approaches can reveal what aspects of VDR downstream networks are significantly related to disease or phenotype.

  9. REVIEW-ARTICLE Bioinformatics: an overview and its applications.

    PubMed

    Diniz, W J S; Canduri, F

    2017-03-15

    Technological advancements in recent years have promoted marked progress in understanding the genetic basis of phenotypes. In line with these advances, genomics has shifted the paradigm of biological questions to a genome-wide scale, revealing an explosion of data and opening up many possibilities. On the other hand, the vast amount of information that has been generated points to the challenges that must be overcome for storage (Moore's law) and processing of biological information. In this context, bioinformatics and computational biology have sought to overcome such challenges. This review presents an overview of bioinformatics and its use in the analysis of biological data, exploring approaches, emerging methodologies, and tools that can give biological meaning to the data generated.

  10. Personalization.

    ERIC Educational Resources Information Center

    Shore, Rebecca Martin

    1996-01-01

    Describes how a typical high school in Huntington Beach, California, curbed disruptive student behavior by personalizing the school experience for "problem" students. Through mostly volunteer efforts, an adopt-a-kid program was initiated that matched kids' learning styles to adults' personality styles and resulted in fewer suspensions…

  11. Development of computations in bioscience and bioinformatics and its application: review of the Symposium of Computations in Bioinformatics and Bioscience (SCBB06).

    PubMed

    Deng, Youping; Ni, Jun; Zhang, Chaoyang

    2006-12-12

    The first Symposium of Computations in Bioinformatics and Bioscience (SCBB06) was held in Hangzhou, China on June 21-22, 2006. Twenty-six peer-reviewed papers were selected for publication in this special issue of BMC Bioinformatics. These papers cover a broad range of topics including bioinformatics theories, algorithms, applications and tool development. The main technical topics include gene expression analysis, sequence analysis, genome analysis, phylogenetic analysis, gene function prediction, molecular interaction and systems biology, genetics and population study, immune strategy, protein structure prediction and proteomics.

  12. Translational Bioinformatics Approaches to Drug Development

    PubMed Central

    Readhead, Ben; Dudley, Joel

    2013-01-01

    Significance: A majority of therapeutic interventions occur late in the pathological process, when treatment outcome can be less predictable and effective, highlighting the need for new precise and preventive therapeutic development strategies that consider genomic and environmental context. Translational bioinformatics is well positioned to contribute to the many challenges inherent in bridging this gap between our current reactive methods of healthcare delivery and the intent of precision medicine, particularly in the area of drug development, which forms the focus of this review. Recent Advances: A variety of powerful informatics methods for organizing and leveraging the vast wealth of molecular measurements available for a broad range of disease contexts have recently emerged. These include methods for data-driven disease classification, drug repositioning, identification of disease biomarkers, and the creation of disease network models, each with significant impacts on drug development approaches. Critical Issues: An important bottleneck in the application of bioinformatics methods in translational research is the lack of investigators who are versed in both biomedical domains and informatics. Efforts to nurture both sets of competencies within individuals and to increase interfield visibility will help to accelerate the adoption and increased application of bioinformatics in translational research. Future Directions: It is possible to construct predictive, multiscale network models of disease by integrating genotype, gene expression, clinical traits, and other multiscale measures using causal network inference methods. This can enable the identification of the “key drivers” of pathology, which may represent novel therapeutic targets or biomarker candidates that play a more direct role in the etiology of disease. PMID:24527359

  13. Bioinformatics Education—Perspectives and Challenges out of Africa

    PubMed Central

    Adebiyi, Ezekiel F.; Alzohairy, Ahmed M.; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J.; Panji, Sumir; Patterton, Hugh-G.

    2015-01-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  14. Motivations, concerns and preferences of personal genome sequencing research participants: Baseline findings from the HealthSeq project

    PubMed Central

    Sanderson, Saskia C; Linderman, Michael D; Suckiel, Sabrina A; Diaz, George A; Zinberg, Randi E; Ferryman, Kadija; Wasserstein, Melissa; Kasarskis, Andrew; Schadt, Eric E

    2016-01-01

    Whole exome/genome sequencing (WES/WGS) is increasingly offered to ostensibly healthy individuals. Understanding the motivations and concerns of research participants seeking out personal WGS and their preferences regarding return-of-results and data sharing will help optimize protocols for WES/WGS. Baseline interviews including both qualitative and quantitative components were conducted with research participants (n=35) in the HealthSeq project, a longitudinal cohort study of individuals receiving personal WGS results. Data sharing preferences were recorded during informed consent. In the qualitative interview component, the dominant motivations that emerged were obtaining personal disease risk information, satisfying curiosity, contributing to research, self-exploration and interest in ancestry, and the dominant concern was the potential psychological impact of the results. In the quantitative component, 57% endorsed concerns about privacy. Most wanted to receive all personal WGS results (94%) and their raw data (89%); a third (37%) consented to having their data shared to the Database of Genotypes and Phenotypes (dbGaP). Early adopters of personal WGS in the HealthSeq project express a variety of health- and non-health-related motivations. Almost all want all available findings, while also expressing concerns about the psychological impact and privacy of their results. PMID:26036856

  15. Motivations, concerns and preferences of personal genome sequencing research participants: Baseline findings from the HealthSeq project.

    PubMed

    Sanderson, Saskia C; Linderman, Michael D; Suckiel, Sabrina A; Diaz, George A; Zinberg, Randi E; Ferryman, Kadija; Wasserstein, Melissa; Kasarskis, Andrew; Schadt, Eric E

    2016-01-01

    Whole exome/genome sequencing (WES/WGS) is increasingly offered to ostensibly healthy individuals. Understanding the motivations and concerns of research participants seeking out personal WGS and their preferences regarding return-of-results and data sharing will help optimize protocols for WES/WGS. Baseline interviews including both qualitative and quantitative components were conducted with research participants (n=35) in the HealthSeq project, a longitudinal cohort study of individuals receiving personal WGS results. Data sharing preferences were recorded during informed consent. In the qualitative interview component, the dominant motivations that emerged were obtaining personal disease risk information, satisfying curiosity, contributing to research, self-exploration and interest in ancestry, and the dominant concern was the potential psychological impact of the results. In the quantitative component, 57% endorsed concerns about privacy. Most wanted to receive all personal WGS results (94%) and their raw data (89%); a third (37%) consented to having their data shared to the Database of Genotypes and Phenotypes (dbGaP). Early adopters of personal WGS in the HealthSeq project express a variety of health- and non-health-related motivations. Almost all want all available findings, while also expressing concerns about the psychological impact and privacy of their results.

  16. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    PubMed

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA.
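
    As a rough illustration of the access-control step this abstract mentions, the sketch below builds an OAuth 2.0 authorization request and checks the `state` value returned on the callback. The abstract does not say which grant type IABio uses, so the authorization code grant is assumed here; the endpoint, client identifier, redirect URI, scope name and function names are all hypothetical. Only the query parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`) come from the OAuth 2.0 specification.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            scope, state=None):
    """Return (url, state) for an authorization code request.

    The endpoint and all argument values are placeholders supplied by the
    caller; nothing here is taken from the IABio system itself."""
    if state is None:
        state = secrets.token_urlsafe(16)  # unguessable value tying request to callback
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


def state_matches(callback_url, expected_state):
    """Check that the state echoed back on the redirect is the one we sent."""
    returned = parse_qs(urlparse(callback_url).query).get("state", [""])[0]
    return secrets.compare_digest(returned, expected_state)


if __name__ == "__main__":
    url, state = build_authorization_url(
        "https://iabio.example/authorize",   # hypothetical endpoint
        "demo-client",
        "https://app.example/callback",
        "read:genotypes",                    # hypothetical scope name
    )
    print(url)
```

    Exchanging the returned code for an access token is a separate server-to-server request and is left out of this sketch.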

  17. Personalized Medicine and Genomics: Challenges and Opportunities in Assessing Effectiveness, Cost-Effectiveness, and Future Research Priorities

    PubMed Central

    Conti, Rena; Veenstra, David L.; Armstrong, Katrina; Lesko, Lawrence J.; Grosse, Scott D.

    2015-01-01

    Personalized medicine is health care that tailors interventions to individual variation in risk and treatment response. Although medicine has long strived to achieve this goal, advances in genomics promise to facilitate this process. Relevant to present-day practice is the use of genomic information to classify individuals according to disease susceptibility or expected responsiveness to a pharmacologic treatment and to provide targeted interventions. A symposium at the annual meeting of the Society for Medical Decision Making on 23 October 2007 highlighted the challenges and opportunities posed in translating advances in molecular medicine into clinical practice. A panel of US experts in medical practice, regulatory policy, technology assessment, and the financing and organization of medical innovation was asked to discuss the current state of practice and research on personalized medicine as it relates to their own field. This article reports on the issues raised, discusses potential approaches to meet these challenges, and proposes directions for future work. The case of genetic testing to inform dosing with warfarin, an anticoagulant, is used to illustrate differing perspectives on evidence and decision making for personalized medicine. PMID:20086232

  18. Group-based and personalized care in an age of genomic and evidence-based medicine: a reappraisal.

    PubMed

    Maglo, Koffi N

    2012-01-01

    This article addresses the philosophical and moral foundations of group-based and individualized therapy in connection with population care equality. The U.S. Food and Drug Administration (FDA) recently modified its public health policy by seeking to enhance the efficacy and equality of care through the approval of group-specific prescriptions and doses for some drugs. In the age of genomics, when individualization of care increasingly has become a major concern, investigating the relationship between population health, stratified medicine, and personalized therapy can improve our understanding of the ethical and biomedical implications of genomic medicine. I suggest that the need to optimize population health through population substructure-sensitive research and the need to individualize care through genetically targeted therapies are not necessarily incompatible. Accordingly, the article reconceptualizes a unified goal for modern scientific medicine in terms of individualized equal care.

  19. Balancing Benefits and Risks of Immortal Data: Participants’ Views of Open Consent in the Personal Genome Project

    PubMed Central

    Zarate, Oscar A.; Brody, Julia Green; Brown, Phil; Ramírez-Andreotta, Mónica D.; Perovich, Laura; Matz, Jacob

    2016-01-01

    The NIH Genomic Data Sharing Policy, effective in January 2015, encourages researchers to obtain broad consent to share data for unspecified biomedical research. The ethics of extensive data sharing depend in part on study participants’ understanding of the risks and benefits. Interviews with participants in the Personal Genome Project show that study participants can readily discuss the risks, including loss of privacy, and are willing to accept risks because they value the opportunity to contribute to health science. They have expansive views of the benefits for science, medicine, and their own health and curiosity. With justice in mind, further exploration is needed to evaluate consent for data sharing among more diverse and vulnerable populations. PMID:26678513

  20. Bioinformatics by Example: From Sequence to Target

    NASA Astrophysics Data System (ADS)

    Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

    2002-12-01

    With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience aims to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example, the reader will experience this fascinating new discipline.

  1. Rapid Bioinformatic Identification of Thermostabilizing Mutations

    PubMed Central

    Sauer, David B.; Karpowich, Nathan K.; Song, Jin Mei; Wang, Da-Neng

    2015-01-01

    Ex vivo stability is a valuable protein characteristic but is laborious to improve experimentally. In addition to biopharmaceutical and industrial applications, stable protein is important for biochemical and structural studies. Taking advantage of the large number of available genomic sequences and growth temperature data, we present two bioinformatic methods to identify a limited set of amino acids or positions that likely underlie thermostability. Because these methods allow thousands of homologs to be examined in silico, they have the advantage of providing both speed and statistical power. Using these methods, we introduced, via mutation, amino acids from thermoadapted homologs into an exemplar mesophilic membrane protein, and demonstrated significantly increased thermostability while preserving protein activity. PMID:26445442

  2. Evolving Strategies for the Incorporation of Bioinformatics Within the Undergraduate Cell Biology Curriculum

    PubMed Central

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489

  3. Bioinformatics in microbial biotechnology--a mini review.

    PubMed

    Bansal, Arvind K

    2005-06-28

    The revolutionary growth in computation speed and memory storage capability has fueled a new era in the analysis of biological data. Hundreds of microbial genomes and many eukaryotic genomes, including a cleaner draft of the human genome, have been sequenced, raising the expectation of better control of microorganisms. The goals are as lofty as the development of rational drugs and antimicrobial agents, development of new enhanced bacterial strains for bioremediation and pollution control, development of better and easier-to-administer vaccines, the development of protein biomarkers for various bacterial diseases, and better understanding of host-bacteria interaction to prevent bacterial infections. In the last decade the development of many new bioinformatics techniques and integrated databases has facilitated the realization of these goals. Current research in bioinformatics can be classified into: (i) genomics--sequencing and comparative study of genomes to identify gene and genome functionality, (ii) proteomics--identification and characterization of protein-related properties and reconstruction of metabolic and regulatory pathways, (iii) cell visualization and simulation to study and model cell behavior, and (iv) application to the development of drugs and anti-microbial agents. In this article, we will focus on the techniques and their limitations in genomics and proteomics. Bioinformatics research can be classified under three major approaches: (1) analysis based upon the available experimental wet-lab data, (2) the use of mathematical modeling to derive new information, and (3) an integrated approach that integrates search techniques with mathematical modeling. The major impact of bioinformatics research has been to automate the genome sequencing, automated development of integrated genomics and proteomics databases, automated genome comparisons to identify the genome function, automated derivation of metabolic pathways, gene expression analysis to derive

  4. Impact on schizotypal personality trait of a genome-wide supported psychosis variant of the ZNF804A gene.

    PubMed

    Yasuda, Yuka; Hashimoto, Ryota; Ohi, Kazutaka; Fukumoto, Motoyuki; Umeda-Yano, Satomi; Yamamori, Hidenaga; Okochi, Tomo; Iwase, Masao; Kazui, Hiroaki; Iwata, Nakao; Takeda, Masatoshi

    2011-05-20

    Schizophrenia is a complex disorder with a high heritability. Relatives with schizophrenia have an increased risk not only for schizophrenia but also for schizophrenia spectrum disorders, such as schizotypal personality disorder. A single nucleotide polymorphism (SNP), rs1344706, in the Zinc Finger Protein 804A (ZNF804A) gene, has been implicated in susceptibility to schizophrenia by several genome-wide association studies, follow-up association studies and meta-analyses. This SNP has been shown to affect neuronal connectivities and cognitive abilities. We investigated an association between the ZNF804A genotype of rs1344706 and schizotypal personality traits using the Schizotypal Personality Questionnaire (SPQ) in 176 healthy subjects. We also looked for specific associations among ZNF804A polymorphisms and the three factors of schizotypy (cognitive/perceptual, interpersonal and disorganization) assessed by the SPQ. The total score for the SPQ in carriers of the risk T allele was significantly higher than that in individuals with the G/G genotype (p=0.042). For the three factors derived from the SPQ, carriers with the risk T allele showed a higher disorganization factor (p=0.011), but there were no differences in the cognitive/perceptual or interpersonal factors between genotype groups (p>0.30). These results suggest that the genetic variation in ZNF804A might increase susceptibility not only for schizophrenia but also for schizotypal personality traits in healthy subjects.

  5. Reproducible Bioinformatics Research for Biologists

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  6. Bioinformatics and the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  7. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  8. Clinical Bioinformatics: challenges and opportunities

    PubMed Central

    2012-01-01

    Background: Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011, was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods: In this editorial, the viewpoints and opinions of three world CBI leaders, who were invited to participate in a panel discussion of the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools for efficient search computing solutions. Results: Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, and integrated search and retrieval of -omics information. Conclusions: Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput "-omics" technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research. PMID:23095472

  9. Computational intelligence techniques in bioinformatics.

    PubMed

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state-of-the-art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machine, deep belief network, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines, could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function prediction and its structure. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included.

  10. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa.

    PubMed

    Mulder, Nicola J; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-02-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet.

  11. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    PubMed Central

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  12. Bioinformatics and the allergy assessment of agricultural biotechnology products: industry practices and recommendations.

    PubMed

    Ladics, Gregory S; Cressman, Robert F; Herouet-Guicheney, Corinne; Herman, Rod A; Privalle, Laura; Song, Ping; Ward, Jason M; McClain, Scott

    2011-06-01

    Bioinformatic tools are being increasingly utilized to evaluate the degree of similarity between a novel protein and known allergens within the context of a larger allergy safety assessment process. Importantly, bioinformatics is not a predictive analysis that can determine if a novel protein will "become" an allergen, but rather a tool to assess whether the protein is a known allergen or is potentially cross-reactive with an existing allergen. Bioinformatic tools are key components of the 2009 Codex Alimentarius Commission's weight-of-evidence approach, which encompasses a variety of experimental approaches for an overall assessment of the allergenic potential of a novel protein. Bioinformatic search comparisons between novel protein sequences, as well as potential novel fusion sequences derived from the genome and transgene, and known allergens are required by all regulatory agencies that assess the safety of genetically modified (GM) products. The objective of this paper is to identify opportunities for consensus in the methods of applying bioinformatics and to outline differences that impact a consistent and reliable allergy safety assessment. The bioinformatic comparison process has some critical features, which are outlined in this paper. One of them is a curated, publicly available and well-managed database with known allergenic sequences. In this paper, the best practices, scientific value, and food safety implications of bioinformatic analyses, as they are applied to GM food crops, are discussed. Recommendations for conducting bioinformatic analysis on novel food proteins for potential cross-reactivity to known allergens are also put forth.

  13. Prioritization of anticancer drugs against a cancer using genomic features of cancer cells: A step towards personalized medicine

    PubMed Central

    Gupta, Sudheer; Chaudhary, Kumardeep; Kumar, Rahul; Gautam, Ankur; Nanda, Jagpreet Singh; Dhanda, Sandeep Kumar; Brahmachari, Samir Kumar; Raghava, Gajendra P. S.

    2016-01-01

    In this study, we investigated the drug profiles of 24 anticancer drugs tested against a large number of cell lines in order to understand the relation between drug resistance and altered genomic features of a cancer cell line. We detected frequent mutations, high expression and high copy number variations of certain genes in both drug-resistant and drug-sensitive cell lines. It was observed that a few drugs, like Panobinostat, are effective against almost all types of cell lines, whereas certain drugs are effective against only a limited range of cell lines. Tissue-specific preference of drugs was also seen, where a drug is more effective against cell lines belonging to a specific tissue. Genomic-feature-based models were developed for each anticancer drug and achieved average correlations between predicted and actual growth inhibition of cell lines in the range of 0.43 to 0.78. We hope our study will shed light on the field of personalized medicine, particularly on designing patient-specific anticancer drugs. In order to serve the scientific community, a webserver, CancerDP, has been developed for predicting the priority/potency of an anticancer drug against a cancer cell line using its genomic features (http://crdd.osdd.net/raghava/cancerdp/). PMID:27030518

  14. Robust enzyme design: bioinformatic tools for improved protein stability.

    PubMed

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.

  15. Systems Biology: The Next Frontier for Bioinformatics

    PubMed Central

    Likić, Vladimir A.; McConville, Malcolm J.; Lithgow, Trevor; Bacic, Antony

    2010-01-01

    Biochemical systems biology augments more traditional disciplines, such as genomics, biochemistry and molecular biology, by championing (i) mathematical and computational modeling; (ii) the application of traditional engineering practices in the analysis of biochemical systems; and in the past decade increasingly (iii) the use of near-comprehensive data sets derived from ‘omics platform technologies, in particular “downstream” technologies relative to genome sequencing, including transcriptomics, proteomics and metabolomics. The future progress in understanding biological principles will increasingly depend on the development of temporal and spatial analytical techniques that will provide high-resolution data for systems analyses. To date, particularly successful were strategies involving (a) quantitative measurements of cellular components at the mRNA, protein and metabolite levels, as well as in vivo metabolic reaction rates, (b) development of mathematical models that integrate biochemical knowledge with the information generated by high-throughput experiments, and (c) applications to microbial organisms. The inevitable role bioinformatics plays in modern systems biology puts mathematical and computational sciences as an equal partner to analytical and experimental biology. Furthermore, mathematical and computational models are expected to become increasingly prevalent representations of our knowledge about specific biochemical systems. PMID:21331364

  16. A Bioinformatics Facility for NASA

    NASA Technical Reports Server (NTRS)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  17. CAN I ACCESS MY PERSONAL GENOME? THE CURRENT LEGAL POSITION IN THE UK

    PubMed Central

    Kaye, Jane; Kanellopoulou, Nadja; Hawkins, Naomi; Gowans, Heather; Curren, Liam; Melham, Karen

    2014-01-01

    This paper discusses the nature of genomic information, and the moral arguments in support of an individual's right to access it. It analyses the legal avenues an individual might take to access their sequence information. The authors describe the policy implications in this area and conclude that, for now, the law appears to strike an appropriate balance, but new policy will need to be developed to address this issue. PMID:24136352

  18. Personalized Oncogenomics: Clinical Experience with Malignant Peritoneal Mesothelioma Using Whole Genome Sequencing

    PubMed Central

    Sheffield, Brandon S.; Tinker, Anna V.; Shen, Yaoqing; Hwang, Harry; Li-Chang, Hector H.; Pleasance, Erin; Ch’ng, Carolyn; Lum, Amy; Lorette, Julie; McConnell, Yarrow J.; Sun, Sophie; Jones, Steven J. M.; Gown, Allen M.; Huntsman, David G.; Schaeffer, David F.; Churg, Andrew; Yip, Stephen; Laskin, Janessa; Marra, Marco A.

    2015-01-01

    Peritoneal mesothelioma is a rare and sometimes lethal malignancy that presents a clinical challenge for both diagnosis and management. Recent studies have led to a better understanding of the molecular biology of peritoneal mesothelioma. Translation of the emerging data into better treatments and outcome is needed. From two patients with peritoneal mesothelioma, we derived whole genome sequences, RNA expression profiles, and targeted deep sequencing data. Molecular data were made available for translation into a clinical treatment plan. Treatment responses and outcomes were later examined in the context of molecular findings. Molecular studies presented here provide the first reported whole genome sequences of peritoneal mesothelioma. Mutations in known mesothelioma-related genes NF2, CDKN2A, LATS2, amongst others, were identified. Activation of MET-related signaling pathways was demonstrated in both cases. A hypermutated phenotype was observed in one case (434 vs. 18 single nucleotide variants) and was associated with a favourable outcome despite sarcomatoid histology and multifocal disease. This study represents the first report of whole genome analyses of peritoneal mesothelioma, a key step in the understanding and treatment of this disease. PMID:25798586

  19. Protein bioinformatics applied to virology.

    PubMed

    Mohabatkar, Hassan; Keyhanfar, Mehrnaz; Behbahani, Mandana

    2012-09-01

    Scientists have united in a common search to sequence, store and analyze genes and proteins. In this regard, rapidly evolving bioinformatics methods are providing valuable information on these newly discovered molecules. Understanding what has been done and what we can do in silico is essential in designing new experiments. The imbalance between sequence-known proteins and attribute-known proteins has called for the development of computational methods or high-throughput automated tools for quickly and reliably predicting or identifying various characteristics of uncharacterized proteins. Taking into consideration the role of viruses in causing diseases and their use in biotechnology, the present review describes the application of protein bioinformatics in virology. A number of important features of viral proteins, such as epitope prediction, protein docking, subcellular localization, viral protease cleavage sites and computer-based comparison of their aspects, are discussed. This paper also describes several tools developed principally for viral bioinformatics. Predicting viral protein features and following the advances in this field can help basic understanding of the relationship between a virus and its host.

  20. In the loop: promoter–enhancer interactions and bioinformatics

    PubMed Central

    Mora, Antonio; Sandve, Geir Kjetil; Gabrielsen, Odd Stokke

    2016-01-01

    Enhancer–promoter regulation is a fundamental mechanism underlying differential transcriptional regulation. Spatial chromatin organization brings remote enhancers in contact with target promoters in cis to regulate gene expression. There is considerable evidence for promoter–enhancer interactions (PEIs). In recent years, genome-wide analyses have identified signatures and mapped novel enhancers; however, precisely identifying their target gene(s) requires massive biological and bioinformatics efforts. In this review, we give a short overview of the chromatin landscape and transcriptional regulation. We discuss some key concepts and problems related to chromatin interaction detection technologies, and emerging knowledge from genome-wide chromatin interaction data sets. Then, we critically review different types of bioinformatics analysis methods and tools related to representation and visualization of PEI data, raw data processing and PEI prediction. Lastly, we provide specific examples of how PEIs have been used to elucidate a functional role of non-coding single-nucleotide polymorphisms. The topic is at the forefront of epigenetic research, and by highlighting some future bioinformatics challenges in the field, this review provides a comprehensive background for future PEI studies. PMID:26586731

  1. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

    NASA Astrophysics Data System (ADS)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil and is a gelatinase-producing bacterium whose gelatinase activity is species-specific to porcine and bovine gelatin. This bacterium therefore offers the possibility of producing enzymes that are specific to each of the two meat species. The main focus of this research was to identify the gelatinase-encoding gene in L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size with a 520-amino-acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size with a 591-amino-acid sequence; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size with a 566-amino-acid sequence. Three pairs of oligonucleotide primers, named F1/R1, F2/R2 and F3/R3, were designed to target short sequences of cDNA by PCR. The amplicons were reliably 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. The bioinformatics analysis of L. sphaericus therefore identified candidate genes encoding gelatinase.
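
    The sizes quoted above can be cross-checked with simple arithmetic: if each candidate is an open reading frame whose length includes one stop codon, the residue count should equal bp / 3 - 1. The sketch below applies that check to the three candidates. The reading of the identifier as NODE_<contig>_length_<contig bp>_cov_<coverage>_<gene index> and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import re

# Identifiers and sizes exactly as quoted in the abstract:
# name -> (identifier, ORF length in bp, amino acid count)
CANDIDATES = {
    "P1": ("NODE_71_length_93919_cov_158.931839_21", 1563, 520),
    "P2": ("NODE_23_length_52851_cov_190.061386_17", 1776, 591),
    "P3": ("NODE_106_length_32943_cov_169.147919_8", 1701, 566),
}

# Assumed layout (not confirmed by the source):
# NODE_<contig>_length_<contig bp>_cov_<coverage>_<gene index>
ID_PATTERN = re.compile(r"^NODE_(\d+)_length_(\d+)_cov_([\d.]+)_(\d+)$")


def parse_identifier(identifier):
    """Split an assembler-style identifier into its numeric fields."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    contig, length, coverage, gene = match.groups()
    return {
        "contig": int(contig),
        "contig_length": int(length),
        "coverage": float(coverage),
        "gene_index": int(gene),
    }


def expected_residues(orf_bp):
    """Residues encoded by an ORF whose length includes one stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3 - 1


def check_candidates(candidates):
    """Return, per candidate, whether bp and residue counts agree."""
    return {
        name: expected_residues(bp) == residues
        for name, (_, bp, residues) in candidates.items()
    }
```

    Under that assumption all three candidates are internally consistent (for example, 1563 / 3 - 1 = 520), which suggests the reported sizes refer to complete coding sequences including the stop codon.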

  2. Balancing Benefits and Risks of Immortal Data: Participants' Views of Open Consent in the Personal Genome Project.

    PubMed

    Zarate, Oscar A; Brody, Julia Green; Brown, Phil; Ramirez-Andreotta, Mónica D; Perovich, Laura; Matz, Jacob

    2016-01-01

    An individual's health, genetic, or environmental-exposure data, placed in an online repository, creates a valuable shared resource that can accelerate biomedical research and even open opportunities for crowd-sourcing discoveries by members of the public. But these data become "immortalized" in ways that may create lasting risk as well as benefit. Once shared on the Internet, the data are difficult or impossible to redact, and identities may be revealed by a process called data linkage, in which online data sets are matched to each other. Reidentification (re-ID), the process of associating an individual's name with data that were considered deidentified, poses risks such as insurance or employment discrimination, social stigma, and breach of the promises often made in informed-consent documents. At the same time, re-ID poses risks to researchers and indeed to the future of science, should re-ID end up undermining the trust and participation of potential research participants. The ethical challenges of online data sharing are heightened as so-called big data becomes an increasingly important research tool and driver of new research structures. Big data is shifting research to include large numbers of researchers and institutions as well as large numbers of participants providing diverse types of data, so the participants' consent relationship is no longer with a person or even a research institution. In addition, consent is further transformed because big data analysis often begins with descriptive inquiry and generation of a hypothesis, and the research questions cannot be clearly defined at the outset and may be unforeseeable over the long term. In this article, we consider how expanded data sharing poses new challenges, illustrated by genomics and the transition to new models of consent. 
    We draw on the experiences of participants in an open data platform, the Personal Genome Project, to allow study participants to contribute their voices to inform ethical consent

  3. BioWarehouse: a bioinformatics database warehouse toolkit

    PubMed Central

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

    2006-01-01

    Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. 
    Conclusion BioWarehouse embodies significant progress on the database integration problem for
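
    The kind of cross-database question described above (which enzyme activities have no sequence in any loaded source) becomes a single SQL query once the sources share one schema. The following is a minimal sketch of that idea using Python's built-in sqlite3 module. The two tables, their columns and the placeholder rows are hypothetical and are not the BioWarehouse schema, which targets MySQL and Oracle.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
        CREATE TABLE protein_sequence (
            accession TEXT PRIMARY KEY,
            ec_number TEXT,
            source_db TEXT
        );
        """
    )
    # Placeholder rows only; these are not real EC numbers or accessions.
    conn.executemany(
        "INSERT INTO enzyme_activity VALUES (?)",
        [("EC_DEMO_1",), ("EC_DEMO_2",), ("EC_DEMO_3",), ("EC_DEMO_4",)],
    )
    conn.executemany(
        "INSERT INTO protein_sequence VALUES (?, ?, ?)",
        [
            ("SEQ_DEMO_1", "EC_DEMO_1", "source_a"),
            ("SEQ_DEMO_2", "EC_DEMO_1", "source_b"),
            ("SEQ_DEMO_3", "EC_DEMO_2", "source_a"),
        ],
    )
    return conn


def fraction_without_sequence(conn):
    """Fraction of enzyme activities with no sequence in any source."""
    total, missing = conn.execute(
        """
        SELECT COUNT(DISTINCT e.ec_number),
               COUNT(DISTINCT CASE WHEN p.ec_number IS NULL
                                   THEN e.ec_number END)
        FROM enzyme_activity AS e
        LEFT JOIN protein_sequence AS p ON p.ec_number = e.ec_number
        """
    ).fetchone()
    return missing / total if total else 0.0
```

    With the placeholder rows, two of the four activities have no sequence, so the function returns 0.5. The 36% figure in the abstract comes from the authors' own warehouse and cannot be reproduced from this toy data.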

  4. A survey on evolutionary algorithm based hybrid intelligence in bioinformatics.

    PubMed

    Li, Shan; Kang, Liying; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction about the applications of hybrid intelligent methods, in particular those based on evolutionary algorithm, in bioinformatics. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks.

  5. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    PubMed Central

    Li, Shan; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction about the applications of hybrid intelligent methods, in particular those based on evolutionary algorithm, in bioinformatics. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks. PMID:24729969

  6. Use or abuse of bioinformatic tools: a response to Samach.

    PubMed

    Muñoz-Fambuena, Natalia; Mesejo, Carlos; González-Mas, María C; Primo-Millo, Eduardo; Agustí, Manuel; Iglesias, Domingo J

    2013-03-01

    In a recent paper, we described for the first time the effects of fruit on the expression of putative homologues of genes involved in flowering pathways. It was our aim to provide insight into the molecular mechanisms underlying alternate bearing in citrus. However, a bioinformatics-based critique of our and other related papers has been given by Samach in the preceding Viewpoint article in this issue of Annals of Botany. The use of certain bioinformatic tools in a context of structural rather than functional genomics can cast doubts about the veracity of a large amount of data published in recent years. In this response, the contentions raised by Samach are analysed, and rebuttals of his criticisms are presented.

  7. Challenges and disparities in the application of personalized genomic medicine to populations with African ancestry

    PubMed Central

    Kessler, Michael D.; Yerges-Armstrong, Laura; Taub, Margaret A.; Shetty, Amol C.; Maloney, Kristin; Jeng, Linda Jo Bone; Ruczinski, Ingo; Levin, Albert M.; Williams, L. Keoki; Beaty, Terri H.; Mathias, Rasika A.; Barnes, Kathleen C.; Boorgula, Meher Preethi; Campbell, Monica; Chavan, Sameer; Ford, Jean G.; Foster, Cassandra; Gao, Li; Hansel, Nadia N.; Horowitz, Edward; Huang, Lili; Ortiz, Romina; Potee, Joseph; Rafaels, Nicholas; Scott, Alan F.; Vergara, Candelaria; Gao, Jingjing; Hu, Yijuan; Johnston, Henry Richard; Qin, Zhaohui S.; Padhukasahasram, Badri; Dunston, Georgia M.; Faruque, Mezbah U.; Kenny, Eimear E.; Gietzen, Kimberly; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Deshpande, Aniket; Grus, Wendy E.; Locke, Devin P.; Foreman, Marilyn G.; Avila, Pedro C.; Grammer, Leslie; Kim, Kwang-YounA; Kumar, Rajesh; Schleimer, Robert; Bustamante, Carlos; De La Vega, Francisco M.; Gignoux, Chris R.; Shringarpure, Suyash S.; Musharoff, Shaila; Wojcik, Genevieve; Burchard, Esteban G.; Eng, Celeste; Gourraud, Pierre-Antoine; Hernandez, Ryan D.; Lizee, Antoine; Pino-Yanes, Maria; Torgerson, Dara G.; Szpiech, Zachary A.; Torres, Raul; Nicolae, Dan L.; Ober, Carole; Olopade, Christopher O.; Olopade, Olufunmilayo; Oluwole, Oluwafemi; Arinola, Ganiyu; Song, Wei; Abecasis, Goncalo; Correa, Adolfo; Musani, Solomon; Wilson, James G.; Lange, Leslie A.; Akey, Joshua; Bamshad, Michael; Chong, Jessica; Fu, Wenqing; Nickerson, Deborah; Reiner, Alexander; Hartert, Tina; Ware, Lorraine B.; Bleecker, Eugene; Meyers, Deborah; Ortega, Victor E.; Pissamai, Maul R. N.; Trevor, Maul R. N.; Watson, Harold; Araujo, Maria Ilma; Oliveira, Ricardo Riccio; Caraballo, Luis; Marrugo, Javier; Martinez, Beatriz; Meza, Catherine; Ayestas, Gerardo; Herrera-Paz, Edwin Francisco; Landaverde-Torres, Pamela; Erazo, Said Omar Leiva; Martinez, Rosella; Mayorga, Alvaro; Mayorga, Luis F.; Mejia-Mejia, Delmy-Aracely; Ramos, Hector; Saenz, Allan; Varela, Gloria; Vasquez, Olga Marina; Ferguson, Trevor; Knight-Madden, Jennifer; Samms-Vaughan, Maureen; Wilks, Rainford J.; Adegnika, Akim; Ateba-Ngoa, Ulysse; Yazdanbakhsh, Maria; O'Connor, Timothy D.

    2016-01-01

    To characterize the extent and impact of ancestry-related biases in precision genomic medicine, we use 642 whole-genome sequences from the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) project to evaluate typical filters and databases. We find significant correlations between estimated African ancestry proportions and the number of variants per individual in all variant classification sets but one. The source of these correlations is highlighted in more detail by looking at the interaction between filtering criteria and the ClinVar and Human Gene Mutation databases. ClinVar's correlation, representing African ancestry-related bias, has changed over time amidst monthly updates, with the most extreme switch happening between March and April of 2014 (r=0.733 to r=−0.683). We identify 68 SNPs as the major drivers of this change in correlation. As long as ancestry-related bias when using these clinical databases is minimally recognized, the genetics community will face challenges with implementation, interpretation and cost-effectiveness when treating minority populations. PMID:27725664
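
    The correlations reported above (for example r=0.733 versus r=−0.683) are correlation coefficients between each individual's estimated ancestry proportion and that individual's variant count. As a minimal sketch of that kind of calculation, assuming a plain Pearson coefficient and using made-up toy numbers rather than CAAPA data (the function name and the sample values are illustrative assumptions, not the authors' code):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalised; the 1/n factors cancel in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (NOT CAAPA data): ancestry proportion per individual
# and the number of variants that individual retains after some filter.
ancestry = [0.10, 0.35, 0.60, 0.80, 0.95]
variant_counts = [12, 15, 19, 22, 25]

r = pearson_r(ancestry, variant_counts)
print(f"r = {r:.3f}")
```

    A coefficient near +1 or −1 on real data would indicate that the filter or database treats individuals differently according to ancestry, which is the bias the abstract describes.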

  8. Automation of Bioinformatics Workflows using CloVR, a Cloud Virtual Resource

    PubMed Central

    Vangala, Mahesh

    2013-01-01

    The exponential growth of biological data, driven mainly by revolutionary developments in NGS technologies over the past couple of years, has created a multitude of challenges for downstream data analysis using bioinformatics approaches. To handle such a tsunami of data, bioinformatics analysis must be carried out in an automated and parallel fashion. A successful analysis often requires more than a few computational steps, and assembling these individual steps (scripts) into components, and the components into pipelines, makes bioinformatics a reproducible and manageable segment of scientific research. CloVR (http://clovr.org) is one such flexible framework that facilitates the abstraction of bioinformatics workflows into executable pipelines. CloVR comes packaged with various built-in bioinformatics pipelines that can make use of multicore processing power when run on servers and/or the cloud. CloVR also supports building custom pipelines based on individual laboratory requirements. It is available as a single executable virtual image file that comes bundled with pre-installed and pre-configured bioinformatics tools and packages, and thus circumvents cumbersome installation difficulties. CloVR is highly portable and can be run on traditional desktop/laptop computers, central servers and cloud compute farms. In conclusion, CloVR provides built-in automated analysis pipelines for microbial genomics, with scope to develop and integrate custom workflows that make use of parallel processing power when run on compute clusters, thereby addressing the bioinformatics challenges of NGS data.
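
    The idea of chaining individual steps into a pipeline and running that pipeline over many inputs in parallel can be sketched in a few lines. This is a generic illustration only: it does not use or imitate CloVR's actual interface, and the step functions (`to_upper`, `trim`, `gc_fraction`) and helper names are hypothetical stand-ins for real analysis scripts.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def make_pipeline(*steps):
    """Compose step functions so the output of each feeds the next."""
    def run(data):
        return reduce(lambda acc, step: step(acc), steps, data)
    return run


# Hypothetical toy steps standing in for real analysis scripts.
def to_upper(read):
    return read.upper()


def trim(read):
    """Remove ambiguous 'N' bases from both ends of the read."""
    return read.strip("N")


def gc_fraction(read):
    """Fraction of bases that are G or C (0.0 for an empty read)."""
    return sum(base in "GC" for base in read) / len(read) if read else 0.0


def run_parallel(pipeline, inputs, workers=4):
    """Apply the same pipeline to every input using a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(pipeline, inputs))


pipeline = make_pipeline(to_upper, trim, gc_fraction)
results = run_parallel(pipeline, ["nnACGTnn", "GGCC", "atat"])
print(results)
```

    Because each input is processed independently, the same pipeline definition can be reused whether it runs on one core or many, which is the reproducibility argument the abstract makes.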

  9. UCSC genome browser tutorial.

    PubMed

    Zweig, Ann S; Karolchik, Donna; Kuhn, Robert M; Haussler, David; Kent, W James

    2008-08-01

    The University of California Santa Cruz (UCSC) Genome Bioinformatics website consists of a suite of free, open-source, on-line tools that can be used to browse, analyze, and query genomic data. These tools are available to anyone who has an Internet browser and an interest in genomics. The website provides a quick and easy-to-use visual display of genomic data. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. Many of the annotation tracks are submitted by scientists worldwide; the others are computed by the UCSC Genome Bioinformatics group from publicly available sequence data. It also allows users to upload and display their own experimental results or annotation sets by creating a custom track. The suite of tools, downloadable data files, and links to documentation and other information can be found at http://genome.ucsc.edu/.

  10. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers.

    PubMed

    Schneider, Maria V; Walter, Peter; Blatter, Marie-Claude; Watson, James; Brazas, Michelle D; Rother, Kristian; Budd, Aidan; Via, Allegra; van Gelder, Celia W G; Jacob, Joachim; Fernandes, Pedro; Nyrönen, Tommi H; De Las Rivas, Javier; Blicher, Thomas; Jimenez, Rafael C; Loveland, Jane; McDowall, Jennifer; Jones, Phil; Vaughan, Brendan W; Lopez, Rodrigo; Attwood, Teresa K; Brooksbank, Catherine

    2012-05-01

    Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of 'high-throughput biology', the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many Institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community, which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials. Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs, and review it in relation to similar initiatives and collections.

  11. The Roots of Bioinformatics in Theoretical Biology

    PubMed Central

    Hogeweg, Paulien

    2011-01-01

    From the late 1980s onward, the term “bioinformatics” mostly has been used to refer to computational methods for comparative analysis of genome data. However, the term was originally more widely defined as the study of informatic processes in biotic systems. In this essay, I will trace this early history (from a personal point of view) and I will argue that the original meaning of the term is re-emerging. PMID:21483479

  12. Bioinformatics in Africa: The Rise of Ghana?

    PubMed Central

    Karikari, Thomas K.

    2015-01-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  13. Emerging strengths in Asia Pacific bioinformatics

    PubMed Central

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-01-01

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20–23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts. PMID:19091008

  14. [Bioinformatics: a key role in oncology].

    PubMed

    Olivier, Timothée; Chappuis, Pierre; Tsantoulis, Petros

    2016-05-18

    Bioinformatics is essential in clinical oncology and research. Combining biology, computer science and mathematics, bioinformatics aims to derive useful information from clinical and biological data, often poorly structured, at a large scale. Bioinformatics approaches have reclassified certain cancers based on their molecular and biological presentation, improving treatment selection. Many molecular signatures have been developed and, after validation, some are now usable in clinical practice. Other applications could facilitate daily practice, reduce the risk of error and increase the precision of medical decision-making. Bioinformatics must evolve in accordance with ethical considerations and requires multidisciplinary collaboration. Its application depends on a sound technical foundation that meets strict quality requirements.

  15. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that requires large data storage and computing power. Cloud computing offers massively scalable computing and storage, data sharing, and on-demand, anytime and anywhere access to resources and applications; thus, it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data.

  16. Impacts of bioinformatics to medicinal chemistry.

    PubMed

    Chou, Kuo-Chen

    2015-01-01

    Facing the explosive growth of biological sequence data, such as protein/peptide and DNA/RNA sequences, generated in the post-genomic age, many bioinformatical and mathematical approaches as well as physicochemical concepts have been introduced to derive useful information from these biological sequences in a timely manner, in order to stimulate the development of medical science and drug design. Meanwhile, because of the rapid penetration of these disciplines, medicinal chemistry is currently undergoing an unprecedented revolution. In this minireview, we summarize the progress by focusing on the following six aspects. (1) Use of the pseudo amino acid composition, or PseAAC, to predict various attributes of protein/peptide sequences that are useful for drug development. (2) Use of the pseudo oligonucleotide composition, or PseKNC, to do the same for DNA/RNA sequences. (3) Introduction of the multi-label approach to study those systems where the constituent elements bear multiple characters and functions. (4) Use of graphical rules and "wenxiang" diagrams to analyze complicated biomedical systems. (5) Recent developments in identifying the interactions of drugs with their various types of target proteins in cellular networking. (6) Distorted key theory and its application in developing peptide drugs.
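
    To make aspect (1) more concrete, the following is a simplified sketch of a pseudo amino acid composition vector: the 20 ordinary composition frequencies are extended with a few sequence-order correlation terms, and the whole vector is normalised. Several things here are assumptions made for illustration and are not stated in the abstract: a single physicochemical property is used (the Kyte-Doolittle hydropathy scale as a stand-in), and the parameter names `lam` (number of correlation tiers) and `w` (weight) and their default values are arbitrary choices.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here as an assumed stand-in property.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

# Standardise the property over the 20 amino acids (zero mean, unit SD).
_mu, _sd = mean(KD.values()), pstdev(KD.values())
H = {aa: (value - _mu) / _sd for aa, value in KD.items()}


def pse_aac(seq, lam=2, w=0.05):
    """Return a (20 + lam)-dimensional simplified PseAAC-style vector."""
    seq = seq.upper()
    n = len(seq)
    if any(aa not in H for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if lam >= n:
        raise ValueError("lam must be smaller than the sequence length")
    # Ordinary amino acid composition (occurrence frequencies).
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    # Tier-k correlation: mean squared property difference between
    # residues that are k positions apart along the chain.
    thetas = [
        sum((H[seq[i + k]] - H[seq[i]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


vector = pse_aac("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
print(len(vector))
```

    The extra `lam` components carry some sequence-order information that a plain 20-component composition discards, which is the motivation usually given for this family of descriptors.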

  17. Bioinformatics in the secondary science classroom: A study of state content standards and students' perceptions of, and performance in, bioinformatics lessons

    NASA Astrophysics Data System (ADS)

    Wefer, Stephen H.

    The proliferation of bioinformatics in modern Biology marks a new revolution in science, which promises to influence science education at all levels. This thesis examined state standards for content that articulated bioinformatics, and explored secondary students' affective and cognitive perceptions of, and performance in, a bioinformatics mini-unit. The results are presented as three studies. The first study analyzed secondary science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics at the introductory high school biology level. The bioinformatics content of each state's Biology standards was categorized into nine areas and the prevalence of each area documented. The nine areas were: The Human Genome Project, Forensics, Evolution, Classification, Nucleotide Variations, Medicine, Computer Use, Agriculture/Food Technology, and Science Technology and Society/Socioscientific Issues (STS/SSI). Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas. Recommendations are made for reworking existing standards to incorporate bioinformatics and to facilitate the goal of promoting science literacy in this emerging new field among secondary school students. The second study examined thirty-two students' affective responses to, and content mastery of, a two-week bioinformatics mini-unit. The findings indicate that the students were generally positive with respect to their interest level, the usefulness of the lessons, the difficulty level of the lessons, and their likelihood of engaging in additional bioinformatics, and were overall successful on the assessments. A discussion of the results and significance is followed by suggestions for future research and implementation for transferability. The third study presents a case study of individual differences among ten secondary school students, whose cognitive and affective percepts were

  18. Virulence factor activity relationships (VFARs): a bioinformatics perspective.

    PubMed

    Waseem, Hassan; Williams, Maggie R; Stedtfeld, Tiffany; Chai, Benli; Stedtfeld, Robert D; Cole, James R; Tiedje, James M; Hashsham, Syed A

    2017-03-06

    Virulence factor activity relationships (VFARs), a concept loosely based on quantitative structure-activity relationships (QSARs) for chemicals, was proposed as a predictive tool for ranking risks due to microorganisms relevant to water safety. A rapid increase in sequencing capabilities and bioinformatics tools has significantly increased the potential for VFAR-based analyses. This review summarizes more than 20 bioinformatics databases and tools, developed over the last decade, along with their virulence and antimicrobial resistance prediction capabilities. With the number of bacterial whole genome sequences exceeding 241 000, metagenomic analysis projects exceeding 13 000, and the ability to add further genome sequences for a few hundred dollars, it is evident that further development of VFARs is not limited by the availability of information, at least at the genomic level. However, additional information related to co-occurrence, treatment response, modulation of virulence due to environmental and other factors, and economic impact must be gathered and incorporated in a manner that also addresses the associated uncertainties. Of the bioinformatics tools, a majority are either designed exclusively for virulence/resistance determination or equipped with a dedicated module. The remainder have the potential to be employed for evaluating virulence. This review, focusing broadly on omics technologies and tools, supports the notion that these tools are now sufficiently developed to allow the application of VFAR approaches combined with additional engineering and economic analyses to rank and prioritize organisms important to a given niche. Knowledge gaps do exist but can be filled with focused experimental and theoretical analyses that were unimaginable a decade ago. Further developments should consider the integration of the measurement of activity, risk, and uncertainty to improve the current capabilities.

  19. Trends in Next-Generation Sequencing and a New Era for Whole Genome Sequencing.

    PubMed

    Park, Sang Tae; Kim, Jayoung

    2016-11-01

    This article is a mini-review that provides a general overview of next-generation sequencing (NGS) and introduces one of the most popular NGS applications, whole genome sequencing (WGS), developed from the expansion of human genomics. NGS technology has brought massively high throughput sequencing data to bear on research questions, enabling a new era of genomic research. Development of bioinformatic software for NGS has provided more opportunities for researchers to use various applications in genomic fields. De novo genome assembly and large scale DNA resequencing to understand genomic variations are popular genomic research tools for processing a tremendous amount of data at low cost. Studies on transcriptomes are now possible, moving beyond the earlier hybridization-based microarray methods. Epigenetic studies are also possible with NGS applications such as whole genome methylation sequencing and chromatin immunoprecipitation followed by sequencing. Human genetics has faced a new paradigm of research and medical genomics brought about by sequencing technologies since the Human Genome Project. The trend of NGS technologies in human genomics has brought a new era of WGS by enabling the building of human genome databases and providing appropriate human reference genomes, which are a necessary component of personalized medicine and precision medicine.

  20. Trends in Next-Generation Sequencing and a New Era for Whole Genome Sequencing

    PubMed Central

    2016-01-01

    This article is a mini-review that provides a general overview of next-generation sequencing (NGS) and introduces one of the most popular NGS applications, whole genome sequencing (WGS), developed from the expansion of human genomics. NGS technology has brought massively high throughput sequencing data to bear on research questions, enabling a new era of genomic research. Development of bioinformatic software for NGS has provided more opportunities for researchers to use various applications in genomic fields. De novo genome assembly and large scale DNA resequencing to understand genomic variations are popular genomic research tools for processing a tremendous amount of data at low cost. Studies on transcriptomes are now possible, moving beyond the earlier hybridization-based microarray methods. Epigenetic studies are also possible with NGS applications such as whole genome methylation sequencing and chromatin immunoprecipitation followed by sequencing. Human genetics has faced a new paradigm of research and medical genomics brought about by sequencing technologies since the Human Genome Project. The trend of NGS technologies in human genomics has brought a new era of WGS by enabling the building of human genome databases and providing appropriate human reference genomes, which are a necessary component of personalized medicine and precision medicine. PMID:27915479

  1. Race, risk, and recreation in personal genomics: the limits of play.

    PubMed

    Lee, Sandra Soo-Jin

    2013-12-01

    Despite the mantra that genetics has moved beyond race, the burgeoning industry of genetic ancestry reveals how genetics has offered new technology through which individuals can link to intersections in time and space in complex ways that recapitulate understandings of racial order, origins, and group membership. This article focuses on the trope of "recreation" asserted in the marketing of ancestry genetic tests and examines the suggestion of self-discovery through the recovery of lost kin. Themes of recreation and re-creation paradoxically suggest both passivity of self-revelation and the power to re-act and re-create one's self in light of a different, more enlightened future. Direct-to-consumer personal genetics testing companies play guardian to this consumer play, providing tailored genetic scripts and highlighting how consumers might use their information. This article critically examines the play with concepts of ancestry, ethnicity, and genetic variation and their implications for public understanding of the relationship between race and genetics.

  2. Bioinformatics methods for the analysis of hepatitis viruses.

    PubMed

    Moriconi, Francesco; Beard, Michael R; Yuen, Lilly Kw

    2013-01-01

    HBV and HCV are the only hepatotropic viruses capable of establishing chronic infections. More than 500 million people worldwide are estimated to have chronic infections with HBV and/or HCV, and they have an increased risk of developing liver complications, such as cirrhosis or hepatocellular carcinoma. During the past decade, several antiviral agents including immune-modulatory drugs and nucleoside/nucleotide analogues have been approved for the treatment of HBV and HCV infections. In recent years, the focus has been on the development of new and better therapeutic agents for management of chronic HCV infections. Bioinformatics has only recently been applied to the field of viral hepatitis research. In addition to the wide range of general tools freely available for identification of open reading frames, gene prediction, homology searching, sequence alignment, and motif and epitope recognition, several public database systems designed specifically for HBV and HCV research have now been developed. The focus of these databases ranges from serving as viral sequence repositories to providing bioinformatics tools for viral genome analysis, as well as HBV or HCV drug resistance prediction. This review provides an overview of these public databases, which have integrated bioinformatics tools for HBV and HCV research. Properly managed and developed, these databases have the potential to have a broad effect on hepatitis research and treatment strategies. However, the effect will depend on the comprehensive collection of not only molecular sequence data, but also anonymous patient clinical and treatment data.

  3. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  4. A Mathematical Optimization Problem in Bioinformatics

    ERIC Educational Resources Information Center

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
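
    As an illustration of the dynamic-programming formulation mentioned above, here is a minimal global-alignment sketch in the style of the Needleman-Wunsch algorithm. It is not taken from the article; the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for simplicity.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b.

    Returns (optimal score, aligned a, aligned b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```

    Each table cell depends only on its three upper-left neighbours, so the table is filled in time proportional to the product of the two sequence lengths.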

  5. The 2016 Bioinformatics Open Source Conference (BOSC)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J.A.; Chapman, Brad; Fields, Christopher J.; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC (http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science. PMID:27781083

  6. Bioinformatics clouds for big data manipulation

    PubMed Central

    2012-01-01

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475

  7. Integrative bioinformatic analyses of an oncogenomic profile reveal the biology of endometrial cancer and guide drug discovery.

    PubMed

    Wong, Henry Sung-Ching; Juan, Yung-Shun; Wu, Mei-Shin; Zhang, Yan-Feng; Hsu, Yu-Wen; Chen, Huang-Hui; Liu, Wei-Min; Chang, Wei-Chiao

    2016-02-02

    A major challenge in personalized cancer medicine is to establish a systematic approach to translate huge oncogenomic datasets to clinical situations and facilitate drug discovery for cancers such as endometrial carcinoma. We performed a genome-wide somatic mutation-expression association study in a total of 219 endometrial cancer patients from the TCGA database, by evaluating the correlation between ~5,800 somatic mutations and ~13,500 gene expression levels (in total, ~78,500,000 pairs). A bioinformatics pipeline was devised to identify expression-associated single nucleotide variations (eSNVs) which are crucial for endometrial cancer progression and patient prognoses. We further prioritized 394 biologically risky mutational candidates which mapped to 275 gene loci and demonstrated that these genes, together with their expression features, were significantly enriched in targets of drugs approved for solid tumors, suggesting the plausibility of drug repurposing. Taken together, we integrated a fundamental endometrial cancer genomic profile into clinical circumstances, further shedding light on the clinical implementation of genomic-based therapies and providing guidance for drug discovery.
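
    The scale of the pairwise scan follows from simple arithmetic: every mutation is tested against every gene. With the rounded figures quoted above, 5,800 × 13,500 gives about 78.3 million pairs, in line with the reported total of roughly 78.5 million. The sketch below shows that count and a toy all-pairs scan. The statistic used (difference in mean expression between mutation carriers and non-carriers) is an assumption for illustration; the abstract does not specify the authors' test, and all identifiers and toy values are hypothetical.

```python
from itertools import product
from statistics import mean


def count_pairs(n_mutations, n_genes):
    """Number of mutation-gene pairs in an all-against-all scan."""
    return n_mutations * n_genes


def mean_expression_shift(carrier_flags, expression):
    """Mean expression in carriers minus mean expression in non-carriers.

    Returns None when either group is empty.
    """
    carriers = [e for flag, e in zip(carrier_flags, expression) if flag]
    others = [e for flag, e in zip(carrier_flags, expression) if not flag]
    if not carriers or not others:
        return None
    return mean(carriers) - mean(others)


def scan_pairs(mutations, expressions):
    """Compute the shift for every (mutation, gene) combination."""
    return {
        (mut, gene): mean_expression_shift(flags, values)
        for (mut, flags), (gene, values) in product(
            mutations.items(), expressions.items()
        )
    }


# Hypothetical toy data for four patients (not TCGA data).
mutations = {"mutA": [1, 1, 0, 0], "mutB": [0, 1, 0, 1]}
expressions = {"gene1": [5.0, 7.0, 1.0, 3.0], "gene2": [2.0, 2.0, 2.0, 2.0]}

shifts = scan_pairs(mutations, expressions)
print(count_pairs(5800, 13500))
print(shifts[("mutA", "gene1")])
```

    With tens of millions of tests, any real analysis of this kind would also need a multiple-testing correction before calling a pair significant.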

  8. An "in silico" Bioinformatics Laboratory Manual for Bioscience Departments: "Prediction of Glycosylation Sites in Phosphoethanolamine Transferases"

    ERIC Educational Resources Information Center

    Alyuruk, Hakan; Cavas, Levent

    2014-01-01

    Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…

  9. Evolving Strategies for the Incorporation of Bioinformatics within the Undergraduate Cell Biology Curriculum

    ERIC Educational Resources Information Center

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…

  10. Incorporating a New Bioinformatics Component into Genetics at a Historically Black College: Outcomes and Lessons

    ERIC Educational Resources Information Center

    Holtzclaw, J. David; Eisen, Arri; Whitney, Erika M.; Penumetcha, Meera; Hoey, J. Joseph; Kimbro, K. Sean

    2006-01-01

    Many students at minority-serving institutions are underexposed to Internet resources such as the human genome project, PubMed, NCBI databases, and other Web-based technologies because of a lack of financial resources. To change this, we designed and implemented a new bioinformatics component to supplement the undergraduate Genetics course at…

  11. Strategies for Using Peer-Assisted Learning Effectively in an Undergraduate Bioinformatics Course

    ERIC Educational Resources Information Center

    Shapiro, Casey; Ayon, Carlos; Moberg-Parker, Jordan; Levis-Fitzgerald, Marc; Sanders, Erin R.

    2013-01-01

    This study used a mixed methods approach to evaluate hybrid peer-assisted learning approaches incorporated into a bioinformatics tutorial for a genome annotation research project. Quantitative and qualitative data were collected from undergraduates who enrolled in a research-based laboratory course during two different academic terms at UCLA.…

  12. Highlights of the 2nd Bioinformatics Student Symposium by ISCB RSG-UK

    PubMed Central

    White, Benjamen; Fatima, Vayani; Fatima, Nazeefa; Das, Sayoni; Rahman, Farzana; Hassan, Mehedi

    2016-01-01

    Following the success of the 1st Student Symposium by ISCB RSG-UK, a 2nd Student Symposium took place on 7th October 2015 at The Genome Analysis Centre, Norwich, UK. This short report summarizes the main highlights from the 2nd Bioinformatics Student Symposium. PMID:27239284

  13. An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.

    PubMed

    Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir

    2013-01-01

    DNA sequence alignment is a cardinal process in computational biology, but it is computationally expensive when performed on traditional computational platforms such as CPUs. Of the many off-the-shelf platforms explored for speeding up the computation, the FPGA stands out as the best candidate because of its performance per dollar spent and performance per watt. These two advantages make the FPGA the most appropriate choice for realizing the aim of personal genomics. Previous implementations of DNA sequence alignment did not take into consideration the price of the device on which the optimization was performed. This paper presents optimizations over a previous FPGA implementation that increase the overall speed-up achieved while also taking into account the price of the platform that was optimized. The optimizations are: (1) the array of processing elements is made to run on a change in input value rather than on the clock, eliminating the need for tight clock synchronization; (2) the implementation is unrestrained by the size of the sequences to be aligned; (3) the waiting time required for the sequences to load onto the FPGA is reduced to the minimum possible; and (4) an efficient method is devised to store the output matrix that makes it possible to save the diagonal elements to be used in the next pass, in parallel with the computation of the output matrix. Implemented on a Spartan3 FPGA, this implementation achieved a 20-fold performance improvement in terms of CUPS over a GPP implementation.
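
    Hardware aligners of this kind rely on the fact that cells lying on the same anti-diagonal of the alignment matrix do not depend on one another, so an array of processing elements can update them at the same time. The software sketch below only illustrates that traversal order and the CUPS (cell updates per second) metric; it is not the paper's design. It assumes a Smith-Waterman-style local alignment with arbitrary scores (match +2, mismatch -1, gap -1), none of which are stated in the abstract, and the function names are hypothetical.

```python
def local_alignment_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score, filling the matrix one anti-diagonal at a time."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    # Anti-diagonal d holds the cells (i, j) with i + j == d. Each such cell
    # reads only from diagonals d-1 and d-2, so in hardware all cells of one
    # anti-diagonal could be computed in the same step.
    for d in range(2, n + m + 1):
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + sub,
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            best = max(best, H[i][j])
    return best


def cups(len_a, len_b, seconds):
    """Cell updates per second for one len_a x len_b matrix fill."""
    return (len_a * len_b) / seconds


print(local_alignment_wavefront("TTACGTT", "GGACGGG"))
print(cups(1000, 1000, 0.5))
```

    CUPS normalises throughput by matrix size, which is why it is a common way to compare aligners across platforms such as FPGAs and general-purpose processors.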

  14. Mining Cancer Transcriptomes: Bioinformatic Tools and the Remaining Challenges.

    PubMed

    Milan, Thomas; Wilhelm, Brian T

    2017-02-22

    The development of next-generation sequencing technologies has had a profound impact on the field of cancer genomics. With the enormous quantities of data being generated from tumor samples, researchers have had to rapidly adapt tools or develop new ones to analyse the raw data to maximize its value. While much of this effort has been focused on improving specific algorithms to get faster and more precise results, the accessibility of the final data for the research community remains a significant problem. Large amounts of data exist but are not easily available to researchers who lack the resources and experience to download and reanalyze them. In this article, we focus on RNA-seq analysis in the context of cancer genomics and discuss the bioinformatic tools available to explore these data. We also highlight the importance of developing new and more intuitive tools to provide easier access to public data and discuss the related issues of data sharing and patient privacy.

  15. The web server of IBM's Bioinformatics and Pattern Discovery group.

    PubMed

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-07-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  16. When cloud computing meets bioinformatics: a review.

    PubMed

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.
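    To make the MapReduce idea concrete, here is a minimal single-process sketch that counts k-mers in sequencing reads using explicit map, shuffle and reduce steps. It is only an illustration of the programming model; the function names are hypothetical, and a real deployment would run the same steps on a distributed framework rather than in one Python process.

```python
from collections import defaultdict


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every length-k window of a read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each k-mer."""
    return {key: sum(values) for key, values in grouped.items()}


def count_kmers(reads, k):
    """Run map, shuffle and reduce over a list of reads."""
    pairs = [pair for read in reads for pair in map_kmers(read, k)]
    return reduce_counts(shuffle(pairs))
```

    Because each read is mapped independently and each k-mer is reduced independently, both steps can be spread across many machines, which is the property that makes the model attractive for large sequencing data sets.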

  17. Bioinformatics for Diagnostics, Forensics, and Virulence Characterization and Detection

    SciTech Connect

    Gardner, S; Slezak, T

    2005-04-05

    We summarize four of our group's high-risk/high-payoff research projects funded by the Intelligence Technology Innovation Center (ITIC) in conjunction with our DHS-funded pathogen informatics activities. These are (1) quantitative assessment of genomic sequencing needs to predict high quality DNA and protein signatures for detection, and comparison of draft versus finished sequences for diagnostic signature prediction; (2) development of forensic software to identify SNP and PCR-RFLP variations from a large number of viral pathogen sequences and optimization of the selection of markers for maximum discrimination of those sequences; (3) prediction of signatures for the detection of virulence, antibiotic resistance, and toxin genes and genetic engineering markers in bacteria; (4) bioinformatic characterization of virulence factors to rapidly screen genomic data for potential genes with similar functions and to elucidate potential health threats in novel organisms. The results of (1) are being used by policy makers to set national sequencing priorities. Analyses from (2) are being used in collaborations with the CDC to genotype and characterize many variola strains, and reports from these collaborations have been made to the President. We also determined SNPs for serotype and strain discrimination of 126 foot and mouth disease virus (FMDV) genomes. For (3), currently >1000 probes have been predicted for the specific detection of >4000 virulence, antibiotic resistance, and genetic engineering vector sequences, and we expect to complete the bioinformatic design of a comprehensive "virulence detection chip" by August 2005. Results of (4) will be a system to rapidly predict potential virulence pathways and phenotypes in organisms based on their genomic sequences.

  18. PATRIC, the bacterial bioinformatics database and analysis resource

    PubMed Central

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  19. A Review of Bioinformatics Tools for Bio-Prospecting from Metagenomic Sequence Data.

    PubMed

    Roumpeka, Despoina D; Wallace, R John; Escalettes, Frank; Fotheringham, Ian; Watson, Mick

    2017-01-01

    The microbiome can be defined as the community of microorganisms that live in a particular environment. Metagenomics is the practice of sequencing DNA from the genomes of all organisms present in a particular sample, and has become a common method for the study of microbiome population structure and function. Increasingly, researchers are finding novel genes encoded within metagenomes, many of which may be of interest to the biotechnology and pharmaceutical industries. However, such "bioprospecting" requires a suite of sophisticated bioinformatics tools to make sense of the data. This review summarizes the most commonly used bioinformatics tools for the assembly and annotation of metagenomic sequence data with the aim of discovering novel genes.

  20. The challenges of delivering bioinformatics training in the analysis of high-throughput data

    PubMed Central

    Carvalho, Benilton S.; Rustici, Gabriella

    2013-01-01

    High-throughput technologies are widely used in the field of functional genomics and used in an increasing number of applications. For many ‘wet lab’ scientists, the analysis of the large amount of data generated by such technologies is a major bottleneck that can only be overcome through very specialized training in advanced data analysis methodologies and the use of dedicated bioinformatics software tools. In this article, we wish to discuss the challenges related to delivering training in the analysis of high-throughput sequencing data and how we addressed these challenges in the hands-on training courses that we have developed at the European Bioinformatics Institute. PMID:23543353

  1. Human immunome, bioinformatic analyses using HLA supermotifs and the parasite genome, binding assays, studies of human T cell responses, and immunization of HLA-A*1101 transgenic mice including novel adjuvants provide a foundation for HLA-A03 restricted CD8+T cell epitope based, adjuvanted vaccine protective against Toxoplasma gondii

    PubMed Central

    2010-01-01

    Background: Toxoplasmosis causes loss of life, cognitive and motor function, and sight. A vaccine is greatly needed to prevent this disease. The purpose of this study was to use an immunosense approach to develop a foundation for development of vaccines to protect humans with the HLA-A03 supertype. Three peptides had been identified with high binding scores for HLA-A03 supertypes using bioinformatic algorithms, high measured binding affinity for HLA-A03 supertype molecules, and ability to elicit IFN-γ production by human HLA-A03 supertype peripheral blood CD8+ T cells from seropositive but not seronegative persons. Results: Herein, when these peptides were administered with the universal CD4+ T cell epitope PADRE (AKFVAAWTLKAAA) and formulated as lipopeptides, or administered with GLA-SE either alone, or with Pam2Cys added, we found we successfully created preparations that induced IFN-γ and reduced parasite burden in HLA-A*1101 (an HLA-A03 supertype allele) transgenic mice. GLA-SE is a novel emulsified synthetic TLR4 ligand that is known to facilitate development of T Helper 1 cell (TH1) responses. Then, so our peptides would include those expressed in tachyzoites, bradyzoites and sporozoites from both Type I and II parasites, we used our approaches which had identified the initial peptides. We identified additional peptides using bioinformatics, binding affinity assays, and study of responses of HLA-A03 human cells. Lastly, we found that immunization of HLA-A*1101 transgenic mice with all the pooled peptides administered with PADRE, GLA-SE, and Pam2Cys is an effective way to elicit IFN-γ producing CD8+ splenic T cells and protection. Immunizations included the following peptides together: KSFKDILPK (SAG1 224-232); AMLTAFFLR (GRA6 164-172); RSFKDLLKK (GRA7 134-142); STFWPCLLR (SAG2C 13-21); SSAYVFSVK (SPA 250-258); and AVVSLLRLLK (SPA 89-98). This immunization elicited robust protection, measured as reduced parasite burden using a luciferase transfected parasite

  2. Bioinformatics and the Undergraduate Curriculum Essay

    PubMed Central

    Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of bioinformatics as a new discipline has challenged many colleges and universities to keep current with their curricula, often in the face of static or dwindling resources. On the plus side, many bioinformatics modules and related databases and software programs are free and accessible online, and interdisciplinary partnerships between existing faculty members and their support staff have proved advantageous in such efforts. We present examples of strategies and methods that have been successfully used to incorporate bioinformatics content into undergraduate curricula. PMID:20810947

  3. Bioinformatics in Italy: BITS2011, the Eighth Annual Meeting of the Italian Society of Bioinformatics

    PubMed Central

    2012-01-01

    The BITS2011 meeting, held in Pisa on June 20-22, 2011, brought together more than 120 Italian researchers working in the field of Bioinformatics, as well as students in Bioinformatics, Computational Biology, Biology, Computer Sciences, and Engineering, representing a landscape of Italian bioinformatics research. This preface provides a brief overview of the meeting and introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement. PMID:22536954

  4. No-boundary thinking in bioinformatics research

    PubMed Central

    2013-01-01

    Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT). PMID:24192339

  5. Whole-genome sequencing for comparative genomics and de novo genome assembly.

    PubMed

    Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

    2015-01-01

    Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing however generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).

  6. GIW and InCoB, two premier bioinformatics conferences in Asia with a combined 40 years of history

    PubMed Central

    2015-01-01

    Knowledge discovery in bioinformatics thrives on joint and inclusive efforts of stakeholders. Similarly, knowledge dissemination is expected to be more effective and scalable through joint efforts. Therefore, the International Conference on Bioinformatics (InCoB) and the International Conference on Genome Informatics (GIW) were organized as a joint conference for the first time in 13 years of coexistence. The Asia-Pacific Bioinformatics Network (APBioNet) and the Japanese Society for Bioinformatics (JSBi) collaborated to host GIW/InCoB2015 in Tokyo, September 9-11, 2015. The joint endeavour yielded 51 research articles published in seven journals, 78 poster and 89 oral presentations, showcasing bioinformatics research in the Asia-Pacific region. Encouraged by the results and reduced organizational overheads, APBioNet will collaborate with other bioinformatics societies in organizing co-located bioinformatics research and training meetings in the future. InCoB2016 will be hosted in Singapore, September 21-23, 2016. PMID:26679412

  7. Regulatory bioinformatics for food and drug safety.

    PubMed

    Healy, Marion J; Tong, Weida; Ostroff, Stephen; Eichler, Hans-Georg; Patak, Alex; Neuspiel, Margaret; Deluyker, Hubert; Slikker, William

    2016-10-01

    "Regulatory Bioinformatics" strives to develop and implement a standardized and transparent bioinformatic framework to support the implementation of existing and emerging technologies in regulatory decision-making. It has great potential to improve public health through the development and use of clinically important medical products and tools to manage the safety of the food supply. However, the application of regulatory bioinformatics also poses new challenges and requires new knowledge and skill sets. In the latest Global Coalition on Regulatory Science Research (GCRSR) governed conference, Global Summit on Regulatory Science (GSRS2015), regulatory bioinformatics principles were presented with respect to global trends, initiatives and case studies. The discussion revealed that datasets, analytical tools, skills and expertise are rapidly developing, in many cases via large international collaborative consortia. It also revealed that significant research is still required to realize the potential applications of regulatory bioinformatics. While there is significant excitement in the possibilities offered by precision medicine to enhance treatments of serious and/or complex diseases, there is a clear need for further development of mechanisms to securely store, curate and share data, integrate databases, and standardized quality control and data analysis procedures. A greater understanding of the biological significance of the data is also required to fully exploit vast datasets that are becoming available. The application of bioinformatics in the microbiological risk analysis paradigm is delivering clear benefits both for the investigation of food borne pathogens and for decision making on clinically important treatments. It is recognized that regulatory bioinformatics will have many beneficial applications by ensuring high quality data, validated tools and standardized processes, which will help inform the regulatory science community of the requirements

  8. Teaching Bioinformatics and Neuroinformatics by Using Free Web-based Tools

    PubMed Central

    Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with anatomy (Mouse Brain Library), quantitative trait locus analysis (WebQTL from GeneNetwork), bioinformatics and gene expression analyses (University of California, Santa Cruz Genome Browser, National Center for Biotechnology Information's Entrez Gene, and the Allen Brain Atlas), and information resources (PubMed). Instructors can use these various websites in concert to teach genetics from the phenotypic level to the molecular level, aspects of neuroanatomy and histology, statistics, quantitative trait locus analysis, and molecular biology (including in situ hybridization and microarray analysis), and to introduce bioinformatic resources. Students use these resources to discover 1) the region(s) of chromosome(s) influencing the phenotypic trait, 2) a list of candidate genes—narrowed by expression data, 3) the in situ pattern of a given gene in the region of interest, 4) the nucleotide sequence of the candidate gene, and 5) articles describing the gene. Teaching materials such as a detailed student/instructor's manual, PowerPoints, sample exams, and links to free Web resources can be found at http://mdcune.psych.ucla.edu/modules/bioinformatics. PMID:20516355

  9. PATRIC: the Comprehensive Bacterial Bioinformatics Resource with a Focus on Human Pathogenic Species

    PubMed Central

    Gillespie, Joseph J.; Wattam, Alice R.; Cammer, Stephen A.; Gabbard, Joseph L.; Shukla, Maulik P.; Dalay, Oral; Driscoll, Timothy; Hix, Deborah; Mane, Shrinivasrao P.; Mao, Chunhong; Nordberg, Eric K.; Scott, Mark; Schulman, Julie R.; Snyder, Eric E.; Sullivan, Daniel E.; Wang, Chunxia; Warren, Andrew; Williams, Kelly P.; Xue, Tian; Seung Yoo, Hyun; Zhang, Chengdong; Zhang, Yan; Will, Rebecca; Kenyon, Ronald W.; Sobral, Bruno W.

    2011-01-01

    Funded by the National Institute of Allergy and Infectious Diseases, the Pathosystems Resource Integration Center (PATRIC) is a genomics-centric relational database and bioinformatics resource designed to assist scientists in infectious-disease research. Specifically, PATRIC provides scientists with (i) a comprehensive bacterial genomics database, (ii) a plethora of associated data relevant to genomic analysis, and (iii) an extensive suite of computational tools and platforms for bioinformatics analysis. While the primary aim of PATRIC is to advance the knowledge underlying the biology of human pathogens, all publicly available genome-scale data for bacteria are compiled and continually updated, thereby enabling comparative analyses to reveal the basis for differences between infectious free-living and commensal species. Herein we summarize the major features available at PATRIC, dividing the resources into two major categories: (i) organisms, genomes, and comparative genomics and (ii) recurrent integration of community-derived associated data. Additionally, we present two experimental designs typical of bacterial genomics research and report on the execution of both projects using only PATRIC data and tools. These applications encompass a broad range of the data and analysis tools available, illustrating practical uses of PATRIC for the biologist. Finally, a summary of PATRIC's outreach activities, collaborative endeavors, and future research directions is provided. PMID:21896772

  10. Can bioinformatics help in the identification of moonlighting proteins?

    PubMed

    Hernández, Sergio; Calvo, Alejandra; Ferragut, Gabriela; Franco, Luís; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2014-12-01

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large number of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and they are a good complement to the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place.
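    One common way to quantify a co-evolutionary relationship between two alignment positions is the mutual information of their residue distributions. The abstract does not say which measure the authors used, so the sketch below is an assumed stand-in; the function name and the toy alignments are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences. High values mean
    that the residue at one position is informative about the other.
    """
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (x, y), c in count_ij.items():
        p_xy = c / n
        p_x = count_i[x] / n
        p_y = count_j[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi
```

    On a toy alignment such as ["AK", "AK", "GE", "GE"] the two columns always change together and the score is high, whereas for ["AK", "AE", "GK", "GE"] the columns vary independently and the score is zero. Real analyses need many diverse sequences and corrections for phylogenetic bias, which may be one reason the abstract describes the approach as usable only in very restricted situations.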

  11. The MPI Bioinformatics Toolkit for protein sequence analysis

    PubMed Central

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N.

    2006-01-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at http://toolkit.tuebingen.mpg.de. PMID:16845021

  12. The MPI Bioinformatics Toolkit for protein sequence analysis.

    PubMed

    Biegert, Andreas; Mayer, Christian; Remmert, Michael; Söding, Johannes; Lupas, Andrei N

    2006-07-01

    The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at http://toolkit.tuebingen.mpg.de.

  13. Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks.

    PubMed

    Pauling, Josch; Klipp, Edda

    2016-12-22

    Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a lot of structural detail. However, doing so may produce thousands of mass spectra in a single experiment which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics but there are many (combinatorial) challenges when it comes to structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside analytic, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.

  14. Genomic messaging system and DNA mark-up language for information-based personalized medicine with clinical and proteome research applications.

    PubMed

    Robson, Barry; Mushlin, Richard

    2004-01-01

    The convergence of clinical medicine and the Life Sciences, commencing with opportunities in clinical trials and clinically linked medical research, presents many novel challenges. The Genomic Messaging System (GMS) described here was originally developed as a tool for assembling clinical genomic records of individual and collective patients, and was then generalized to become a flexible workflow component that will link clinical records to a variety of computational biology research tools, for research and ultimately for a more personalized, focused, and preventative healthcare system. Prominent among the applications linked are protein science applications, including the rapid automated modeling of patient proteins with their individual structural polymorphisms. In an initial study, GMS formed the basis of a fully automated system for modeling patient proteins with structural polymorphisms as a basis for drug selection and ultimately design on an individual patient basis.

  15. Bioinformatic characterization of plant networks

    SciTech Connect

    McDermott, Jason E.; Samudrala, Ram

    2008-06-30

    Cells and organisms are governed by networks of interactions, genetic, physical and metabolic. Large-scale experimental studies of interactions between components of biological systems have been performed for a variety of eukaryotic organisms. However, there is a dearth of such data for plants. Computational methods for prediction of relationships between proteins, primarily based on comparative genomics, provide a useful systems-level view of cellular functioning and can be used to extend information about other eukaryotes to plants. We have predicted networks for Arabidopsis thaliana, Oryza sativa indica and japonica and several plant pathogens using the Bioverse (http://bioverse.compbio.washington.edu) and show that they are similar to experimentally-derived interaction networks. Predicted interaction networks for plants can be used to provide novel functional annotations and predictions about plant phenotypes and aid in rational engineering of biosynthesis pathways.

  16. Building International Genomics Collaboration for Global Health Security

    PubMed Central

    Cui, Helen H.; Erkkila, Tracy; Chain, Patrick S. G.; Vuyisich, Momchilo

    2015-01-01

    Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement, and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries. PMID:26697418

  17. Building international genomics collaboration for global health security

    SciTech Connect

    Cui, Helen H.; Erkkila, Tracy; Chain, Patrick S. G.; Vuyisich, Momchilo

    2015-12-07

    Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement, and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries.

  18. The European Bioinformatics Institute’s data resources 2014

    PubMed Central

    Brooksbank, Catherine; Bergman, Mary Todd; Apweiler, Rolf; Birney, Ewan; Thornton, Janet

    2014-01-01

    Molecular Biology has been at the heart of the ‘big data’ revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff’s ‘Atlas of Protein Sequence and Structure’ through the Human Genome Project in the late 1990s and early 2000s to today’s population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provides free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI’s database collection to complement the reviews of individual databases provided elsewhere in this issue. PMID:24271396

  19. One Size Doesn't Fit All - RefEditor: Building Personalized Diploid Reference Genome to Improve Read Mapping and Genotype Calling in Next Generation Sequencing Studies.

    PubMed

    Yuan, Shuai; Johnston, H Richard; Zhang, Guosheng; Li, Yun; Hu, Yi-Juan; Qin, Zhaohui S

    2015-08-01

    With the rapid decline in sequencing cost, researchers today are rushing to embrace whole genome sequencing (WGS) or whole exome sequencing (WES) as the next powerful tool for relating genetic variants to human diseases and phenotypes. A fundamental step in analyzing WGS and WES data is mapping short sequencing reads back to the reference genome. This is an important issue because incorrectly mapped reads affect the downstream variant discovery, genotype calling and association analysis. Although many read mapping algorithms have been developed, the majority of them use the universal reference genome and do not take sequence variants into consideration. Given that genetic variants are ubiquitous, it is highly desirable to factor them into the read mapping procedure. In this work, we developed a novel strategy that utilizes genotypes obtained a priori to customize the universal haploid reference genome into a personalized diploid reference genome. The new strategy is implemented in a program named RefEditor. When applying RefEditor to real data, we achieved encouraging improvements in read mapping, variant discovery and genotype calling. Compared to standard approaches, RefEditor can significantly increase genotype calling consistency (from 43% to 61% at 4X coverage; from 82% to 92% at 20X coverage) and reduce Mendelian inconsistency across various sequencing depths. Because many WGS and WES studies are conducted on cohorts that have been genotyped using array-based genotyping platforms previously or concurrently, we believe the proposed strategy will be of high value in practice; it can also be applied to the scenario where multiple NGS experiments are conducted on the same cohort. The RefEditor sources are available at https://github.com/superyuan/refeditor.
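
    The idea of turning a haploid reference into a personalized diploid reference can be sketched in a few lines. This is a minimal illustration only, not RefEditor's actual algorithm: it assumes the genotypes are already phased, handles single-nucleotide substitutions only, and uses 0-based positions. The function name `personalize_reference` and the toy data are hypothetical.

```python
def personalize_reference(reference, genotypes):
    """Return two haplotype sequences built from a haploid reference.

    reference: the haploid reference sequence as a string.
    genotypes: dict mapping a 0-based position to a pair of alleles
               (allele on haplotype 1, allele on haplotype 2).
               Assumed to be phased SNP genotypes (an assumption of this sketch).
    """
    hap1 = list(reference)
    hap2 = list(reference)
    for pos, (allele1, allele2) in genotypes.items():
        if not 0 <= pos < len(reference):
            raise ValueError(f"position {pos} is outside the reference")
        # Substitute the known alleles; untouched positions keep the reference base.
        hap1[pos] = allele1
        hap2[pos] = allele2
    return "".join(hap1), "".join(hap2)


# Toy data: a heterozygous site at position 1 and a homozygous-alternate site at position 5.
reference = "ACGTACGT"
genotypes = {1: ("C", "T"), 5: ("G", "G")}

hap1, hap2 = personalize_reference(reference, genotypes)
print(hap1)
print(hap2)
```

    Reads would then be mapped against both haplotype sequences rather than against the universal reference alone; indels and unphased genotypes would need additional handling that this sketch leaves out.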

  20. An emerging place for lung cancer genomics in 2013

    PubMed Central

    Bowman, Rayleen V.; Yang, Ian A.; Govindan, Ramaswamy; Fong, Kwun M.

    2013-01-01

    Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole genome and whole exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation is critical, especially for the differentiation between driver and passenger mutations. Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could

  1. An emerging place for lung cancer genomics in 2013.

    PubMed

    Daniels, Marissa G; Bowman, Rayleen V; Yang, Ian A; Govindan, Ramaswamy; Fong, Kwun M

    2013-10-01

    Lung cancer is a disease with a dismal prognosis and is the biggest cause of cancer deaths in many countries. Nonetheless, rapid technological developments in genome science promise more effective prevention and treatment strategies. Since the Human Genome Project, scientific advances have revolutionized the diagnosis and treatment of human cancers, including thoracic cancers. The latest, massively parallel, next generation sequencing (NGS) technologies offer much greater sequencing capacity than traditional, capillary-based Sanger sequencing. These modern but costly technologies have been applied to whole genome and whole exome sequencing (WGS and WES) for the discovery of mutations and polymorphisms, transcriptome sequencing for quantification of gene expression, small ribonucleic acid (RNA) sequencing for microRNA profiling, large scale analysis of deoxyribonucleic acid (DNA) methylation and chromatin immunoprecipitation mapping of DNA-protein interaction. With the rise of personalized cancer care, based on the premise of precision medicine, sequencing technologies are constantly changing. To date, the genomic landscape of lung cancer has been captured in several WGS projects. Such work has not only contributed to our understanding of cancer biology, but has also provided impetus for technical advances that may improve our ability to accurately capture the cancer genome. Issues such as short read lengths contribute to sequenced libraries that contain challenging gaps in the aligned genome. Emerging platforms promise longer reads as well as the ability to capture a range of epigenomic signals. In addition, ongoing optimization of bioinformatics strategies for data analysis and interpretation is critical, especially for the differentiation between driver and passenger mutations. Moreover, broader deployment of these and future generations of platforms, coupled with an increasing bioinformatics workforce with access to highly sophisticated technologies, could

  2. How to enhance integrated care towards the personal health paradigm?

    PubMed

    Blobel, Bernd G M E; Pharow, Peter; Norgall, Thomas

    2007-01-01

    To improve the quality and efficiency of health delivery under the well-known burdens, the health service paradigm has to change from organization-centered, via process-controlled, to personal health. The growing complexity of highly distributed and fully integrated healthcare settings can only be managed through an advanced architectural approach, which has to include all dimensions of personal health. These include ICT, medicine, biomedical engineering, bioinformatics and genomics, legal and administrative aspects, terminology and ontology. The Generic Component Model allows for the representation and aggregation of concepts from different domains. Framework, requirements, methodology and process design possibilities for such a future-proof and meanwhile practically demonstrated approach are discussed in detail. The deployment of the Generic Component Model and the concept representation to biomedical engineering aspects of eHealth are touched upon as essential issues.

  3. Molecular Dynamics: New Frontier in Personalized Medicine.

    PubMed

    Sneha, P; Doss, C George Priya

    2016-01-01

    The field of drug discovery has seen extensive development over the last decade, driven by the demand for novel, efficient lead compounds. Although the development of novel compounds in this field has seen a high failure rate, a breakthrough in this area might be the establishment of personalized medicine. Personalized medicine has grown rapidly and become a hot topic since the successful completion of the Human Genome Project and the 1000 Genomes pilot project. Genomic variants such as SNPs play a vital role in inter-individual differences in disease susceptibility and drug response. Hence, identification of such genetic variants has to be performed before administration of a drug. This process requires high-end techniques to understand the complexity of the molecules, which might bring insight into the compounds at their molecular level. To support this, the field of bioinformatics plays a crucial role in revealing the molecular mechanism of the mutation and thereby designing a drug for an individual in a fast and affordable manner. High-end computational methods, such as molecular dynamics (MD) simulation, have proved to be a constitutive approach to detecting the minor changes associated with an SNP for better understanding of the structural and functional relationship. The parameters used in molecular dynamics simulation elucidate different properties of a macromolecule, such as protein stability and flexibility. MD along with docking analysis can reveal the synergetic effect of an SNP in protein-ligand interaction and provides a foundation for designing a particular drug molecule for an individual. This compelling application of computational power and the advent of other technologies have paved a promising way toward personalized medicine. In this in-depth review, we highlight the different wings of MD toward personalized medicine.

  4. Bioinformatics: A History of Evolution "In Silico"

    ERIC Educational Resources Information Center

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  5. Medical informatics and bioinformatics: a bibliometric study

    PubMed Central

    Bansard, Jean-Yves; Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Beltrame, Francesco; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Tollis, Ioannis; Van der Lei, Johan; Coatrieux, Jean-Louis

    2007-01-01

    This paper reports on an analysis of the bioinformatics and medical informatics literature with the objective to identify upcoming trends that are shared among both research fields to derive benefits from potential collaborative initiatives for their future. Our results present the main characteristics of the two fields and show that these domains are still relatively separated. PMID:17521073

  6. "Extreme Programming" in a Bioinformatics Class

    ERIC Educational Resources Information Center

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  7. Implementing bioinformatic workflows within the bioextract server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  8. Bioinformatics in Undergraduate Education: Practical Examples

    ERIC Educational Resources Information Center

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  9. Bioboxes: standardised containers for interchangeable bioinformatics software.

    PubMed

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.

  10. KDE Bioscience: platform for bioinformatics analysis workflows.

    PubMed

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards, combined with unfriendly interfaces, makes it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a Java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or third-party viewers are built in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism, other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research.

  11. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    EPA Science Inventory

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  12. Health Orientation, Knowledge, and Attitudes toward Genetic Testing and Personalized Genomic Services: Preliminary Data from an Italian Sample

    PubMed Central

    Arnaboldi, Paola; Cutica, Ilaria; Fioretti, Chiara

    2016-01-01

    Objective. The study aims at assessing personality tendencies and orientations that could be closely correlated with knowledge, awareness, and interest toward undergoing genetic testing. Methods. A sample of 145 subjects in Italy completed an online survey, investigating demographic data, health orientation, level of perceived knowledge about genetic risk, genetic screening, and personal attitudes toward direct-to-consumer genetic testing (DTCGT). Results. Results showed that respondents considered genetic assessment to be helpful for disease prevention, but they were concerned that results could affect their life planning with little clinical utility. Furthermore, a very high percentage of respondents (67%) had never heard about genetic testing directly available to the public. Data showed that personality tendencies, such as personal health consciousness, health internal control, health esteem, and confidence, motivation to avoid unhealthiness and motivation for healthiness affected the uptake of genetic information and the interest in undergoing genetic testing. Conclusions. Public knowledge and attitudes toward genetic risk and genetic testing among European countries, along with individual personality and psychological tendencies that could affect these attitudes, remain unexplored. The present study constitutes one of the first attempts to investigate how such personality tendencies could affect motivation to undergo genetic testing and engagement in lifestyle changes. PMID:28105428

  13. High-throughput next-generation sequencing technologies foster new cutting-edge computing techniques in bioinformatics.

    PubMed

    Yang, Mary Qu; Athey, Brian D; Arabnia, Hamid R; Sung, Andrew H; Liu, Qingzhong; Yang, Jack Y; Mao, Jinghe; Deng, Youping

    2009-07-07

    The advent of high-throughput next generation sequencing technologies has fostered enormous potential applications of supercomputing techniques in genome sequencing, epi-genetics, metagenomics, personalized medicine, discovery of non-coding RNAs and protein-binding sites. To this end, the 2008 International Conference on Bioinformatics and Computational Biology (Biocomp) - 2008 World Congress on Computer Science, Computer Engineering and Applied Computing (Worldcomp) was designed to promote synergistic inter/multidisciplinary research and education in response to the current research trends and advances. The conference attracted more than two thousand scientists, medical doctors, engineers, professors and students, who gathered in Las Vegas, Nevada, USA during July 14-17, and was a great success. Supported by the International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design (IJCBDD), International Journal of Functional Informatics and Personalized Medicine (IJFIPM) and the leading research laboratories from Harvard, M.I.T., Purdue, UIUC, UCLA, Georgia Tech, UT Austin, U. of Minnesota, U. of Iowa etc., the conference received thousands of research papers. Each submitted paper was reviewed by at least three reviewers and accepted papers were required to satisfy reviewers' comments. Finally, the review board and the committee decided to select only 19 high-quality research papers for inclusion in this supplement to BMC Genomics based on the peer reviews only. The conference committee was very grateful for the Plenary Keynote Lectures given by: Dr. Brian D. Athey (University of Michigan Medical School), Dr. Vladimir N. Uversky (Indiana University School of Medicine), Dr. David A. Patterson (Member of United States National Academy of Sciences and National Academy of Engineering, University of California at Berkeley) and Anousheh Ansari (Prodea Systems, Space Ambassador). The theme of the conference to promote

  14. Navigating the changing learning landscape: perspective from bioinformatics.ca

    PubMed Central

    Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  15. [Towards a supranational integration of the guiding principles on the human genome. A personal view from the Latin American perspective].

    PubMed

    Figueroa Yañez, G

    2001-01-01

    All human groupings draw up a range of values and valuational ideals. Man alone draws up values and it is these that confer content on cultures. In this particular subject matter one might speak of guiding principles regarding the Human Genome. These guiding principles are ethical values, recognised as such by the vast majority of humankind in the Western Christian culture of our time.

  16. Probing the diversity of healthy oral microbiome with bioinformatics approaches.

    PubMed

    Moon, Ji-Hoi; Lee, Jae-Hyung

    2016-12-01

    The human oral cavity contains a highly personalized microbiome essential to maintaining health, but capable of causing oral and systemic diseases. Thus, an in-depth definition of "healthy oral microbiome" is critical to understanding variations in disease states from preclinical conditions, and disease onset through progressive states of disease. With rapid advances in DNA sequencing and analytical technologies, population-based studies have documented the range and diversity of both taxonomic compositions and functional potentials observed in the oral microbiome in healthy individuals. Besides factors specific to the host, such as age and race/ethnicity, environmental factors also appear to contribute to the variability of the healthy oral microbiome. Here, we review bioinformatic techniques for metagenomic datasets, including their strengths and limitations. In addition, we summarize the interpersonal and intrapersonal diversity of the oral microbiome, taking into consideration the recent large-scale and longitudinal studies, including the Human Microbiome Project. [BMB Reports 2016; 49(12): 662-670].

  17. Probing the diversity of healthy oral microbiome with bioinformatics approaches

    PubMed Central

    Moon, Ji-Hoi; Lee, Jae-Hyung

    2016-01-01

    The human oral cavity contains a highly personalized microbiome essential to maintaining health, but capable of causing oral and systemic diseases. Thus, an in-depth definition of “healthy oral microbiome” is critical to understanding variations in disease states from preclinical conditions, and disease onset through progressive states of disease. With rapid advances in DNA sequencing and analytical technologies, population-based studies have documented the range and diversity of both taxonomic compositions and functional potentials observed in the oral microbiome in healthy individuals. Besides factors specific to the host, such as age and race/ethnicity, environmental factors also appear to contribute to the variability of the healthy oral microbiome. Here, we review bioinformatic techniques for metagenomic datasets, including their strengths and limitations. In addition, we summarize the interpersonal and intrapersonal diversity of the oral microbiome, taking into consideration the recent large-scale and longitudinal studies, including the Human Microbiome Project. PMID:27697111

  18. Cardiovascular genomics, personalized medicine, and the National Heart, Lung, and Blood Institute: part I: the beginning of an era.

    PubMed

    O'Donnell, Christopher J; Nabel, Elizabeth G

    2008-10-01

    The inaugural issue of Circulation: Cardiovascular Genetics arrives at a remarkable time in the history of genetic research and cardiovascular medicine. Despite tremendous progress in knowledge gained, cardiovascular disease (CVD) remains the leading cause of death in the United States,1 and it has overcome infectious diseases as the leading cause of death worldwide.2 In addition, rates of CVD remain higher in black and Hispanic populations in the United States.1 The recent Strategic Plan of the National Heart, Lung, and Blood Institute (NHLBI) emphasizes research areas to fill the significant knowledge gaps needed to improve the diagnosis, treatment, and control of known risk factors and clinically apparent disease. Simultaneously, the NHLBI Strategic Plan recognizes a tremendous opportunity that is available for use of genetic and genomic research to generate new knowledge that might reduce the morbidity and mortality from CVD in US populations.3 Public availability of vast amounts of detailed sequence information about the human genome, completed sequence data on dozens of other animal genomes, and private sector development of high-throughput genetic technologies has transformed in a few short years the conduct of cardiovascular genetics and genomics research from a primary focus on mendelian disorders to a current emphasis on genome-wide association studies (GWAS; Figure 1). In this review, we describe the rationale for the current emphasis on large-scale genomic studies, summarize the evolving approaches and progress to date, and identify immediate-term research needs. The National Institutes of Health (NIH) and the NHLBI are supporting a portfolio of large-scale genetic and genomic programs in diverse US populations with the longer-term objective of translating knowledge into the prediction, prevention, and preemption of CVD, as well as lung, sleep, and blood disorders. Underlying this portfolio is a strong commitment to make available participant-level data and

  19. A Bioinformatics Reference Model: Towards a Framework for Developing and Organising Bioinformatic Resources

    NASA Astrophysics Data System (ADS)

    Hiew, Hong Liang; Bellgard, Matthew

    2007-11-01

    Life Science research faces the constant challenge of how to effectively handle an ever-growing body of bioinformatics software and online resources. The users and developers of bioinformatics resources have a diverse set of competing demands on how these resources need to be developed and organised. Unfortunately, there does not exist an adequate community-wide framework to integrate such competing demands. The problems that arise from this include unstructured standards development, the emergence of tools that do not meet specific needs of researchers, and often times a communications gap between those who use the tools and those who supply them. This paper presents an overview of the different functions and needs of bioinformatics stakeholders to determine what may be required in a community-wide framework. A Bioinformatics Reference Model is proposed as a basis for such a framework. The reference model outlines the functional relationship between research usage and technical aspects of bioinformatics resources. It separates important functions into multiple structured layers, clarifies how they relate to each other, and highlights the gaps that need to be addressed for progress towards a diverse, manageable, and sustainable body of resources. The relevance of this reference model to the bioscience research community, and its implications in progress for organising our bioinformatics resources, are discussed.

  20. Component-Based Approach for Educating Students in Bioinformatics

    ERIC Educational Resources Information Center

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  1. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis

    PubMed Central

    Noar, Roslyn D.; Daub, Margaret E.

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  2. Automatic Discovery and Inferencing of Complex Bioinformatics Web Interfaces

    SciTech Connect

    Ngu, A; Rocco, D; Critchlow, T; Buttler, D

    2003-12-22

    The World Wide Web provides a vast resource to genomics researchers in the form of web-based access to distributed data sources--e.g. BLAST sequence homology search interfaces. However, the process for seeking the desired scientific information is still very tedious and frustrating. While there are several known servers on genomic data (e.g., GenBank, EMBL, NCBI) that are shared and accessed frequently, new data sources are created each day in laboratories all over the world. The sharing of these newly discovered genomics results is hindered by the lack of a common interface or data exchange mechanism. Moreover, the number of autonomous genomics sources and their rate of change outpace the speed at which they can be manually identified, meaning that the available data is not being utilized to its full potential. An automated system that can find, classify, describe and wrap new sources without tedious and low-level coding of source-specific wrappers is needed to help scientists access hundreds of dynamically changing bioinformatics web data sources through a single interface. A correct classification of any kind of Web data source must address both the capability of the source and the conversation/interaction semantics inherent in the design of the Web data source. In this paper, we propose an automatic approach to classify Web data sources that takes into account both the capability and the conversational semantics of the source. The ability to discover the interaction pattern of a Web source leads to increased accuracy in the classification process. At the same time, it facilitates the extraction of process semantics, which is necessary for the automatic generation of wrappers that can interact correctly with the sources.

  3. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    PubMed Central

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849

  4. Nanoinformatics: an emerging area of information technology at the intersection of bioinformatics, computational chemistry and nanobiotechnology.

    PubMed

    González-Nilo, Fernando; Pérez-Acle, Tomás; Guínez-Molinos, Sergio; Geraldo, Daniela A; Sandoval, Claudia; Yévenes, Alejandro; Santos, Leonardo S; Laurie, V Felipe; Mendoza, Hegaly; Cachau, Raúl E

    2011-01-01

    After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.

  5. Genomics and Privacy: Implications of the New Reality of Closed Data for the Field

    PubMed Central

    Greenbaum, Dov; Sboner, Andrea; Mu, Xinmeng Jasmine; Gerstein, Mark

    2011-01-01

    Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can “slice” and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches—for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. However, teaching personal genomics with identifiable subjects

  6. Genomics and privacy: implications of the new reality of closed data for the field.

    PubMed

    Greenbaum, Dov; Sboner, Andrea; Mu, Xinmeng Jasmine; Gerstein, Mark

    2011-12-01

    Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can "slice" and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches, for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. However, teaching personal genomics with identifiable subjects in the

  7. Machine learning: an indispensable tool in bioinformatics.

    PubMed

    Inza, Iñaki; Calvo, Borja; Armañanzas, Rubén; Bengoetxea, Endika; Larrañaga, Pedro; Lozano, José A

    2010-01-01

    The increase in the number and complexity of biological databases has raised the need for modern and powerful data analysis tools and techniques. In order to fulfill these requirements, the machine learning discipline has become an everyday tool in bio-laboratories. The use of machine learning techniques has been extended to a wide spectrum of bioinformatics applications. It is broadly used to investigate the underlying mechanisms and interactions between biological molecules in many diseases, and it is an essential tool in any biomarker discovery process. In this chapter, we provide a basic taxonomy of machine learning algorithms, and the characteristics of main data preprocessing, supervised classification, and clustering techniques are shown. Feature selection, classifier evaluation, and two supervised classification topics that have a deep impact on current bioinformatics are presented. We make the interested reader aware of a set of popular web resources, open source software tools, and benchmarking data repositories that are frequently used by the machine learning community.

  8. A toolbox for developing bioinformatics software

    PubMed Central

    Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.

    2012-01-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  9. Discovery and Classification of Bioinformatics Web Services

    SciTech Connect

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  10. A toolbox for developing bioinformatics software.

    PubMed

    Rother, Kristian; Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M

    2012-03-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers.

  11. Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool

    PubMed Central

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N.; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. PMID:25742008
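
    The rates reported in this abstract can be re-derived from its own counts. The following is a minimal Python sketch, assuming the HapMap 3 comparison forms a 2x2 table over the 415 SNPs in which SNAP reported 201 as missing, 152 of those were in fact on the array, and there were no false positives. The function names are illustrative, and the kappa value is reconstructed from these counts rather than quoted from the paper.

```python
def false_negative_rate(false_negatives: int, total: int) -> float:
    """Share of all queried SNPs wrongly reported as missing."""
    return false_negatives / total


def cohens_kappa(both_yes: int, a_yes_b_no: int, a_no_b_yes: int, both_no: int) -> float:
    """Cohen's kappa for two raters over a 2x2 table of yes/no calls."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n
    b_yes = (both_yes + a_no_b_yes) / n
    # Agreement expected by chance if the two raters were independent.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


# Counts taken from the abstract (HapMap 3 release 2 reference).
TOTAL_SNPS = 415
REPORTED_MISSING = 201
WRONGLY_MISSING = 152  # reported missing by SNAP, present in the Illumina file

# Derived cells, assuming zero false positives as the abstract states.
reported_available = TOTAL_SNPS - REPORTED_MISSING
truly_missing = REPORTED_MISSING - WRONGLY_MISSING

fnr = false_negative_rate(WRONGLY_MISSING, TOTAL_SNPS)
kappa = cohens_kappa(
    both_yes=reported_available,  # SNAP: available, Illumina: available
    a_yes_b_no=0,                 # SNAP: available, Illumina: missing
    a_no_b_yes=WRONGLY_MISSING,   # SNAP: missing, Illumina: available
    both_no=truly_missing,        # SNAP: missing, Illumina: missing
)
print(f"false-negative rate: {fnr:.1%}")
print(f"Cohen's kappa: {kappa:.2f}")
```

    Dividing 152 by all 415 queried SNPs reproduces the reported 36.6%, so the rate appears to be taken over the full SNP list rather than only over SNPs present on the array. Under the same assumptions the kappa comes out near 0.25, which sits in the range conventionally labelled "fair".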

  12. The growing need for microservices in bioinformatics

    PubMed Central

    Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and, at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on a nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Conclusions: Use of the microservices framework is an effective
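
    To make the idea of a service "limited in functional scope" concrete, here is a minimal Python sketch using only the standard library. It is not taken from the paper: the `/gc` endpoint, the `gc_content` helper and the `make_server` function are hypothetical names chosen for illustration, and a production service would add input validation, logging and deployment tooling.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 for empty input)."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


class GCHandler(BaseHTTPRequestHandler):
    """Hypothetical single-purpose service: GET /gc?seq=ACGT returns JSON."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/gc":
            self.send_error(404, "unknown endpoint")
            return
        seq = parse_qs(url.query).get("seq", [""])[0]
        body = json.dumps({"length": len(seq), "gc": gc_content(seq)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real service would log requests


def make_server(port: int = 0) -> ThreadingHTTPServer:
    """Bind the service to localhost; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), GCHandler)
```

    A caller would start it with `make_server(8080).serve_forever()`, and other pipeline stages would reach it over HTTP. Because the service does one thing behind a small interface, it can be tested, redeployed or replaced without touching the rest of the pipeline.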

  13. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  14. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  15. Bioinformatics approaches to single-cell analysis in developmental biology.

    PubMed

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology.

  16. Bioinformatic Analysis of HIV-1 Entry and Pathogenesis

    PubMed Central

    Aiamkitsumrit, Benjamas; Dampier, Will; Antell, Gregory; Rivera, Nina; Martin-Garcia, Julio; Pirrone, Vanessa; Nonnemacher, Michael R.; Wigdahl, Brian

    2015-01-01

    The evolution of human immunodeficiency virus type 1 (HIV-1) with respect to co-receptor utilization has been shown to be relevant to HIV-1 pathogenesis and disease. The CCR5-utilizing (R5) virus has been shown to be important in the very early stages of transmission and highly prevalent during asymptomatic infection and chronic disease. In addition, the R5 virus has been proposed to be involved in neuroinvasion and central nervous system (CNS) disease. In contrast, the CXCR4-utilizing (X4) virus is more prevalent during the course of disease progression and concurrent with the loss of CD4+ T cells. The dual-tropic virus is able to utilize both co-receptors (CXCR4 and CCR5) and has been thought to represent an intermediate transitional virus that possesses properties of both X4 and R5 viruses that can be encountered at many stages of disease. The use of computational tools and bioinformatic approaches in the prediction of HIV-1 co-receptor usage has been growing in importance with respect to understanding HIV-1 pathogenesis and disease, developing diagnostic tools, and improving the efficacy of therapeutic strategies focused on blocking viral entry. Current strategies have enhanced the sensitivity, specificity, and reproducibility relative to the prediction of co-receptor use; however, these technologies need to be improved with respect to their efficient and accurate use across the HIV-1 subtypes. The most effective approach may center on the combined use of different algorithms involving sequences within and outside of the env-V3 loop. This review focuses on the HIV-1 entry process and on co-receptor utilization, including bioinformatic tools utilized in the prediction of co-receptor usage. It also provides novel preliminary analyses for enabling identification of linkages between amino acids in V3 with other components of the HIV-1 genome and demonstrates that these linkages are different between X4 and R5 viruses. PMID:24862329

  17. Hydroxysteroid dehydrogenases (HSDs) in bacteria: a bioinformatic perspective.

    PubMed

    Kisiela, Michael; Skarka, Adam; Ebert, Bettina; Maser, Edmund

    2012-03-01

    Steroidal compounds including cholesterol, bile acids and steroid hormones play a central role in various physiological processes such as cell signaling, growth, reproduction, and energy homeostasis. Hydroxysteroid dehydrogenases (HSDs), which belong to the superfamily of short-chain dehydrogenases/reductases (SDR) or aldo-keto reductases (AKR), are important enzymes involved in steroid hormone metabolism. HSDs function as an enzymatic switch that controls the access of receptor-active steroids to nuclear hormone receptors and thereby mediate a fine-tuning of the steroid response. The aim of this study was the identification of classified functional HSDs and the bioinformatic annotation of these proteins in all completely sequenced bacterial genomes followed by a phylogenetic analysis. For the bioinformatic annotation we constructed specific hidden Markov models in an iterative approach to provide a reliable identification for the specific catalytic groups of HSDs. Here, we show a detailed phylogenetic analysis of 3α-, 7α-, 12α-HSDs and two further functionally related enzymes (3-ketosteroid-Δ(1)-dehydrogenase, 3-ketosteroid-Δ(4)(5α)-dehydrogenase) from the superfamily of SDRs. For some bacteria that have been previously reported to possess a specific HSD activity, we could annotate the corresponding HSD protein. The dominant phyla identified as expressing HSDs were Actinobacteria, Proteobacteria, and Firmicutes. Moreover, some evolutionarily more ancient microorganisms (e.g., Cyanobacteria and Euryarchaeota) were found as well. A large number of HSD-expressing bacteria constitute the normal human gastro-intestinal flora. Another group of bacteria were originally isolated from natural habitats like seawater, soil, marine and permafrost sediments. These bacteria include polycyclic aromatic hydrocarbon-degrading species such as Pseudomonas, Burkholderia and Rhodococcus. In conclusion, HSDs are found in a wide variety of microorganisms including

  18. Adapting bioinformatics curricula for big data

    PubMed Central

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  19. Bioinformatic pipelines in Python with Leaf

    PubMed Central

    2013-01-01

    Background: An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results: Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions: Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315
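
    The general idea of declaring a pipeline as steps plus explicit dependencies can be sketched in plain Python. This is not Leaf's syntax or API, which the abstract does not show; the step names, the `PIPELINE` table and `run_pipeline` are hypothetical, and the ordering relies on `graphlib.TopologicalSorter` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def load_reads():
    return ["ACGT", "GGCC", "ATAT"]


def filter_reads(reads):
    # Keep reads with at least 50% G/C bases.
    return [r for r in reads if (r.count("G") + r.count("C")) / len(r) >= 0.5]


def count_reads(filtered):
    return len(filtered)


# Each step maps to (function, names of the steps whose outputs it consumes).
PIPELINE = {
    "load": (load_reads, []),
    "filter": (filter_reads, ["load"]),
    "count": (count_reads, ["filter"]),
}


def run_pipeline(pipeline):
    """Run every step once, in an order that respects declared dependencies."""
    graph = {name: set(deps) for name, (_, deps) in pipeline.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = pipeline[name]
        results[name] = func(*(results[d] for d in deps))
    return results


results = run_pipeline(PIPELINE)
print(results["count"])
```

    Keeping the dependency table separate from the step functions is what makes features such as caching intermediate results or checking consistency between steps possible to add later, since the runner knows which outputs feed which steps.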

  20. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.
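
    The rule-based approach, and the ambiguity it has to cope with, can be illustrated with a small Python sketch. The example sentence and both rules are hypothetical and not drawn from the chapter: a naive pattern for point-mutation mentions also matches a gene name, and adding domain knowledge (the 20 standard one-letter amino-acid codes) removes that false positive.

```python
import re

# Hypothetical rule: a point mutation looks like <letter><position><letter>.
NAIVE_RULE = re.compile(r"\b[A-Z]\d+[A-Z]\b")

# Refined rule: both letters must be one of the 20 standard amino-acid codes.
AMINO = "ACDEFGHIKLMNPQRSTVWY"
REFINED_RULE = re.compile(rf"\b[{AMINO}]\d+[{AMINO}]\b")

text = "The BRAF V600E variant was detected; serum S100B was also elevated."

naive_hits = NAIVE_RULE.findall(text)      # also picks up the gene name S100B
refined_hits = REFINED_RULE.findall(text)  # B is not a standard residue code

print(naive_hits)
print(refined_hits)
```

    The refined rule is still only a sketch: a token with the same surface shape can be a mutation in one sentence and something else in another, which is one reason hybrid systems combine such rules with statistical models.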

  1. Bringing Web 2.0 to bioinformatics.

    PubMed

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  2. [Ethical issues raised by direct-to-consumer personal genome analysis and whole body scans: discussion and contextualisation of a report by the Nuffield Council on Bioethics].

    PubMed

    Buyx, Alena M; Strech, Daniel; Schmidt, Harald

    2012-01-01

    The paradigm of personalised medicine has many different facets, further to the application of pharmacogenetics. We examine here (direct-to-consumer) personal genome analysis and whole body scans and summarise findings from the Nuffield Council on Bioethics' recent report "Medical profiling and online medicine: the ethics of 'personalised healthcare' in a consumer age". We describe the current situation in Germany with regard to access to such services, and contextualise the Nuffield Council's report with summaries of position statements by German professional bodies. We conclude with three points that merit examination further to the analyses of the Nuffield Council's report and the German professional bodies. These concern the role of indirect evidence in considering restrictive policies, the question of whether regulations should require commercial providers to contribute to the generation of better evidence, and the option of using data from evaluations in combination with indirect evidence in justifying restrictive policies.

  3. Interoperability of GADU in using heterogeneous Grid resources for bioinformatics applications.

    SciTech Connect

    Sulakhe, D.; Rodriguez, A.; Wilde, M.; Foster, I.; Maltsev, N.; Univ. of Chicago

    2008-03-01

    Bioinformatics tools used for efficient and computationally intensive analysis of genetic sequences require large-scale computational resources to accommodate the growing data. Grid computational resources such as the Open Science Grid and TeraGrid have proved useful for scientific discovery. The genome analysis and database update system (GADU) is a high-throughput computational system developed to automate the steps involved in accessing the Grid resources for running bioinformatics applications. This paper describes the requirements for building an automated scalable system such as GADU that can run jobs on different Grids. The paper describes the resource-independent configuration of GADU using the Pegasus-based virtual data system that makes high-throughput computational tools interoperable on heterogeneous Grid resources. The paper also highlights the features implemented to make GADU a gateway to computationally intensive bioinformatics applications on the Grid. The paper will not go into the details of problems involved or the lessons learned in using individual Grid resources as it has already been published in our paper on genome analysis research environment (GNARE) and will focus primarily on the architecture that makes GADU resource independent and interoperable across heterogeneous Grid resources.

  4. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    NASA Astrophysics Data System (ADS)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  5. A genome-wide linkage analysis for the personality trait neuroticism in the Irish affected sib-pair study of alcohol dependence.

    PubMed

    Kuo, Po-Hsiu; Neale, Michael C; Riley, Brien P; Patterson, Diana G; Walsh, Dermot; Prescott, Carol A; Kendler, Kenneth S

    2007-06-05

    Neuroticism is a personality trait which reflects individual differences in emotional stability and vulnerability to stress and anxiety. Consistent evidence shows substantial genetic influences on variation in this trait. The present study seeks to identify regions containing susceptibility loci for neuroticism using a selected sib-pair sample from Ireland. Using Merlin regress, we conducted a 4 cM whole-genome linkage analysis on 714 sib-pairs. Evidence for linkage to neuroticism was found on chromosomes 11p, 12q, and 15q. The highest linkage peak was on 12q at marker D12S1638 with a LOD score of 2.13 (-log p = 2.76, empirical P-value <0.001). Our data also support gender-specific loci for neuroticism, with male-specific linkage regions on chromosomes 1, 4, 11, 12, 15, 16, and 22, and female-specific linkage regions on chromosomes 2, 4, 9, 12, 13, and 18. Some genome regions reported in the present study replicate findings from previous linkage studies of neuroticism. These results, together with prior studies, indicate several potential regions for quantitative trait loci for neuroticism that warrant further study.
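
    The relation between the reported LOD score and its -log p value can be checked with a short calculation. The sketch below is an assumption about how the two figures relate, not the authors' stated method: it treats 2 ln(10) x LOD as a chi-square statistic with one degree of freedom and applies no one-sided halving. The function name is hypothetical.

```python
import math

def lod_to_neg_log10_p(lod: float) -> float:
    """Convert a LOD score to -log10 of a pointwise p-value.

    Assumption: 2 * ln(10) * LOD is treated as a chi-square statistic
    with one degree of freedom, with no one-sided halving.
    """
    chi_square = 2.0 * math.log(10.0) * lod
    # Survival function of chi-square (1 df): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return -math.log10(p_value)

# LOD score reported for marker D12S1638 in the abstract above.
neg_log10_p = lod_to_neg_log10_p(2.13)
print(round(neg_log10_p, 2))
```

    Under these assumptions the result lands close to the 2.76 quoted in the abstract. The empirical P-value is a separate quantity that this conversion does not reproduce.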

  6. CFGP 2.0: a versatile web-based platform for supporting comparative and evolutionary genomics of fungi and Oomycetes.

    PubMed

    Choi, Jaeyoung; Cheong, Kyeongchae; Jung, Kyongyong; Jeon, Jongbum; Lee, Gir-Won; Kang, Seogchan; Kim, Sangsoo; Lee, Yin-Won; Lee, Yong-Hwan

    2013-01-01

    In 2007, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr/) was publicly opened with 65 genomes corresponding to 58 fungal and Oomycete species. The CFGP provided six bioinformatics tools, including a novel tool entitled BLASTMatrix that enables searching for genes homologous to queries in multiple species simultaneously. CFGP also introduced Favorite, a personalized virtual space for data storage and analysis with these six tools. Since 2007, CFGP has grown to archive 283 genomes corresponding to 152 fungal and Oomycete species as well as 201 genomes that correspond to seven bacteria, 39 plants and 105 animals. In addition, the number of tools in Favorite increased to 27. The Taxonomy Browser of CFGP 2.0 allows users to interactively navigate through a large number of genomes according to their taxonomic positions. The user interface of BLASTMatrix was also improved to facilitate subsequent analyses of retrieved data. A newly developed genome browser, Seoul National University Genome Browser (SNUGB), was integrated into CFGP 2.0 to support graphical presentation of diverse genomic contexts. Based on the standardized genome warehouse of CFGP 2.0, several systematic platforms designed to support studies on selected gene families have been developed. Most of them are connected through Favorite to allow sharing of data across the platforms.

  7. A Review of Bioinformatics Tools for Bio-Prospecting from Metagenomic Sequence Data

    PubMed Central

    Roumpeka, Despoina D.; Wallace, R. John; Escalettes, Frank; Fotheringham, Ian; Watson, Mick

    2017-01-01

    The microbiome can be defined as the community of microorganisms that live in a particular environment. Metagenomics is the practice of sequencing DNA from the genomes of all organisms present in a particular sample, and has become a common method for the study of microbiome population structure and function. Increasingly, researchers are finding novel genes encoded within metagenomes, many of which may be of interest to the biotechnology and pharmaceutical industries. However, such “bioprospecting” requires a suite of sophisticated bioinformatics tools to make sense of the data. This review summarizes the most commonly used bioinformatics tools for the assembly and annotation of metagenomic sequence data with the aim of discovering novel genes. PMID:28321234

  8. Optimizing selection of microsatellite loci from 454 pyrosequencing via post-sequencing bioinformatic analyses.

    PubMed

    Fernandez-Silva, Iria; Toonen, Robert J

    2013-01-01

    The comparatively low cost of massive parallel sequencing technology, also known as next-generation sequencing (NGS), has transformed the isolation of microsatellite loci. The most common NGS approach consists of obtaining large amounts of sequence data from genomic DNA or enriched microsatellite libraries, which is then mined for the discovery of microsatellite repeats using bioinformatics analyses. Here, we describe a bioinformatics approach to isolate microsatellite loci, starting from the raw sequence data through a subset of microsatellite primer pairs. The primary difference to previously published approaches includes analyses to select the most accurate sequence data and to eliminate repetitive elements prior to the design of primers. These analyses aim to minimize the testing of primer pairs by identifying the most promising microsatellite loci.
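
    The repeat-mining step mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline, which also selects the most accurate reads and removes repetitive elements before primer design. The function name, the motif length range (2 to 6 bases) and the minimum repeat count (5) are assumptions chosen for illustration.

```python
import re

def find_microsatellites(sequence, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    # A motif of min_unit..max_unit bases followed by at least
    # (min_repeats - 1) further copies of itself.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits

# Toy read: six copies of the dinucleotide AC flanked by other bases.
hits = find_microsatellites("TTACACACACACACGG")
print(hits)  # [(2, 'AC', 6)]
```

    Only perfect repeats are reported here; interrupted or compound microsatellites would need a more tolerant search.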

  9. Advances in Omics and Bioinformatics Tools for Systems Analyses of Plant Functions

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2011-01-01

    Omics and bioinformatics are essential to understanding the molecular systems that underlie various plant functions. Recent game-changing sequencing technologies have revitalized sequencing approaches in genomics and have produced opportunities for various emerging analytical applications. Driven by technological advances, several new omics layers such as the interactome, epigenome and hormonome have emerged. Furthermore, in several plant species, the development of omics resources has progressed to address particular biological properties of individual species. Integration of knowledge from omics-based research is an emerging issue as researchers seek to identify significance, gain biological insights and promote translational research. From these perspectives, we provide this review of the emerging aspects of plant systems research based on omics and bioinformatics analyses together with their associated resources and technological advances. PMID:22156726

  10. Role of remote sensing, geographical information system (GIS) and bioinformatics in kala-azar epidemiology

    PubMed Central

    Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep

    2011-01-01

    Visceral leishmaniasis, or kala-azar, is a potent parasitic infection that causes the death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction is also to be considered in patients infected with Leishmania donovani (L. donovani). For complete eradication of this disease, high-level modern research is currently being applied both at the molecular level and at the field level. Computational approaches such as remote sensing, geographical information system (GIS) and bioinformatics are key resources for the detection and distribution of vectors, patterns, ecological and environmental factors and genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visceral leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another. PMID:23554714

  11. [Research thoughts on structural components of Chinese medicine combined with bioinformatics].

    PubMed

    Wang, Cheng-cheng; Feng, Liang; Liu, Dan; Cui, Li; Tan, Xiao-bin; Jia, Xiao-bin

    2015-11-01

    Traditional Chinese medicine (TCM) is a complex system characterized by integrity and distinctive characteristics. Structural component TCM is a well-organized whole of traditional Chinese medicine, reflecting the multi-component integration effect of TCM. It gives us a new view on the material basis of TCM. Currently, conventional research strategies are not enough to deal with the relationship between material basis and efficacy, multi-composition, multi-targets, and multi-section mechanism. The post-genome era gave birth to bioinformatics, which involves systems biology, different levels of omics, and the corresponding mathematics and computer techniques. It is increasingly becoming a powerful tool for understanding complicated systems and the essential laws of life. Research ideas, methods, and knowledge of data mining technology from bioinformatics, combined with the theory of structural components of Chinese medicine, bring a new opportunity for developing structural components of Chinese medicine, systematically exploring the essence of TCM and promoting the modernization of TCM.

  12. A web services choreography scenario for interoperating bioinformatics applications

    PubMed Central

    de Knikker, Remko; Guo, Youjun; Li, Jin-long; Kwan, Albert KH; Yip, Kevin Y; Cheung, David W; Cheung, Kei-Hoi

    2004-01-01

    Background Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow. Results To demonstrate the benefit of using web services over traditional web interfaces, we compare the two implementations of HAPI, a gene expression analysis utility developed by the University of California San Diego (UCSD) that allows visual characterization of groups or clusters of genes based on the biomedical literature. This utility takes a set of microarray spot IDs as input and outputs a hierarchy of MeSH Keywords that correlates to the input and is grouped by Medical Subject Heading (MeSH) category. While the HTML output is easy for humans to visualize, it is difficult for computer applications to interpret semantically. To facilitate the capability of machine processing, we have created a workflow of three web services that replicates the HAPI functionality. These web services use document-style messages, which means that messages are encoded in an XML-based format. We compared three approaches to the implementation of an XML-based workflow: a hard coded Java application, Collaxa BPEL Server and Taverna Workbench. The Java program functions as a web services engine and interoperates with these web

  13. CucCAP - Developing genomic resources for the cucurbit community

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The U.S. cucurbit community has initiated a USDA-SCRI funded cucurbit genomics project, CucCAP: Leveraging applied genomics to increase disease resistance in cucurbit crops. Our primary objectives are: develop genomic and bioinformatic breeding tool kits for accelerated crop improvement across the...

  14. Critical Issues in Bioinformatics and Computing

    PubMed Central

    Kesh, Someswa; Raghupathi, Wullianallur

    2004-01-01

    This article provides an overview of the field of bioinformatics and its implications for the various participants. Next-generation issues facing developers (programmers), users (molecular biologists), and the general public (patients) who would benefit from the potential applications are identified. The goal is to create awareness and debate on the opportunities (such as career paths) and the challenges such as privacy that arise. A triad model of the participants' roles and responsibilities is presented along with the identification of the challenges and possible solutions. PMID:18066389

  15. Microbial bioinformatics for food safety and production.

    PubMed

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel; van Hijum, Sacha A F T

    2016-03-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput 'omics' technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety.

  16. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    NASA Technical Reports Server (NTRS)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, which uses a new sigmoid-logarithmic transfer function and is based on the error backpropagation (EBP) algorithm, has been devised. Our results show that the new ANN, once trained, can recognize low-fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  17. Multiobjective optimization in bioinformatics and computational biology.

    PubMed

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.
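
    A central concept in multiobjective optimization is Pareto dominance: one solution dominates another if it is no worse on every objective and strictly better on at least one. The sketch below illustrates that idea for minimization. It is a generic example, not code from the review, and the function names and candidate points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidates scored on two objectives, both to be minimized.
candidates = [(1, 5), (2, 2), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
print(front)  # [(1, 5), (2, 2), (4, 1)]
```

    The quadratic all-pairs comparison is fine for small candidate sets; larger problems usually call for a sorting-based method.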

  18. Translational Bioinformatics: Past, Present, and Future

    PubMed Central

    Tenenbaum, Jessica D.

    2016-01-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field. PMID:26876718

  19. Microbial bioinformatics for food safety and production

    PubMed Central

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel

    2016-01-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput ‘omics’ technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety. PMID:26082168

  20. [Comparison of mitochondrial genomes of bivalves].

    PubMed

    SONG, Wen-Tao; GAO, Xiang-Gang; LI, Yun-Feng; LIU, Wei-Dong; LIU, Ying; HE, Chong-Bo

    2009-11-01

    The structure and organization of the mitochondrial genomes of 14 marine bivalves and two freshwater bivalves were analyzed using comparative genomics and bioinformatics methods. The results showed that the organization and gene order of the mitochondrial genomes of the bivalve species studied differed from each other. The size, organization, gene numbers, and gene order of mitochondrial genomes also differed among bivalves of different taxa. Phylogenetic analyses based on the whole mitochondrial genomes and on all the coding genes gave different results: the analysis using the whole mitochondrial genomes was consistent with the existing classification, whereas the analysis using all coding genes was not.

  1. Teaching the ABCs of bioinformatics: a brief introduction to the Applied Bioinformatics Course

    PubMed Central

    2014-01-01

    With the development of the Internet and the growth of online resources, bioinformatics training for wet-lab biologists became necessary as a part of their education. This article describes a one-semester course ‘Applied Bioinformatics Course’ (ABC, http://abc.cbi.pku.edu.cn/) that the author has been teaching to biological graduate students at the Peking University and the Chinese Academy of Agricultural Sciences for the past 13 years. ABC is a hands-on practical course to teach students to use online bioinformatics resources to solve biological problems related to their ongoing research projects in molecular biology. With a brief introduction to the background of the course, detailed information about the teaching strategies of the course is outlined in the ‘How to teach’ section. The contents of the course are briefly described in the ‘What to teach’ section with some real examples. The author wishes to share his teaching experiences and the online teaching materials with colleagues working in bioinformatics education both in local and international universities. PMID:24008274

  2. OpenHelix: bioinformatics education outside of a different box

    PubMed Central

    Mangan, Mary E.; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C.

    2010-01-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181

  3. Contribution of Bioinformatics prediction in microRNA-based cancer therapeutics

    PubMed Central

    Banwait, Jasjit K; Bastola, Dhundy R

    2014-01-01

    Despite enormous efforts, cancer remains one of the most lethal diseases in the world. With the advancement of high throughput technologies massive amounts of cancer data can be accessed and analyzed. Bioinformatics provides a platform to assist biologists in developing minimally invasive biomarkers to detect cancer, and in designing effective personalized therapies to treat cancer patients. Still, the early diagnosis, prognosis, and treatment of cancer are an open challenge for the research community. MicroRNAs (miRNAs) are small non-coding RNAs that serve to regulate gene expression. The discovery of deregulated miRNAs in cancer cells and tissues has led many to investigate the use of miRNAs as potential biomarkers for early detection, and as a therapeutic agent to treat cancer. Here we describe advancements in computational approaches to predict miRNAs and their targets, and discuss the role of bioinformatics in studying miRNAs in the context of human cancer. PMID:25450261

  4. Contribution of bioinformatics prediction in microRNA-based cancer therapeutics.

    PubMed

    Banwait, Jasjit K; Bastola, Dhundy R

    2015-01-01

    Despite enormous efforts, cancer remains one of the most lethal diseases in the world. With the advancement of high throughput technologies massive amounts of cancer data can be accessed and analyzed. Bioinformatics provides a platform to assist biologists in developing minimally invasive biomarkers to detect cancer, and in designing effective personalized therapies to treat cancer patients. Still, the early diagnosis, prognosis, and treatment of cancer are an open challenge for the research community. MicroRNAs (miRNAs) are small non-coding RNAs that serve to regulate gene expression. The discovery of deregulated miRNAs in cancer cells and tissues has led many to investigate the use of miRNAs as potential biomarkers for early detection, and as a therapeutic agent to treat cancer. Here we describe advancements in computational approaches to predict miRNAs and their targets, and discuss the role of bioinformatics in studying miRNAs in the context of human cancer.

  5. Tools and collaborative environments for bioinformatics research

    PubMed Central

    Giugno, Rosalba; Pulvirenti, Alfredo

    2011-01-01

    Advanced research requires intensive interaction among a multitude of actors, often possessing different expertise and usually working at a distance from each other. The field of collaborative research aims to establish suitable models and technologies to properly support these interactions. In this article, we first present the reasons for an interest of Bioinformatics in this context by also suggesting some research domains that could benefit from collaborative research. We then review the principles and some of the most relevant applications of social networking, with a special attention to networks supporting scientific collaboration, by also highlighting some critical issues, such as identification of users and standardization of formats. We then introduce some systems for collaborative document creation, including wiki systems and tools for ontology development, and review some of the most interesting biological wikis. We also review the principles of Collaborative Development Environments for software and show some examples in Bioinformatics. Finally, we present the principles and some examples of Learning Management Systems. In conclusion, we try to devise some of the goals to be achieved in the short term for the exploitation of these technologies. PMID:21984743

  6. Genome-wide screening for highly discriminative SNPs for personal identification and their assessment in world populations.

    PubMed

    Li, Liming; Wang, Yi; Yang, Shuping; Xia, Mingying; Yang, Yajun; Wang, Jiucun; Lu, Daru; Pan, Xingwei; Ma, Teng; Jiang, Pei; Yu, Ge; Zhao, Ziqin; Ping, Yuan; Zhou, Huaigu; Zhao, Xueying; Sun, Hui; Liu, Bing; Jia, Dongtao; Li, Chengtao; Hu, Rile; Lu, Hongzhou; Liu, Xiaoyang; Chen, Wenqing; Mi, Qin; Xue, Fuzhong; Su, Yongdong; Jin, Li; Li, Shilin

    2017-05-01

    The applications of DNA profiling aim to identify perpetrators, missing family members and disaster victims in forensic investigations. Single nucleotide polymorphism (SNP)-based forensic applications are emerging rapidly, with the potential to replace the short tandem repeat (STR)-based panels that are now widely used, and a well-designed SNP panel is needed to meet the challenge of this transition. Here we present a panel of 175 SNP markers (referred to as the Fudan ID Panel, or FID), selected from ∼3.6 million SNPs, for the application of personal identification. We optimized and validated the FID panel in 729 Chinese individuals using next-generation sequencing (NGS) technology. We showed that the SNPs in the panel possess very high heterozygosity as well as low within- and among-continent differentiation, enabling the FID panel to exhibit discrimination power in both regional and worldwide populations, with average match probabilities ranging from 4.77×10^-71 to 1.06×10^-64 across 54 world populations. With the advance of biomedical research, SNPs connected to physical anthropological, physiological, behavioral and phenotypic traits will eventually be added to forensic panels, which will revolutionize criminal investigation.
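
    The order of magnitude of these match probabilities can be checked with a simple calculation. The sketch below is not the authors' procedure. It assumes Hardy-Weinberg genotype proportions, independence between SNPs, and, as an idealized best case, an allele frequency of 0.5 at every one of the 175 loci. Function names are hypothetical.

```python
import math

def snp_match_probability(allele_freq):
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    Assumption: Hardy-Weinberg genotype proportions p^2, 2pq, q^2.
    """
    p, q = allele_freq, 1.0 - allele_freq
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)

def panel_log10_match_probability(allele_freqs):
    """log10 of the panel-wide match probability, assuming independent SNPs."""
    # Summing logs avoids floating-point underflow for large panels.
    return sum(math.log10(snp_match_probability(f)) for f in allele_freqs)

# Idealized 175-SNP panel in which every allele frequency is exactly 0.5.
log10_mp = panel_log10_match_probability([0.5] * 175)
print(round(log10_mp, 2))  # -74.54
```

    Real allele frequencies deviate from 0.5 and vary between populations, so observed match probabilities are expected to be larger than this idealized figure, which is in line with the range reported in the abstract.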

  7. Graphics processing units in bioinformatics, computational biology and systems biology.

    PubMed

    Nobile, Marco S; Cazzaniga, Paolo; Tangherloni, Andrea; Besozzi, Daniela

    2016-07-08

    Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), therefore limiting their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining increasing attention from the scientific community, as they can considerably reduce the running time required by standard CPU-based software, and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and the drawbacks in the use of these parallel architectures. The complete list of GPU-powered tools here reviewed is available at http://bit.ly/gputools.

  8. The new physician as unwitting quantum mechanic: is adapting Dirac's inference system best practice for personalized medicine, genomics, and proteomics?

    PubMed

    Robson, Barry

    2007-08-01

    What is the Best Practice for automated inference in Medical Decision Support for personalized medicine? A known system already exists as Dirac's inference system from quantum mechanics (QM) using bra-kets ⟨A|B⟩ and bras ⟨A| and kets |B⟩, where A and B are states, events, or measurements representing, say, clinical and biomedical rules. Dirac's system should theoretically be the universal best practice for all inference, though QM is notorious as sometimes leading to bizarre conclusions that appear not to be applicable to the macroscopic world of everyday human experience and medical practice. It is here argued that this apparent difficulty vanishes if QM is assigned one new multiplication function @, which conserves conditionality appropriately, making QM applicable to classical inference including a quantitative form of the predicate calculus. An alternative interpretation with the same consequences is if every i = √(-1) in Dirac's QM is replaced by h, an entity distinct from 1 and i and arguably a hidden root of 1 such that h² = 1. With that exception, this paper is thus primarily a review of the application of Dirac's system, by application of linear algebra in the complex domain to help manipulate information about associations and ontology in complicated data. Any combined bra-ket ⟨A|B⟩ can be shown to be composed only of the sum of QM-like bra and ket weights c(⟨A|) and c(|B⟩), times an exponential function of Fano's mutual information measure I(A; B) about the association between A and B, that is, an association rule from data mining. With the weights and Fano measure re-expressed as expectations on finite data using Riemann's Incomplete (i.e., Generalized) Zeta Functions, actual counts of observations for real world sparse data can be readily utilized. Finally, the paper compares identical character, distinguishability of states events or measurements, correlation, mutual information, and orthogonal character, important issues in data mining
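
    Fano's mutual information measure I(A; B), in its standard form, is the logarithm of the ratio between the joint probability of A and B and the product of their marginal probabilities. The sketch below estimates it from raw observation counts. This is a simplification: the paper describes re-expressing the measure as expectations via zeta functions to cope with sparse data, which is not attempted here. The function name, the use of natural logarithms and the example counts are assumptions.

```python
import math

def fano_mutual_information(n_ab, n_a, n_b, n_total):
    """Estimate I(A; B) = ln[P(A, B) / (P(A) * P(B))] from counts.

    Assumption: plain relative frequencies stand in for probabilities,
    so every count must be greater than zero.
    """
    p_ab = n_ab / n_total
    p_a = n_a / n_total
    p_b = n_b / n_total
    return math.log(p_ab / (p_a * p_b))

# Hypothetical counts: A and B co-occur in 40 of 1000 records, while
# independence would predict 1000 * 0.1 * 0.2 = 20 co-occurrences.
info = fano_mutual_information(n_ab=40, n_a=100, n_b=200, n_total=1000)
print(round(info, 3))  # 0.693
```

    A positive value indicates that A and B co-occur more often than expected under independence, a negative value less often, and zero corresponds to no association.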

  9. An overview of bioinformatics methods for modeling biological pathways in yeast.

    PubMed

    Hou, Jie; Acharya, Lipi; Zhu, Dongxiao; Cheng, Jianlin

    2016-03-01

    The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein-protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start with reviewing the research on biological pathways followed by discussing key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed.

  10. The rise of genomics.

    PubMed

    Weissenbach, Jean

    2016-01-01

    A brief history of the development of genomics is provided. Complete sequencing of genomes of uni- and multicellular organisms is based on important progress in sequencing and bioinformatics. Evolution of these methods is ongoing and has triggered an explosion in data production and analysis. Initial analyses focused on the inventory of genes encoding proteins. Completeness and quality of gene prediction remain crucial. Genome analyses profoundly modified our views on evolution and biodiversity, and contributed to the detection of new functions, yet to be fully elucidated, such as those fulfilled by non-coding RNAs. Genomics has become the basis for the study of biology and provides the molecular support for a host of large-scale studies, the omics.

  11. Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery

    PubMed Central

    Blundell, Tom L; Sibanda, Bancinyane L; Montalvão, Rinaldo Wander; Brewerton, Suzanne; Chelliah, Vijayalakshmi; Worth, Catherine L; Harmer, Nicholas J; Davies, Owen; Burke, David

    2006-01-01

    Impressive progress in genome sequencing, protein expression and high-throughput crystallography and NMR has radically transformed the opportunities to use protein three-dimensional structures to accelerate drug discovery, but the quantity and complexity of the data have ensured a central place for informatics. Structural biology and bioinformatics have assisted in lead optimization and target identification where they have well established roles; they can now contribute to lead discovery, exploiting high-throughput methods of structure determination that provide powerful approaches to screening of fragment binding. PMID:16524830

  12. The Web as an educational tool for/in learning/teaching bioinformatics statistics.

    PubMed

    Oliver, J; Pisano, M E; Alonso, T; Roca, P

    2005-12-01

    Statistics provides an essential tool in bioinformatics to interpret the results of a database search or for the management of enormous amounts of information provided from genomics, proteomics and metabolomics. The goal of this project was the development of a software tool that would be as simple as possible to demonstrate the use of bioinformatics statistics. Computer Simulation Methods (CSMs) developed using Microsoft Excel were chosen for their broad range of applications, immediate and easy formula calculation, immediate testing and easy graphics representation, and their general use and acceptance by the scientific community. The result of these endeavours is a set of utilities which can be accessed from the following URL: http://gmein.uib.es/bioinformatica/statistics. When tested on students with previous coursework with traditional statistical teaching methods, the overall consensus was that Web-based instruction had numerous advantages, but traditional methods with manual calculations were also needed for their theory and practice. Once the basic statistical formulas had been mastered, Excel spreadsheets and graphics were shown to be very useful for trying many parameters in a rapid fashion without having to perform tedious calculations. CSMs will be of great importance for the training of students and professionals in the field of bioinformatics, and for upcoming applications of self-learning and continuing education.

  13. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    ERIC Educational Resources Information Center

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  14. Assessment of a Bioinformatics across Life Science Curricula Initiative

    ERIC Educational Resources Information Center

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  15. Is there room for ethics within bioinformatics education?

    PubMed

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  16. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    ERIC Educational Resources Information Center

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  17. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    PubMed

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.

  18. Wrapping and interoperating bioinformatics resources using CORBA.

    PubMed

    Stevens, R; Miller, C

    2000-02-01

    Bioinformaticians seeking to provide services to working biologists are faced with the twin problems of distribution and diversity of resources. Bioinformatics databases are distributed around the world and exist in many kinds of storage forms, platforms and access paradigms. To provide adequate services to biologists, these distributed and diverse resources have to interoperate seamlessly within single applications. The Common Object Request Broker Architecture (CORBA) offers one technical solution to these problems. The key component of CORBA is its use of object orientation as an intermediate form to translate between different representations. This paper concentrates on an explanation of object orientation and how it can be used to overcome the problems of distribution and diversity by describing the interfaces between objects.

  19. Bioinformatics Resources for MicroRNA Discovery

    PubMed Central

    Moore, Alyssa C.; Winkjer, Jonathan S.; Tseng, Tsai-Tien

    2015-01-01

    Biomarker identification is often associated with the diagnosis and evaluation of various diseases. Recently, the role of microRNA (miRNA) has been implicated in the development of diseases, particularly cancer. With the advent of next-generation sequencing, the amount of data on miRNA has increased tremendously in the last decade, requiring new bioinformatics approaches for processing and storing new information. New strategies have been developed in mining these sequencing datasets to allow a better understanding of the actions of miRNAs. As a result, many databases have also been established to disseminate these findings. This review focuses on several curated databases of miRNAs and their targets from both predicted and validated sources. PMID:26819547

  20. Processing massive datasets in genomics

    NASA Astrophysics Data System (ADS)

    Artiguenave, F.

    2011-02-01

    Life science research has been profoundly impacted by technological advances allowing faster and cheaper DNA sequencing. Opening a wide range of applications in medicine and biology, the latest generation of sequencing platforms has raised new challenges, in particular in processing, analysing and interpreting massive data. In this talk, the growing role of bioinformatics will be illustrated by providing some figures about genome sequencing and other applications aimed at unravelling biological mechanisms. Methods to gather insights from massive amounts of data will be illustrated by the genome annotation process, by which genes are identified in the genome sequence.

  1. An Analysis of Adenovirus Genomes Using Whole Genome Software Tools

    PubMed Central

    Mahadevan, Padmanabhan

    2016-01-01

    The evolution of sequencing technology has led to an enormous increase in the number of genomes that have been sequenced. This is especially true in the field of virus genomics. In order to extract meaningful biological information from these genomes, whole genome data mining software tools must be utilized. Hundreds of tools have been developed to analyze biological sequence data. However, only some of these tools are user-friendly to biologists. Several of these tools that have been successfully used to analyze adenovirus genomes are described here. These include Artemis, EMBOSS, pDRAW, zPicture, CoreGenes, GeneOrder, and PipMaker. These tools provide functionalities such as visualization, restriction enzyme analysis, alignment, and proteome comparisons that are extremely useful in the bioinformatics analysis of adenovirus genomes. PMID:28293072
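
    As a rough illustration of what restriction enzyme analysis involves, the sketch below scans a sequence for the EcoRI recognition site (GAATTC). It is a toy example on a made-up sequence and is not how any of the tools named above are implemented; the function name is hypothetical.

```python
def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of `site`.

    Overlapping occurrences are reported; matching is case-insensitive.
    """
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


# Made-up sequence containing two EcoRI recognition sites.
positions = find_sites("ttGAATTCaaccGAATTCgg", "GAATTC")
print(positions)  # [2, 12]
```

    Dedicated tools additionally handle both strands, ambiguous bases, cut offsets and circular genomes, none of which this sketch attempts.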

  2. Ketones and lactate increase cancer cell "stemness," driving recurrence, metastasis and poor clinical outcome in breast cancer: achieving personalized medicine via Metabolo-Genomics.

    PubMed

    Martinez-Outschoorn, Ubaldo E; Prisco, Marco; Ertel, Adam; Tsirigos, Aristotelis; Lin, Zhao; Pavlides, Stephanos; Wang, Chengwang; Flomenberg, Neal; Knudsen, Erik S; Howell, Anthony; Pestell, Richard G; Sotgia, Federica; Lisanti, Michael P

    2011-04-15

    Previously, we showed that high-energy metabolites (lactate and ketones) "fuel" tumor growth and experimental metastasis in an in vivo xenograft model, most likely by driving oxidative mitochondrial metabolism in breast cancer cells. To mechanistically understand how these metabolites affect tumor cell behavior, here we used genome-wide transcriptional profiling. Briefly, human breast cancer cells (MCF7) were cultured with lactate or ketones, and then subjected to transcriptional analysis (exon-array). Interestingly, our results show that treatment with these high-energy metabolites increases the transcriptional expression of gene profiles normally associated with "stemness," including genes upregulated in embryonic stem (ES) cells. Similarly, we observe that lactate and ketones promote the growth of bona fide ES cells, providing functional validation. The lactate- and ketone-induced "gene signatures" were able to predict poor clinical outcome (including recurrence and metastasis) in a cohort of human breast cancer patients. Taken together, our results are consistent with the idea that lactate and ketone utilization in cancer cells promotes the "cancer stem cell" phenotype, resulting in significant decreases in patient survival. One possible mechanism by which these high-energy metabolites might induce stemness is by increasing the pool of Acetyl-CoA, leading to increased histone acetylation, and elevated gene expression. Thus, our results mechanistically imply that clinical outcome in breast cancer could simply be determined by epigenetics and energy metabolism, rather than by the accumulation of specific "classical" gene mutations. We also suggest that high-risk cancer patients (identified by the lactate/ketone gene signatures) could be treated with new therapeutics that target oxidative mitochondrial metabolism, such as the anti-oxidant and "mitochondrial poison" metformin. Finally, we propose that this new approach to personalized cancer medicine be termed

  3. Personalized ophthalmology

    PubMed Central

    Porter, LF; Black, GCM

    2014-01-01

    Porter L.F., Black G.C.M. Personalized ophthalmology. Clin Genet 2014: 86: 1–11. © 2014 The Authors. Clinical Genetics published by John Wiley & Sons A/S. Published by John Wiley & Sons Ltd., 2014. Ophthalmology has been an early adopter of personalized medicine. Drawing on genomic advances to improve molecular diagnosis, such as next-generation sequencing, and basic and translational research to develop novel therapies, application of genetic technologies in ophthalmology now heralds development of gene replacement therapies for some inherited monogenic eye diseases. It also promises to alter prediction, diagnosis and management of the complex disease age-related macular degeneration. Personalized ophthalmology is underpinned by an understanding of the molecular basis of eye disease. Two important areas of focus are required for adoption of personalized approaches: disease stratification and individualization. Disease stratification relies on phenotypic and genetic assessment leading to molecular diagnosis; individualization encompasses all aspects of patient management from optimized genetic counseling and conventional therapies to trials of novel DNA-based therapies. This review discusses the clinical implications of these twin strategies. Advantages and implications of genetic testing for patients with inherited eye diseases, choice of molecular diagnostic modality, drivers for adoption of personalized ophthalmology, service planning implications, ethical considerations and future challenges are considered. Indeed, whilst many difficulties remain, personalized ophthalmology truly has the potential to revolutionize the specialty. PMID:24665880

  4. Proteogenomics: Key Driver for Clinical Discovery and Personalized Medicine.

    PubMed

    Barbieri, Ruggero; Guryev, Victor; Brandsma, Corry-Anke; Suits, Frank; Bischoff, Rainer; Horvatovich, Peter

    Proteogenomics is a multi-omics research field that has the aim to efficiently integrate genomics, transcriptomics and proteomics. With this approach it is possible to identify new patient-specific proteoforms that may have implications in disease development, specifically in cancer. Understanding the impact of a large number of mutations detected at the genomics level is needed to assess the effects at the proteome level. Proteogenomics data integration would help in identifying molecular changes that are persistent across multiple molecular layers and enable better interpretation of molecular mechanisms of disease, such as the causal relationship between single nucleotide polymorphisms (SNPs) and the expression of transcripts and translation of proteins compared to mainstream proteomics approaches. Identifying patient-specific protein forms and getting a better picture of molecular mechanisms of disease opens the avenue for precision and personalized medicine. Proteogenomics is, however, a challenging interdisciplinary science that requires the understanding of sample preparation, data acquisition and processing for genomics, transcriptomics and proteomics. This chapter aims to guide the reader through the technology and bioinformatics aspects of these multi-omics approaches, illustrated with proteogenomics applications having clinical or biological relevance.

  5. Bioinformatics for precision medicine in oncology: principles and application to the SHIVA clinical trial

    PubMed Central

    Servant, Nicolas; Roméjon, Julien; Gestraud, Pierre; La Rosa, Philippe; Lucotte, Georges; Lair, Séverine; Bernard, Virginie; Zeitouni, Bruno; Coffin, Fanny; Jules-Clément, Gérôme; Yvon, Florent; Lermine, Alban; Poullet, Patrick; Liva, Stéphane; Pook, Stuart; Popova, Tatiana; Barette, Camille; Prud’homme, François; Dick, Jean-Gabriel; Kamal, Maud; Le Tourneau, Christophe; Barillot, Emmanuel; Hupé, Philippe

    2014-01-01

    Precision medicine (PM) requires the delivery of individually adapted medical care based on the genetic characteristics of each patient and his/her tumor. The last decade witnessed the development of high-throughput technologies such as microarrays and next-generation sequencing which paved the way to PM in the field of oncology. While the cost of these technologies decreases, we are facing an exponential increase in the amount of data produced. Our ability to use this information in daily practice relies strongly on the availability of an efficient bioinformatics system that assists in the translation of knowledge from the bench towards molecular targeting and diagnosis. Clinical trials and routine diagnoses constitute different approaches, both requiring a strong bioinformatics environment capable of (i) warranting the integration and the traceability of data, (ii) ensuring the correct processing and analyses of genomic data, and (iii) applying well-defined and reproducible procedures for workflow management and decision-making. To address these issues, a seamless information system was developed at Institut Curie which facilitates the data integration and tracks in real-time the processing of individual samples. Moreover, computational pipelines were developed to reliably identify genomic alterations and mutations from the molecular profiles of each patient. After a rigorous quality control, a meaningful report is delivered to the clinicians and biologists for the therapeutic decision. The complete bioinformatics environment and the key points of its implementation are presented in the context of the SHIVA clinical trial, a multicentric randomized phase II trial comparing targeted therapy based on tumor molecular profiling versus conventional therapy in patients with refractory cancer. The numerous challenges faced in practice during the setting up and the conduct of this trial are discussed as an illustration of PM application. PMID:24910641

  6. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    PubMed

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  7. Towards allele-level human leucocyte antigens genotyping - assessing two next-generation sequencing platforms: Ion Torrent Personal Genome Machine and Illumina MiSeq.

    PubMed

    Duke, J L; Lind, C; Mackiewicz, K; Ferriola, D; Papazoglou, A; Derbeneva, O; Wallace, D; Monos, D S

    2015-10-01

    Human leucocyte antigens (HLA) typing has been a challenge due to extreme polymorphism of the HLA genes and limitations of the current technologies and protocols used for their characterization. Recently, next-generation sequencing techniques have been shown to be a well-suited technology for the complete characterization of the HLA genes. However, a comprehensive assessment of the different platforms for HLA typing, describing the limitations and advantages of each of them, has not been presented. We have compared the Ion Torrent Personal Genome Machine (PGM) and Illumina MiSeq, currently the two most frequently used platforms for diagnostic applications, for a number of metrics including total output, quality score per position across the reads and error rates after alignment which can all affect the accuracy of HLA genotyping. For this purpose, we have used one homozygous and three heterozygous well-characterized samples, at HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1. The total output of bases produced by the MiSeq was higher, and they have higher quality scores and a lower overall error rate than the PGM. The MiSeq also has a higher fidelity when sequencing through homopolymer regions up to 9 bp in length. The need to set phase between distant polymorphic sites was more readily achieved with MiSeq using paired-end sequencing of fragments that are longer than those obtained with PGM. Additionally, we have assessed the workflows of the different platforms for complexity of sample preparation, sequencer operation and turnaround time. The effects of data quality and quantity can impact the genotyping results; having an adequate amount of good quality data to analyse will be imperative for confident HLA genotyping. The overall turnaround time can be very comparable between the two platforms; however, the complexity of sample preparation is higher with PGM, while the actual sequencing time is longer with MiSeq.
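
    Homopolymer regions, such as the runs of up to 9 bp mentioned above, are stretches of a single repeated base. The following minimal sketch locates such runs in a read; it only illustrates the concept and is not part of the authors' comparison workflow. The function name, the `min_len` threshold and the example read are hypothetical.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for each run of one base >= min_len."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length  # advance to the start of the next run
    return runs


# Made-up read with a 9-bp T homopolymer and a 3-bp G homopolymer.
read = "ACG" + "T" * 9 + "ACGGGA"
runs = homopolymer_runs(read, min_len=3)
print(runs)  # [(3, 'T', 9), (14, 'G', 3)]
```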

  8. Evaluation of the Ion Torrent Personal Genome Machine for Gene-Targeted Studies Using Amplicons of the Nitrogenase Gene nifH

    PubMed Central

    Zhang, Bangzhou; Penton, C. Ryan; Xue, Chao; Wang, Qiong

    2015-01-01

    The sequencing chips and kits of the Ion Torrent Personal Genome Machine (PGM), which employs semiconductor technology to measure pH changes in polymerization events, have recently been upgraded. The quality of PGM sequences has not been reassessed, and results have not been compared in the context of a gene-targeted microbial ecology study. To address this, we compared sequence profiles across available PGM chips and chemistries and with 454 pyrosequencing data by determining error types and rates and diazotrophic community structures. The PGM was then used to assess differences in nifH-harboring bacterial community structure among four corn-based cropping systems. Using our suggested filters from mock community analyses, the overall error rates were 0.62, 0.36, and 0.39% per base for chips 318 and 314 with the 400-bp kit and chip 318 with the Hi-Q chemistry, respectively. Compared with the 400-bp kit, the Hi-Q kit reduced indel rates by 28 to 59% and produced one to seven times more reads acceptable for downstream analyses. The PGM produced higher frameshift rates than pyrosequencing that were corrected by the RDP FrameBot tool. Significant differences among platforms were identified, although the diversity indices and overall site-based conclusions remained similar. For the cropping system analyses, a total of 6,182 unique NifH operational taxonomic units at 5% amino acid dissimilarity were obtained. The current crop type, as well as the crop rotation history, significantly influenced the composition of the soil diazotrophic community detected. PMID:25911484

  9. ADN-Viewer: a 3D approach for bioinformatic analyses of large DNA sequences.

    PubMed

    Hérisson, Joan; Ferey, Nicolas; Gros, Pierre-Emmanuel; Gherbi, Rachid

    2007-01-20

    Most biologists work on textual DNA sequences that are limited to the linear representation of DNA. In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of naked DNA allows biologists to observe and analyze genomes in an interactive way at different levels. We developed a powerful software platform that provides a new point of view for sequence analysis: ADN-Viewer. Nevertheless, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. In order to manage these huge amounts of data in real-time, we designed various scene management algorithms and immersive human-computer interaction for user-friendly data exploration. In addition, one bioinformatics study scenario is proposed.
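
    The storage figure quoted above implies a per-base-pair cost that is easy to work out. The sketch below does the arithmetic, assuming decimal gigabytes (1 Gbyte = 10^9 bytes) and a linear scaling with sequence length; both are assumptions of this example rather than statements from the abstract, and the 250-million-base-pair chromosome is a hypothetical input.

```python
def bytes_per_base_pair(total_bytes, base_pairs):
    """Average number of bytes of 3D data per base pair."""
    return total_bytes / base_pairs


def estimated_bytes(base_pairs, per_bp):
    """Extrapolated 3D data size, assuming linear scaling with length."""
    return base_pairs * per_bp


# Figures from the abstract: 40 million base pairs -> about 6 Gbytes.
per_bp = bytes_per_base_pair(6e9, 40e6)
print(per_bp)  # 150.0 bytes per base pair

# Hypothetical larger chromosome of 250 million base pairs.
print(estimated_bytes(250e6, per_bp) / 1e9)  # 37.5 (Gbytes)
```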

  10. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Given the coming deluge of genome data, the need to store and process large-scale genome data, provide easy access to biomedical analysis tools, and share and retrieve data efficiently presents significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  11. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    PubMed

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Given the coming deluge of genome data, the need to store and process large-scale genome data, provide easy access to biomedical analysis tools, and share and retrieve data efficiently presents significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach.

  12. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    PubMed

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases increasingly offer the possibility to maintain, organize, and distribute DNA sequencing data. Various national and international institutions currently host such databases, which offer researchers website platforms where they can obtain sequencing data and perform different types of analysis on them. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative way to meet the need for larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services.

  13. Grouping and identification of sequence tags (GRIST): bioinformatics tools for the NEIBank database.

    PubMed

    Wistow, Graeme; Bernstein, Steven L; Touchman, Jeffrey W; Bouffard, Gerald; Wyatt, M Keith; Peterson, Katherine; Behal, Amita; Gao, James; Buchoff, Patee; Smith, Don

    2002-06-15

    NEIBank is a project to develop and organize genomics and bioinformatics resources for the eye. As part of this effort, tools have been developed for bioinformatics analysis and web based display of data from expressed sequence tag (EST) analyses. EST sequences are identified and formed into groups or clusters representing related transcripts from the same gene. This is carried out by a rules-based procedure called GRIST (GRouping and Identification of Sequence Tags) that uses sequence match parameters derived from BLAST programs. Linked procedures are used to eliminate non-mRNA contaminants. All data are assembled in a relational database and assembled for display as web pages with annotations and links to other informatics resources. Genome projects generate huge amounts of data that need to be classified and organized to become easily accessible to the research community. GRIST provides a useful tool for assembling and displaying the results of EST analyses. The NEIBank web site contains a growing set of pages cataloging the known transcriptional repertoire of eye tissues, derived from new NEIBank cDNA libraries and from eye-related data deposited in the dbEST section of GenBank.

  14. Survey of MapReduce frame operation in bioinformatics.

    PubMed

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as future work on parallel computing in bioinformatics.
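
    The MapReduce pattern named in this abstract can be illustrated with a minimal, in-process sketch. It is not Hadoop code and is not taken from the surveyed paper; the k-mer counting task, the function names, and the toy reads are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_read(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGT"]  # hypothetical reads, not real data
    print(count_kmers(toy_reads, 3))
```

    In a real cluster the map and reduce functions would run on separate nodes and the shuffle would be handled by the framework; here all three run in one process only to show the data flow.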

  15. Thriving in multidisciplinary research: advice for new bioinformatics students.

    PubMed

    Auerbach, Raymond K

    2012-09-01

    The sciences have seen a large increase in demand for students in bioinformatics and multidisciplinary fields in general. Many new educational programs have been created to satisfy this demand, but navigating these programs requires a non-traditional outlook and emphasizes working in teams of individuals with distinct yet complementary skill sets. Written from the perspective of a current bioinformatics student, this article seeks to offer advice to prospective and current students in bioinformatics regarding what to expect in their educational program, how multidisciplinary fields differ from more traditional paths, and decisions that they will face on the road to becoming successful, productive bioinformaticists.

  16. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    PubMed Central

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five proline (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linking. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand the evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in

  17. Development of Bioinformatic and Experimental Technologies for Identification of Prokaryotic Regulatory Networks

    SciTech Connect

    Lawrence, Charles E; McCue, Lee Ann

    2008-07-31

    The transcription regulatory network is arguably the most important foundation of cellular function, since it exerts the most fundamental control over the abundance of virtually all of a cell’s functional macromolecules. The two major components of a prokaryotic cell’s transcription regulation network are the transcription factors (TFs) and the transcription factor binding sites (TFBS); these components are connected by the binding of TFs to their cognate TFBS under appropriate environmental conditions. Comparative genomics has proven to be a powerful bioinformatics method with which to study transcription regulation on a genome-wide level. We have further extended comparative genomics technologies that we introduced over the last several years. Specifically, we developed and applied statistical approaches to analysis of correlated sequence data (i.e., sequences from closely related species). We also combined these technologies with functional genomic, proteomic and sequence data from multiple species, and developed computational technologies that provide inferences on the regulatory network connections, identifying the cognate transcription factor for predicted regulatory sites. Arguably the most important contribution of this work emerged in the course of the project. Specifically, the development of novel procedures of estimation and prediction in discrete high-D settings has broad implications for biology, genomics and well beyond. We showed that these procedures enjoy advantages over existing technologies in the identification of TFBS. These efforts are aimed toward identifying a cell’s complete transcription regulatory network and underlying molecular mechanisms.

  18. Bioinformatic analysis of phage AB3, a phiKMV-like virus infecting Acinetobacter baumannii.

    PubMed

    Zhang, J; Liu, X; Li, X-J

    2015-01-16

    The phages of Acinetobacter baumannii have drawn increasing attention because of the multi-drug resistance of A. baumannii. The aim of this study was to sequence Acinetobacter baumannii phage AB3 and conduct bioinformatic analysis to lay a foundation for genome remodeling and phage therapy. We isolated and sequenced A. baumannii phage AB3 and attempted to annotate and analyze its genome. The results showed that the genome is a double-stranded DNA with a total length of 31,185 base pairs (bp) and 97 open reading frames greater than 100 bp. The genome includes 28 predicted genes, of which 24 are homologous to phage AB1. The entire coding sequence is located on the negative strand, representing 90.8% of the total length. The G+C mol% was 39.18%, without areas of high G+C content over 200 bp in length. No GC island, tRNA gene, or repeated sequence was identified. Gene lengths were 120-3099 bp, with an average of 1011 bp. Six genes were found to be greater than 2000 bp in length. Genomic alignment and phylogenetic analysis of the RNA polymerase gene showed that, similar to phage AB1, phage AB3 is a phiKMV-like virus in the T7 phage family.
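
    Two of the quantities reported in this abstract, G+C content and the count of open reading frames above a length threshold on either strand, can be computed along the following lines. This is a minimal sketch under stated assumptions: the abstract does not say which tools or ORF definition the authors used, so the ATG-to-stop definition, the function names, and the toy sequence are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(seq):
    """G+C content as a percentage of all bases in seq."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement, giving the negative strand read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_len=100):
    """Return (strand, start, end) for ATG...stop ORFs longer than min_len bp.

    Assumed ORF definition: first ATG in a frame up to the next in-frame
    stop codon. Coordinates are 0-based, end-exclusive, and relative to
    the strand that was scanned.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start > min_len:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


if __name__ == "__main__":
    toy = "ATGAAATAA"  # hypothetical 9 bp sequence, not phage AB3
    print(gc_percent(toy))
    print(find_orfs(toy, min_len=0))
```

    Scanning the reverse complement as well as the forward sequence is what allows a statement such as "the entire coding sequence is located on the negative strand" to be checked.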

  19. Bioinformatics and molecular biology for the quantification of closely related bacteria.

    PubMed

    Nagarajan, Karthiga; Loh, Kai-Chee; Swarup, Sanjay

    2013-07-01

    Molecular biological methods for mixed culture analysis outshine conventional culture-based techniques in terms of better sensitivity and reliability. The majority of these methods exploit the 16S rRNA sequences of the community DNA, which often fall short for the analysis of closely related microorganisms. This research details the development and validation of a comprehensive methodology to differentiate and quantitatively characterize two Pseudomonas species in a mixed culture. A bioinformatics tool based on whole-genome polymorphism comparison was used to identify marker sequences to differentiate the two bacteria using quantitative real-time PCR. The quantification of the two species was achieved through a correlation of the genomic DNA versus cell number (genomic DNA purification) and threshold cycle number versus genomic DNA (real-time PCR). Several factors including the limitation of genomic DNA purification, effects of substrate concentrations and growth phase on cellular DNA, and choice of simplex or duplex reaction for real-time PCR were considered and evaluated. The developed method was experimentally validated against synthetically constructed consortia.
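
    The two-step quantification described in this abstract (threshold cycle versus genomic DNA, then genomic DNA versus cell number) can be sketched as two chained calibration curves. The log-linear form of the Ct curve is a common qPCR convention assumed here, not something the abstract states, and every number and name below is hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cells_from_ct(ct, ct_curve, dna_curve):
    """Chain the two calibrations: Ct -> genomic DNA -> cell number."""
    ct_slope, ct_intercept = ct_curve     # Ct  = slope * log10(DNA) + intercept
    dna_slope, dna_intercept = dna_curve  # DNA = slope * cells + intercept
    dna = 10 ** ((ct - ct_intercept) / ct_slope)
    return (dna - dna_intercept) / dna_slope


if __name__ == "__main__":
    # Hypothetical calibration data for one species (not from the paper).
    dna_ng = [1.0, 10.0, 100.0, 1000.0]
    ct_values = [30.0, 26.7, 23.4, 20.1]
    cell_counts = [1e6, 2e6, 4e6]
    dna_yield_ng = [5.0, 10.0, 20.0]

    ct_curve = fit_line([math.log10(d) for d in dna_ng], ct_values)
    dna_curve = fit_line(cell_counts, dna_yield_ng)
    print(cells_from_ct(26.7, ct_curve, dna_curve))
```

    Each species in the mixed culture would need its own pair of curves, built with its own marker sequence.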

  20. Bioinformatic Approaches to Metabolic Pathways Analysis

    PubMed Central

    Maudsley, Stuart; Chadwick, Wayne; Wang, Liyun; Zhou, Yu; Martin, Bronwen; Park, Sung-Soo

    2015-01-01

    The growth and development in the last decade of accurate and reliable mass data collection techniques has greatly enhanced our comprehension of cell signaling networks and pathways. At the same time however, these technological advances have also increased the difficulty of satisfactorily analyzing and interpreting these ever-expanding datasets. At the present time, multiple diverse scientific communities including molecular biological, genetic, proteomic, bioinformatic, and cell biological, are converging upon a common endpoint, that is, the measurement, interpretation, and potential prediction of signal transduction cascade activity from mass datasets. Our ever increasing appreciation of the complexity of cellular or receptor signaling output and the structural coordination of intracellular signaling cascades has to some extent necessitated the generation of a new branch of informatics that more closely associates functional signaling effects to biological actions and even whole-animal phenotypes. The ability to untangle and hopefully generate theoretical models of signal transduction information flow from transmembrane receptor systems to physiological and pharmacological actions may be one of the greatest advances in cell signaling science. In this overview, we shall attempt to assist the navigation into this new field of cell signaling and highlight several methodologies and technologies to appreciate this exciting new age of signal transduction. PMID:21870222

  1. Bioinformatic tools for microRNA dissection

    PubMed Central

    Akhtar, Most Mauluda; Micolucci, Luigina; Islam, Md Soriful; Olivieri, Fabiola; Procopio, Antonio Domenico

    2016-01-01

    Recently, microRNAs (miRNAs) have emerged as important elements of gene regulatory networks. MiRNAs are endogenous single-stranded non-coding RNAs (∼22-nt long) that regulate gene expression at the post-transcriptional level. Through pairing with mRNA, miRNAs can down-regulate gene expression by inhibiting translation or stimulating mRNA degradation. In some cases they can also up-regulate the expression of a target gene. MiRNAs influence a variety of cellular pathways that range from development to carcinogenesis. The involvement of miRNAs in several human diseases, particularly cancer, makes them potential diagnostic and prognostic biomarkers. Recent technological advances, especially high-throughput sequencing, have led to an exponential growth in the generation of miRNA-related data. A number of bioinformatic tools and databases have been devised to manage this growing body of data. We analyze 129 miRNA tools that are being used in diverse areas of miRNA research, to assist investigators in choosing the most appropriate tools for their needs. PMID:26578605

  2. Bioinformatics study of the mangrove actin genes

    NASA Astrophysics Data System (ADS)

    Basyuni, M.; Wasilah, M.; Sumardi

    2017-01-01

    This study describes bioinformatics methods used to analyze eight actin genes from mangrove plants deposited in DDBJ/EMBL/GenBank and to predict their structure, composition, subcellular localization, similarity, and phylogeny. The physical and chemical properties of the eight mangrove actin genes varied among the genes. The proportions of secondary structure followed the order α-helix > random coil > extended chain structure for BgActl, KcActl, RsActl, and A. corniculatum Act. In contrast, the remaining actin genes followed the order random coil > extended chain structure > α-helix. The secondary structure prediction was therefore performed to obtain the necessary structural information. The values for chloroplast, signal peptide, or mitochondrial targeting were too small, indicating that mangrove actin genes carry no chloroplast or mitochondrial transit peptide and no secretory-pathway signal peptide. These results suggest the importance of understanding the diversity and functional properties of the different amino acids in mangrove actin genes. To clarify the relationships among the mangrove actin genes, a phylogenetic tree was constructed. Three groups of mangrove actin genes were formed: the first group contains B. gymnorrhiza BgAct and R. stylosa RsActl; the second cluster, the largest, consists of five actin genes; and the last branch consists of one gene, B. sexangula Act. The present study therefore supports previous results showing that plant actin genes form distinct clusters in the tree.

  3. Bioinformatic approaches to metabolic pathways analysis.

    PubMed

    Maudsley, Stuart; Chadwick, Wayne; Wang, Liyun; Zhou, Yu; Martin, Bronwen; Park, Sung-Soo

    2011-01-01

    The growth and development in the last decade of accurate and reliable mass data collection techniques has greatly enhanced our comprehension of cell signaling networks and pathways. At the same time however, these technological advances have also increased the difficulty of satisfactorily analyzing and interpreting these ever-expanding datasets. At the present time, multiple diverse scientific communities including molecular biological, genetic, proteomic, bioinformatic, and cell biological, are converging upon a common endpoint, that is, the measurement, interpretation, and potential prediction of signal transduction cascade activity from mass datasets. Our ever increasing appreciation of the complexity of cellular or receptor signaling output and the structural coordination of intracellular signaling cascades has to some extent necessitated the generation of a new branch of informatics that more closely associates functional signaling effects to biological actions and even whole-animal phenotypes. The ability to untangle and hopefully generate theoretical models of signal transduction information flow from transmembrane receptor systems to physiological and pharmacological actions may be one of the greatest advances in cell signaling science. In this overview, we shall attempt to assist the navigation into this new field of cell signaling and highlight several methodologies and technologies to appreciate this exciting new age of signal transduction.

  4. Legal issues for chem-bioinformatics models.

    PubMed

    Duardo-Sanchez, Aliuska; Gonzalez-Diaz, Humberto

    2013-01-01

    Chem-bioinformatic models connect the chemical structure of drugs and/or targets (protein, gene, RNA, microorganism, tissue, disease...) with the biological activity of the drug against the target. At the same time, a systematic judicial framework is needed to provide appropriate and relevant guidance for addressing various computing techniques as applied to scientific research at the frontiers of the biosciences. This article reviews both the use of predictions made with models for regulatory purposes and how to protect (in legal terms) the models of molecular systems per se and the software used to seek them. First we review: i) models as a tool for regulatory purposes, ii) organizations involved with validation of models, iii) regulatory guidelines and documents for models, iv) models for human health and environmental endpoints, and v) difficulties in validating models, and other issues. Next, we focus on the legal protection of models and software, including a short summary of topics and methods for legal protection of computer software. We close the review with a section on taxes on software use.

  5. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care.

    PubMed

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K; Chung, Yeun-Jun

    2013-12-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population.

  6. Perspectives on Clinical Informatics: Integrating Large-Scale Clinical, Genomic, and Health Information for Clinical Care

    PubMed Central

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K.

    2013-01-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population. PMID:24465229

  7. Teaching Structural Bioinformatics at the Undergraduate Level

    ERIC Educational Resources Information Center

    Centeno, Nuria B.; Villa-Freixa, Jordi; Oliva, Baldomero

    2003-01-01

    Understanding the basic principles of structural biology is becoming a major subject of study in most undergraduate level programs in biology. In the genomic and proteomic age, it is becoming indispensable for biology students to master concepts related to the sequence and structure of proteins in order to develop skills that may be useful in a…

  8. Personalized ophthalmology.

    PubMed

    Porter, L F; Black, G C M

    2014-07-01

    Ophthalmology has been an early adopter of personalized medicine. Drawing on genomic advances to improve molecular diagnosis, such as next-generation sequencing, and basic and translational research to develop novel therapies, application of genetic technologies in ophthalmology now heralds development of gene replacement therapies for some inherited monogenic eye diseases. It also promises to alter prediction, diagnosis and management of the complex disease age-related macular degeneration. Personalized ophthalmology is underpinned by an understanding of the molecular basis of eye disease. Two important areas of focus are required for adoption of personalized approaches: disease stratification and individualization. Disease stratification relies on phenotypic and genetic assessment leading to molecular diagnosis; individualization encompasses all aspects of patient management from optimized genetic counseling and conventional therapies to trials of novel DNA-based therapies. This review discusses the clinical implications of these twin strategies. Advantages and implications of genetic testing for patients with inherited eye diseases, choice of molecular diagnostic modality, drivers for adoption of personalized ophthalmology, service planning implications, ethical considerations and future challenges are considered. Indeed, whilst many difficulties remain, personalized ophthalmology truly has the potential to revolutionize the specialty.

  9. Bridging the Gap from Bench to Bedside--An Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED).

    PubMed

    2015-01-01

    The abundance of heterogeneous biomedical data from a variety of sources demands the development of strategies to address data integration and management issues, so that the data can be used effectively in clinical practices and biomedical research. This research presents an Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED) and provides a roadmap that envisions utilizing the clinical and biomedical resources in our case study. This work describes a data integration approach, proposed by ICGED, with a two-fold purpose: personalized medicine and biomedical data storage and sharing platform. It describes our experiences integrating disease specific clinical and genomics datasets with Data Integration and Analysis Tools (DIAT)--using Informatics for Integrating Biology and the Bedside, and discusses work in progress and future work for extending DIAT, and the development of Risk Assessment and Prediction Tools, Clinical Decision Support Systems and a Bioinformatics Data Warehouse.

  10. Partnering for functional genomics research conference: Abstracts of poster presentations

    SciTech Connect

    1998-06-01

    This report contains abstracts of poster presentations from the Functional Genomics Research Conference held April 16--17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.

  11. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  12. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  13. Creating Bioinformatic Workflows within the BioExtract Server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows generally require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community...

  14. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    PubMed Central

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  15. Bioinformatics opportunities for identification and study of medicinal plants

    PubMed Central

    Sharma, Vivekanand

    2013-01-01

    Plants have been used as a source of medicine since historic times and several commercially important drugs are of plant-based origin. The traditional approach towards discovery of plant-based drugs often involves a significant amount of time and expenditure. These labor-intensive approaches have struggled to keep pace with the rapid development of high-throughput technologies. In the era of high volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role. This has generally been the case in the context of drug design and discovery. However, there has been limited attention to date to the potential application of bioinformatics approaches that can leverage plant-based knowledge. Here, we review bioinformatics studies that have contributed to medicinal plants research. In particular, we highlight areas in medicinal plant research where the application of bioinformatics methodologies may result in quicker and potentially cost-effective leads toward finding plant-based remedies. PMID:22589384

  16. Unraveling genomic variation from next generation sequencing data

    PubMed Central

    2013-01-01

    Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas shaped by these technological advances are evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to ease knowledge extraction are essential. In this article, we briefly describe the sequencing methodologies and the equipment available for these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. Finally, we comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field. PMID:23885890

  17. Bioconductor: open software development for computational biology and bioinformatics

    PubMed Central

    Gentleman, Robert C; Carey, Vincent J; Bates, Douglas M; Bolstad, Ben; Dettling, Marcel; Dudoit, Sandrine; Ellis, Byron; Gautier, Laurent; Ge, Yongchao; Gentry, Jeff; Hornik, Kurt; Hothorn, Torsten; Huber, Wolfgang; Iacus, Stefano; Irizarry, Rafael; Leisch, Friedrich; Li, Cheng; Maechler, Martin; Rossini, Anthony J; Sawitzki, Gunther; Smith, Colin; Smyth, Gordon; Tierney, Luke; Yang, Jean YH; Zhang, Jianhua

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples. PMID:15461798

  18. Proceedings: the Applications of Bioinformatics in Cancer Detection Workshop.

    PubMed

    Kapetanovic, Izet M; Umar, Asad; Khan, Javed

    2004-05-01

    The Division of Cancer Prevention of the National Cancer Institute sponsored and organized the Applications of Bioinformatics in Cancer Detection Workshop on August 6-7, 2002. The goal of the workshop was to evaluate the state of the science of bioinformatics and determine how it may be used to assist early cancer detection, risk identification, risk assessment, and risk reduction. This paper summarizes the proceedings of this conference and points out future directions for research.

  19. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Simple sequence repeats (SSRs), or microsatellite markers, are among the most informative and versatile DNA-based markers. The use of next-generation sequencing technologies allows whole genome sequencing and makes it possible to develop large numbers of SSRs through bioinformatic analysis of genome da...

  20. Biopipe: a flexible framework for protocol-based bioinformatics analysis.

    PubMed

    Hoon, Shawn; Ratnapu, Kiran Kumar; Chia, Jer-Ming; Kumarasamy, Balamurugan; Juguang, Xiao; Clamp, Michele; Stabenau, Arne; Potter, Simon; Clarke, Laura; Stupka, Elia

    2003-08-01

    We identify several challenges facing bioinformatics analysis today. Firstly, to fulfill the promise of comparative studies, bioinformatics analysis will need to accommodate different sources of data residing in a federation of databases that, in turn, come in different formats and modes of accessibility. Secondly, the tsunami of data to be handled will require robust systems that enable bioinformatics analysis to be carried out in a parallel fashion. Thirdly, the ever-evolving state of bioinformatics presents new algorithms and paradigms for conducting analysis. This means that any bioinformatics framework must be flexible and generic enough to accommodate such changes. In addition, we identify the need for introducing an explicit protocol-based approach to bioinformatics analysis that will lend rigor to the analysis. This makes experimentation and replication of results by external parties easier. Biopipe is designed in an effort to meet these goals. It aims to allow researchers to focus on protocol design. At the same time, it is designed to work over a compute farm and thus provides high-throughput performance. A common exchange format that encapsulates the entire protocol in terms of the analysis modules, parameters, and data versions has been developed to provide a powerful way in which to distribute and reproduce results. This will enable researchers to discuss and interpret the data better, as the once-implicit assumptions are now explicitly defined within the Biopipe framework.
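
    The idea of an exchange format that records analysis modules, parameters, and data versions can be illustrated with a minimal Python sketch. This is not Biopipe's actual format or API: the class names, the JSON layout, the module names, and the fingerprint helper are all hypothetical, chosen only to show how making a protocol explicit lets two parties check that they ran the same analysis.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AnalysisStep:
    """One analysis module and the parameters it is run with (hypothetical)."""
    module: str
    parameters: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Protocol:
    """A whole protocol: ordered steps plus the data versions they read."""
    name: str
    data_versions: dict   # data source name -> version label
    steps: tuple          # ordered AnalysisStep objects

    def to_json(self) -> str:
        # sort_keys yields one canonical text form for a given protocol
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    def fingerprint(self) -> str:
        # Hash of the canonical form; equal hashes mean equal protocols
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


def protocol_from_json(text: str) -> Protocol:
    """Rebuild a Protocol from the text produced by Protocol.to_json."""
    raw = json.loads(text)
    steps = tuple(AnalysisStep(**step) for step in raw["steps"])
    return Protocol(raw["name"], raw["data_versions"], steps)


# Illustrative protocol; module names and versions are made up.
example = Protocol(
    name="repeat-masking-then-alignment",
    data_versions={"genome_db": "release-1"},
    steps=(
        AnalysisStep("mask_repeats", {"sensitivity": "default"}),
        AnalysisStep("align", {"max_mismatches": 2}),
    ),
)

if __name__ == "__main__":
    shared = example.to_json()              # text a collaborator would receive
    received = protocol_from_json(shared)
    print(received.fingerprint() == example.fingerprint())
```

    Because the parameters and data versions are part of the serialized protocol, changing any of them changes the fingerprint, which is one simple way to surface assumptions that would otherwise remain implicit.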